Re: A quick clarification please.

From: Andrew Dunbar (hippietrail@yahoo.com)
Date: Mon Apr 22 2002 - 11:00:27 EDT


    --- Karl Ove Hufthammer <huftis@bigfoot.com> wrote:
    > Martin Sevior <msevior@mccubbin.ph.unimelb.edu.au> wrote in
    > news:Pine.OSF.4.21.0204222354500.21132-100000@mccubbin.ph.unimelb.edu.au:
    >
    > > If we use UTF-32 as our internal format in the
    > > piecetable will we still need two or more 32-bit
    > > numbers to represent combining characters?
    >
    > Yes.
    >
    > > I really don't like the idea of variable length
    > > strings per glyph
    >
    > Also note that a string can contain more, less *or*
    > the same number of glyphs as the number of characters.

    This is why we need to represent the text with something
    more like a linked list of objects, where the top-level
    object represents an "on-screen character" made up of one
    or more "codepoints", each of which is in turn made up of
    one or more bytes. It may actually be more complicated
    than this; I'm not sure at this point. We might want to
    look at how IBM's ICU (http://oss.software.ibm.com/icu/)
    represents "strings", and even at how Pango represents its
    strings internally, since it must already handle these
    kinds of problems.
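
    To make the layering concrete, here is a rough sketch of the
    "on-screen character -> codepoints -> bytes" idea. It is only
    an illustration under stated assumptions: the names Cluster
    and split_clusters are made up for this example, a plain list
    stands in for the linked list, and the grouping rule (attach
    any codepoint with a non-zero canonical combining class to
    the previous cluster) is much cruder than what a real text
    engine would do. It says nothing about how ICU or Pango
    actually store their strings.

```python
import unicodedata
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """One "on-screen character": a base codepoint plus any combining marks."""
    codepoints: list = field(default_factory=list)

    def text(self):
        return "".join(self.codepoints)

    def utf32_units(self):
        # UTF-32 uses exactly one 32-bit unit per codepoint,
        # so a cluster with combining marks still needs several units.
        return len(self.text().encode("utf-32-le")) // 4

    def utf8_bytes(self):
        # The same cluster measured at the byte level in UTF-8.
        return len(self.text().encode("utf-8"))


def split_clusters(s):
    """Group codepoints so each combining mark joins the preceding cluster.

    Assumption: only the canonical combining class is checked; this is a
    simplification, not full grapheme-cluster segmentation.
    """
    clusters = []
    for ch in s:
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1].codepoints.append(ch)
        else:
            clusters.append(Cluster([ch]))
    return clusters


# 'e' followed by COMBINING ACUTE ACCENT (U+0301), then a plain 'a'.
sample = "e\u0301a"
clusters = split_clusters(sample)
for c in clusters:
    print(len(c.codepoints), c.utf32_units(), c.utf8_bytes())
```

    The first cluster here holds two codepoints, so it takes two
    32-bit units even in UTF-32, which is the point Karl Ove made
    above: a fixed-width encoding fixes the size of a codepoint,
    not the size of what the user sees as one character.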

    Andrew Dunbar.

    > --
    > Karl Ove Hufthammer

    =====
    http://linguaphile.sourceforge.net http://www.abisource.com



