From: Andrew Dunbar (hippietrail@yahoo.com)
Date: Mon Apr 22 2002 - 11:00:27 EDT
--- Karl Ove Hufthammer <huftis@bigfoot.com> wrote:
> Martin Sevior <msevior@mccubbin.ph.unimelb.edu.au> wrote in
> news:Pine.OSF.4.21.0204222354500.21132-100000@mccubbin.ph.unimelb.edu.au:
>
> > If we use UTF-32 as our internal format in the
> > piecetable will we still need two or more 32-bit
> > numbers to represent combining characters?
>
> Yes.
>
> > I really don't like the idea of variable length
> > strings per glyph
>
> Also note that a string can contain more, less *or*
> the same number of glyphs as the number of characters.

This is why we need to represent the text with something
more like a linked list of objects, where the top-level
object represents an "on-screen character" made up of one
or more "codepoints", each of which is in turn made up of
one or more bytes. It may actually be more complicated
than this; I'm not sure at this point. We might want to
look at how IBM's ICU (http://oss.software.ibm.com/icu/)
represents "strings", and even at how Pango represents
its strings internally, since it must already handle
these kinds of problems.
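
As a rough illustration of the layering described above (on-screen
character -> codepoints -> bytes), here is a minimal Python sketch.
The names OnScreenChar and split_on_screen_chars are hypothetical,
invented for this example; they are not AbiWord, ICU or Pango APIs.
It uses a plain list rather than a linked list, and it assumes a
deliberately simplified rule (a base codepoint followed by any
combining marks), which is not the full Unicode text segmentation
algorithm.

```python
import unicodedata
from dataclasses import dataclass
from typing import List


@dataclass
class OnScreenChar:
    """Hypothetical top-level object: one user-perceived character."""
    codepoints: List[int]

    def text(self) -> str:
        return "".join(chr(cp) for cp in self.codepoints)

    def utf8(self) -> bytes:
        # Lowest layer: each codepoint becomes one or more bytes.
        return self.text().encode("utf-8")

    def utf32_units(self) -> int:
        # Number of 32-bit values needed, even in a fixed-width encoding.
        return len(self.text().encode("utf-32-le")) // 4


def split_on_screen_chars(s: str) -> List[OnScreenChar]:
    """Group codepoints: a base followed by its combining marks.

    Simplifying assumption: a codepoint with a non-zero canonical
    combining class attaches to the preceding cluster.
    """
    clusters: List[OnScreenChar] = []
    for ch in s:
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1].codepoints.append(ord(ch))
        else:
            clusters.append(OnScreenChar([ord(ch)]))
    return clusters


# 'e' followed by COMBINING ACUTE ACCENT (U+0301), then 'a'
sample = "e\u0301a"
chars = split_on_screen_chars(sample)
for c in chars:
    print([hex(cp) for cp in c.codepoints], c.utf32_units(), c.utf8())
```

The first cluster still needs two 32-bit values in UTF-32, which is
the point of the "Yes." above: a fixed-width encoding fixes the size
of a codepoint, not the size of an on-screen character.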
Andrew Dunbar.
> --
> Karl Ove Hufthammer
=====
http://linguaphile.sourceforge.net http://www.abisource.com
This archive was generated by hypermail 2.1.4 : Mon Apr 22 2002 - 11:01:41 EDT