Re: commit: abi: UTF8String class

From: Martin Sevior (msevior@mccubbin.ph.unimelb.edu.au)
Date: Sun Apr 21 2002 - 10:51:54 EDT

Next message: Martin Sevior: "Next Generation Containers."

Previous message: Rui Miguel Silva Seabra: "Re: Ready for the Big Time!"
In reply to: Andrew Dunbar: "Re: commit: abi: UTF8String class"
Next in thread: Andrew Dunbar: "Re: commit: abi: UTF8String class"
Next in thread: Joaquin Cuenca Abela: "Re: commit: abi: UTF8String class"
Reply: Andrew Dunbar: "Re: commit: abi: UTF8String class"
Reply: Joaquin Cuenca Abela: "Re: commit: abi: UTF8String class"

> >
> > UTF-8 is great for communicating between the
> > piecetable and the widgets. I
> > think we should definitely do this. What I don't
> > want is for us to store
> > our text as UTF-8 in the piecetable. We have a *LOT*
> > of code that expects
> > that every position in the piecetable corresponds to
> > an extra letter of text.
>
> How is this going to work for languages that need
> combining characters? Isn't it going to need to be
> changed anyway? Isn't now the time to do this
> re-design?

I don't understand this. Doesn't every glyph have a unique Unicode code
point? If so, we still have a one-to-one mapping of glyph to text location.

>
> > What I think we should do is store our Unicode as
> > UT_uint32 in the
> > piecetable which can then be randomly accessed the
> > same way we do things now.
>
> To randomly access what the user sees as a character
> or to randomly access what is internally one codepoint?

OK, I don't understand. Are you saying that two code points in a row map to
a different glyph? If so, why not just insert the code point for this glyph?
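
A minimal sketch of the case being asked about, assuming Python's standard
unicodedata module as a stand-in (it is not AbiWord code). It composes two
base-plus-combining-mark sequences with NFC normalization: 'e' followed by
U+0301 has a single precomposed code point (U+00E9), while 'q' followed by
U+0303 has none, so it stays as two code points even though it displays as
one character. The helper name nfc_length is hypothetical.

```python
import unicodedata


def nfc_length(text: str) -> int:
    """Return the number of code points left after NFC composition."""
    return len(unicodedata.normalize("NFC", text))


# 'e' + COMBINING ACUTE ACCENT: a precomposed form (U+00E9) exists.
e_acute = "e\u0301"

# 'q' + COMBINING TILDE: Unicode defines no precomposed code point for this.
q_tilde = "q\u0303"

for label, seq in (("e + U+0301", e_acute), ("q + U+0303", q_tilde)):
    print(label, "raw:", len(seq), "after NFC:", nfc_length(seq))
```

So substituting a single code point works only when a precomposed form
happens to exist; in the general case a displayed character can span more
than one code point.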

> These are not the same. But I don't know the
> piecetable either so maybe it is the right thing to
> do.
> As long as we are thinking about it.

Certainly the structure of the code makes lots of assumptions of one
PT_DocPosition, one glyph. If Unicode were at all sane, this should not be a
problem. Are you telling me that Unicode is not sane and that certain
glyphs can only be generated if two 32-bit numbers are presented
consecutively?
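
The storage trade-off discussed above can be sketched as follows. This is
an illustration under assumptions, not the piecetable implementation:
Python's "utf-32-le" encoding stands in for an array of UT_uint32, and the
helpers code_point_at and utf8_offsets are hypothetical names. With 32-bit
units, position i is always at byte offset 4*i, whereas in UTF-8 the byte
offset of a position depends on everything before it.

```python
def code_point_at(buf: bytes, index: int) -> int:
    """Read the index-th code point from a UTF-32-LE buffer in constant time."""
    start = 4 * index
    return int.from_bytes(buf[start:start + 4], "little")


def utf8_offsets(text: str) -> list[int]:
    """Byte offset at which each code point starts in the UTF-8 encoding."""
    offsets = []
    pos = 0
    for ch in text:
        offsets.append(pos)
        pos += len(ch.encode("utf-8"))
    return offsets


text = "a\u00e9\u20ac"            # 'a', e-acute, euro sign
utf8 = text.encode("utf-8")       # variable width: 1 + 2 + 3 bytes
utf32 = text.encode("utf-32-le")  # fixed width: 4 bytes each, no BOM

print(len(utf8), len(utf32))
print(utf8_offsets(text))
print(hex(code_point_at(utf32, 2)))
```

Fixed-width storage gives random access to code points, but a code point
index is still not the same as a user-visible character when combining
marks are present.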

Cheers

Martin


This archive was generated by hypermail 2.1.4 : Sun Apr 21 2002 - 10:53:02 EDT