From: F J Franklin (F.J.Franklin@sheffield.ac.uk)
Date: Sat Apr 20 2002 - 06:20:06 EDT
> wrote:
> > o new UTF8String class (untested)
>
> If this is part of the new unicodization to support
> full-unicode, there's some stuff we need to discuss.
Wasn't intended as such. phearbear says QNX wants to use UTF-8, whereas
Abi uses UCS-2, so I decided to write the UTF8String class to facilitate
the conversion. Strings are stored internally as UTF-8 byte sequences,
and there is a home-made iterator for stepping through the string one
sequence at a time, plus a function for converting the current sequence
to UCS-4.
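
To make the "sequence by sequence" idea concrete, here is a rough sketch
of what stepping through a UTF-8 buffer and decoding the current sequence
to UCS-4 could look like. This is not the UTF8String code itself: the
names utf8_sequence_length and utf8_next are hypothetical, the buffer is
a plain std::string, and sequences are assumed to be at most 4 bytes
long. It rejects truncated input and bad continuation bytes but does not
check for overlong encodings.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: length in bytes of the UTF-8 sequence that starts
// with the given lead byte, or 0 if the byte cannot start a sequence.
static std::size_t utf8_sequence_length(unsigned char lead)
{
    if (lead < 0x80)           return 1; // 0xxxxxxx
    if ((lead & 0xE0) == 0xC0) return 2; // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3; // 1110xxxx
    if ((lead & 0xF8) == 0xF0) return 4; // 11110xxx
    return 0;                            // continuation or invalid byte
}

// Hypothetical iterator step: decode the sequence starting at s[pos] into
// a UCS-4 value and advance pos past it. Returns false, leaving pos
// unchanged, at end of input or on a truncated/malformed sequence.
static bool utf8_next(const std::string& s, std::size_t& pos, std::uint32_t& ucs4)
{
    if (pos >= s.size()) return false;

    const unsigned char lead = static_cast<unsigned char>(s[pos]);
    const std::size_t len = utf8_sequence_length(lead);
    if (len == 0 || pos + len > s.size()) return false;

    // Payload bits carried by the lead byte, indexed by sequence length.
    static const unsigned char lead_mask[5] = { 0x00, 0x7F, 0x1F, 0x0F, 0x07 };
    std::uint32_t code = lead & lead_mask[len];

    for (std::size_t i = 1; i < len; ++i) {
        const unsigned char c = static_cast<unsigned char>(s[pos + i]);
        if ((c & 0xC0) != 0x80) return false;  // not a 10xxxxxx byte
        code = (code << 6) | (c & 0x3F);       // append 6 payload bits
    }

    pos += len;
    ucs4 = code;
    return true;
}
```

A caller would start with pos = 0 and call utf8_next in a loop until it
returns false, getting one UCS-4 value per sequence.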
Currently conversion to UTF-8 is only from UCS-2, but conversion from
UCS-4 would be a trivial change. (I'm assuming that UCS-2 is the first
65536 codes of UCS-4 - is this correct?)
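
As an illustration of why the UCS-4 case is a small change, the sketch
below encodes a UCS-4 value to UTF-8 and treats the UCS-2 case as a thin
wrapper that widens the 16-bit value unchanged (the assumption stated in
the paragraph above). The function names append_utf8 and
append_utf8_from_ucs2 are hypothetical, not part of UTF8String. Two
further assumptions: surrogate values D800-DFFF are rejected, and codes
above 0x10FFFF are rejected so that output never exceeds 4 bytes.

```cpp
#include <cstdint>
#include <string>

// Hypothetical encoder: append the UTF-8 encoding of one UCS-4 value to
// out. Returns false (out unchanged) for surrogates and codes > 0x10FFFF.
static bool append_utf8(std::string& out, std::uint32_t ucs4)
{
    if (ucs4 < 0x80) {
        out += static_cast<char>(ucs4);                          // 0xxxxxxx
    } else if (ucs4 < 0x800) {
        out += static_cast<char>(0xC0 | (ucs4 >> 6));            // 110xxxxx
        out += static_cast<char>(0x80 | (ucs4 & 0x3F));          // 10xxxxxx
    } else if (ucs4 < 0x10000) {
        if (ucs4 >= 0xD800 && ucs4 <= 0xDFFF) return false;      // surrogate
        out += static_cast<char>(0xE0 | (ucs4 >> 12));           // 1110xxxx
        out += static_cast<char>(0x80 | ((ucs4 >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (ucs4 & 0x3F));
    } else if (ucs4 <= 0x10FFFF) {
        out += static_cast<char>(0xF0 | (ucs4 >> 18));           // 11110xxx
        out += static_cast<char>(0x80 | ((ucs4 >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((ucs4 >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (ucs4 & 0x3F));
    } else {
        return false;                                            // out of range
    }
    return true;
}

// UCS-2 input: widen the 16-bit value to 32 bits and reuse the encoder.
static bool append_utf8_from_ucs2(std::string& out, std::uint16_t ucs2)
{
    return append_utf8(out, static_cast<std::uint32_t>(ucs2));
}
```

With this layout the UCS-2 path never reaches the 4-byte branch, so
adding UCS-4 input only means exposing append_utf8 directly.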
As a string class it's not nearly as functional as the others, but it's
not really intended as a replacement.
> We need to design the system so that a string is not
> built from a series of UTF-8 (or UTF-32) characters
> directly, but a series of "composed character" which
> in turn are a series of UTF-8 characters, the first
> being the main character, the remainder being zero-
> width modifiers. We need this to support proper
> internationalization. We probably need much
> discussion first actually.
Not sure I understand this. Can you explain how to use zero-width
modifiers?
Frank
Francis James Franklin
F.J.Franklin@shef.ac.uk
"No, she really likes me. She told me I look like Britney Spears, and why
would you say that to somebody you don't like?"
--- Elle Woods