Re: commit: abi: UTF8String class

From: Joaquin Cuenca Abela (cuenca@pacaterie.u-psud.fr)
Date: Sun Apr 21 2002 - 09:10:57 EDT

    Tomas wrote:
    > UTF-8 processing is cumbersome, and as such it is a completely
    > unsuitable format to use for the piecetable. We need a fixed-width
    > encoding for that, such as the current UCS-2, i.e., UTF-32.

    Tomas, can you also clarify this point?
    I don't see why UTF-8 is unsuitable for the piecetable.

    One thing that worries me about UTF-8 is that even if random access is not
    something very useful, I don't think that it would be trivial to change code
    that assumes random access to strings to code that uses only forward &
    backwards iterators, so UTF-32 may still be our best shot here.
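
To make the iterator concern concrete, here is a minimal sketch of what forward and backward stepping over UTF-8 could look like. The helper names (is_continuation, utf8_next, utf8_prev, utf8_offset_of) are hypothetical and not part of the UTF8String class; the code assumes the input is already valid UTF-8 and does no error checking.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// True for UTF-8 continuation bytes (bit pattern 10xxxxxx).
inline bool is_continuation(unsigned char b) { return (b & 0xC0) == 0x80; }

// Forward step: from the start of one character to the start of the next.
// Assumes s holds valid UTF-8 and pos is a character start.
inline std::size_t utf8_next(const std::string& s, std::size_t pos) {
    assert(pos < s.size());
    ++pos;
    while (pos < s.size() && is_continuation(static_cast<unsigned char>(s[pos])))
        ++pos;
    return pos;
}

// Backward step: from pos to the start of the character just before it.
inline std::size_t utf8_prev(const std::string& s, std::size_t pos) {
    assert(pos > 0);
    --pos;
    while (pos > 0 && is_continuation(static_cast<unsigned char>(s[pos])))
        --pos;
    return pos;
}

// "Random access" to the n-th character has to walk from the start,
// so it costs O(n) instead of the O(1) of a fixed-width array.
inline std::size_t utf8_offset_of(const std::string& s, std::size_t n) {
    std::size_t pos = 0;
    while (n > 0 && pos < s.size()) {
        pos = utf8_next(s, pos);
        --n;
    }
    return pos;
}
```

Code that today does str[i] would have to be rewritten around steps like these, or pay for a walk such as utf8_offset_of on every index.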

    UTF-8 has another advantage over UTF-32. In the gtk & qnx frontends, the
    format used to output text is UTF-8, so we won't need to do any conversion
    to get text displayed. win32 uses UTF-16/UCS-2 (depending on the
    registry?), so there you need a conversion from UTF-8 or UCS-4 to
    UCS-2 either way.
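
For the win32 case, the conversion could be sketched roughly as below. This is only an illustration: utf8_to_utf16 is a hypothetical helper, not an existing function in the tree, and it assumes well-formed UTF-8 input (no validation of malformed or overlong sequences). Characters above U+FFFF come out as surrogate pairs, which plain UCS-2 cannot represent.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: convert well-formed UTF-8 to UTF-16 code units.
inline std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t len;
        // The lead byte tells us the sequence length and carries the top bits.
        if (lead < 0x80)      { cp = lead;        len = 1; }
        else if (lead < 0xE0) { cp = lead & 0x1F; len = 2; }
        else if (lead < 0xF0) { cp = lead & 0x0F; len = 3; }
        else                  { cp = lead & 0x07; len = 4; }
        assert(i + len <= in.size());  // truncated input is not handled
        // Each continuation byte contributes its low 6 bits.
        for (std::size_t k = 1; k < len; ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(in[i + k]) & 0x3F);
        i += len;
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Outside the BMP: encode as a high/low surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

Starting from UCS-4 instead, only the second half of the loop (the BMP test and the surrogate split) would be needed.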

    Cheers,

    --
    Joaquin Cuenca Abela
    cuenca@pacaterie.u-psud.fr
    

