Re: commit: abi: UTF8String class

From: Joaquin Cuenca Abela (cuenca@pacaterie.u-psud.fr)
Date: Sun Apr 21 2002 - 07:22:23 EDT

    ----- Original Message -----
    From: "Tomas Frydrych" <tomas@frydrych.uklinux.net>
    To: "Andrew Dunbar" <hippietrail@yahoo.com>
    Cc: <abiword-dev@abisource.com>
    Sent: Saturday, April 20, 2002 9:33 PM
    Subject: Re: commit: abi: UTF8String class

    >
    > > Andrew Dunbar <hippietrail@yahoo.com> wrote:
    >
    > > Well pretty soon we're going to need a real
    > > replacement. Dom and I are both in favour of the
    > > replacement being UTF-8 but some here seem to want
    > > UTF-32.
    >
    > UTF-8 is an encoding scheme that is intended to allow Unicode
    > communication between separate processes over 8-bit channels.
    > For that it is great, but that's about the only thing it is really good
    > for. UTF-8 processing is cumbersome, and as such it is a completely
    > unsuitable format to use for the piecetable. We need a fixed-width
    > encoding for that, such as the current UCS-2, i.e., UTF-32.

    Tomas, I think you're confusing the intent of UTF-8 with that of UTF-7.
    Can you explain how UTF-8 helps with communication over 8-bit
    channels?
    You can also send UTF-32 (or any other encoding) over 8-bit channels
    by splitting each character into 4 bytes.
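
To illustrate the point about sending UTF-32 over an 8-bit channel, here is a minimal sketch in Python. The helper names `chop_into_bytes` and `reassemble` are hypothetical, chosen only for this example, and big-endian byte order is an assumption; nothing here comes from the AbiWord sources.

```python
def chop_into_bytes(text):
    """Split each code point into 4 big-endian bytes (UTF-32BE by hand)."""
    out = bytearray()
    for ch in text:
        out += ord(ch).to_bytes(4, "big")
    return bytes(out)


def reassemble(data):
    """Rebuild the string from consecutive 4-byte groups."""
    return "".join(
        chr(int.from_bytes(data[i:i + 4], "big"))
        for i in range(0, len(data), 4)
    )


sample = "A\u00f1\u20ac"          # 'A', n with tilde, euro sign
wire = chop_into_bytes(sample)     # a plain byte stream, 4 bytes per character
restored = reassemble(wire)
```

Every value in `wire` fits in one byte, so any channel that carries 8-bit bytes intact can carry it, just as it can carry UTF-8.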

    The only thing that protects your text from (broken) mail servers and
    similar software that strips the 8th bit of each byte is UTF-7; UTF-8
    is completely unrelated to that problem.
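
The difference can be sketched as follows, assuming a hypothetical `strip_high_bit` function that stands in for a broken 7-bit mail relay. It uses Python's built-in `utf-7` and `utf-8` codecs; the sample string is arbitrary.

```python
def strip_high_bit(data):
    """Simulate a 7-bit channel that clears the top bit of every byte."""
    return bytes(b & 0x7F for b in data)


sample = "a\u00f1o"                    # contains one non-ASCII character

utf7_wire = sample.encode("utf-7")     # uses only 7-bit byte values
utf8_wire = sample.encode("utf-8")     # non-ASCII characters set the 8th bit

utf7_after = strip_high_bit(utf7_wire)
utf8_after = strip_high_bit(utf8_wire)

utf7_survives = utf7_after.decode("utf-7") == sample
utf8_survives = utf8_after != utf8_wire and False or utf8_after == utf8_wire
```

Under these assumptions the UTF-7 bytes pass through unchanged and decode back to the original, while the UTF-8 bytes are altered by the relay.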

    Cheers,

    --
    Joaquin Cuenca Abela
    cuenca@pacaterie.u-psud.fr
    

