From: Andrew Dunbar (hippietrail@yahoo.com)
Date: Sun Apr 21 2002 - 09:59:10 EDT
--- F J Franklin <F.J.Franklin@sheffield.ac.uk> wrote:
> I think Java uses UTF-16.
I bet that older versions of Java used UCS-2 and newer
versions use UTF-16 since they're mostly compatible,
but that's just a guess.
> UTF-32 is a subset of UCS-4, I believe. My
> impression is that UTF-8 and
> UCS-4 have less to do with UNICODE than UTF-16 and
> UTF-32.
I still don't know exactly how all of them relate.
This is an annoyance of Unicode - many things at least
seem muddy to those of us on the outside.
UCS is "Universal Character Set", which basically
means the mapping of a number onto a character. As
such UCS-2 *should* mean the old 16-bit range of
Unicode characters and UCS-4 *should* mean the new
(up to) 32-bit range of Unicode characters. UCS does
not specify how this number is encoded into various
sequences of bits/bytes/words/etc.
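To make the "number onto a character" idea concrete, here is a minimal Python sketch. The helper name describe() is made up for illustration; ord() and unicodedata.name() are from the Python standard library. It only shows the mapping itself and says nothing about how the number is stored.

```python
import unicodedata

def describe(ch):
    """Return the UCS number (code point) and official name of one character."""
    cp = ord(ch)  # the character's number, independent of any encoding
    return f"U+{cp:04X} {unicodedata.name(ch)}"

def fits_in_16_bits(ch):
    """True if the character's number lies in the old 16-bit range."""
    return ord(ch) <= 0xFFFF

print(describe("A"))
print(describe("\u20ac"))
print(fits_in_16_bits("\U0001D11E"))
```

The last call uses U+1D11E, a character outside the 16-bit range, so a purely 16-bit scheme cannot represent it with a single unit.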
UTF is "Unicode Transformation Format", which basically
means the algorithm used to encode the UCS number of
each character into one or more units. UTF-8
encodes any UCS character index into one or more 8-bit
units. UTF-16 encodes a UCS character index into one
or two 16-bit units. UTF-32 encodes any UCS
character index into one (and no more) 32-bit unit.
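As a rough illustration of the difference between the three, this Python sketch lists the code units each UTF produces for a single character. The function name code_units() is hypothetical; it relies only on the standard str.encode() codecs, using the big-endian variants so that no byte order mark is added.

```python
def code_units(ch):
    """Return the code units for one character in UTF-8, UTF-16 and UTF-32."""
    def split(raw, width):
        # Group the big-endian byte string into integers of `width` bytes each.
        return [int.from_bytes(raw[i:i + width], "big")
                for i in range(0, len(raw), width)]
    return {
        "utf-8": split(ch.encode("utf-8"), 1),
        "utf-16": split(ch.encode("utf-16-be"), 2),
        "utf-32": split(ch.encode("utf-32-be"), 4),
    }

for ch in ("A", "\u20ac", "\U0001D11E"):
    units = code_units(ch)
    print(f"U+{ord(ch):04X}", {name: [hex(u) for u in v] for name, v in units.items()})
```

For U+20AC this gives three 8-bit units, one 16-bit unit and one 32-bit unit; for U+1D11E it gives four 8-bit units, two 16-bit units (a surrogate pair) and still a single 32-bit unit.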
That ought to be the definition but in practice or at
least in common usage the terms all get muddied ):
> I was using:
>
> http://www.cl.cam.ac.uk/~mgk25/ucs/ISO-10646-UTF-8.html
>
> Some other links for the curious:
> http://czyborra.com/utf/
> http://www.tldp.org/HOWTO/Unicode-HOWTO-1.html
> http://www.cl.cam.ac.uk/~mgk25/unicode.html
Thanks for the pointers! Here are some I find
useful:
http://mail.nl.linux.org/linux-utf8/ (not just UTF-8)
http://archive.develooper.com/perl6-internals-unicode@perl.org/
(old but it covers many issues we
are going to be grappling with)
http://mail.gnome.org/archives/gtk-i18n-list/ (Pango
discussion takes place here)
> Andrew, thanks for the answer. Personally I see no
> problem using UTF-8
> internally, but I don't do piecetable work so it's
> not really my call. I
> wasn't trying to preempt the decision; the new class
> is just a utility.
Neither do I. I'm eagerly waiting to hear what specifics
Dom has to say on Monday...
Frank, I know you weren't preempting anything but I'm
glad it came up since now is the time it needs to come
up (:
Andrew Dunbar.
> Frank
>
> Francis James Franklin
> F.J.Franklin@shef.ac.uk
>
> "No, she really likes me. She told me I look like
> Britney Spears, and why
> would you say that to somebody you don't like?"
>
> --- Elle Woods
>
>
=====
http://linguaphile.sourceforge.net http://www.abisource.com
This archive was generated by hypermail 2.1.4 : Sun Apr 21 2002 - 10:00:17 EDT