Re: support for 32-bit Unicode


Subject: Re: support for 32-bit Unicode
From: Anthony Fok (anthony@thizlinux.com)
Date: Mon Feb 04 2002 - 04:41:43 CST


Hello all,

On Mon, Feb 04, 2002 at 10:02:40AM -0000, Tomas Frydrych wrote:
> I agree that having a 32-bit UT_UCSChar would waste a lot of memory,
> and I would like to see a case made first why we need to support
> 32-bit Unicode.

Sorry for jumping into this discussion; I don't know how this 32-bit
Unicode thread got started. :-) Anyhow, with the release of Unicode 3.1
and ISO/IEC 10646-2, many code points at U+10000 and above are now
assigned, and they can only be handled by a 32-bit representation or by
another Unicode encoding form that reaches beyond 16 bits, such as
UTF-16 with surrogate pairs. One very real use is Big5-HKSCS 2001, the
Hong Kong Supplementary Character Set, updated for ISO/IEC
10646-2:2001. There are some 4000 characters in HKSCS 2001, of which
1651 are now mapped into the U+2xxxx CJK Extension B area. In the 1999
standard, these characters were temporarily placed in the EUDC / PUA.
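
To make the surrogate option concrete, here is a minimal sketch in Python
of how a code point beyond the BMP splits into two 16-bit units, using the
standard UTF-16 arithmetic. The function names are made up for this
example and are not part of AbiWord or any library; U+20000 is used only
because it is the first code point of CJK Extension B.

```python
def to_utf16_units(cp):
    """Return the list of 16-bit UTF-16 code units for one code point."""
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a valid Unicode scalar value")
    if cp < 0x10000:
        # BMP characters fit in a single 16-bit unit.
        return [cp]
    # Supplementary characters: spread the 20-bit offset over two units.
    offset = cp - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return [high, low]


def from_surrogates(high, low):
    """Recombine a high/low surrogate pair into one code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


units = to_utf16_units(0x20000)
print([hex(u) for u in units])        # ['0xd840', '0xdc00']
print(hex(from_surrogates(*units)))   # 0x20000
```

With this scheme the storage type can stay 16 bits wide, at the price
that one character may occupy two units, so any code that indexes or
counts characters by unit has to be aware of the pairs.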

While HKSCS 2001 is still quite new, the need to support Unicode beyond
the Basic Multilingual Plane (BMP, the first 16-bit plane) will keep
growing as operating systems and applications add support for more
minority languages. :-) By the way, full support of HKSCS is a
requirement in many departments of the Hong Kong Government, because
many personal and place names can only be written with characters found
in the HKSCS.

Just my 2 cents,

Anthony (in Hong Kong). :-)

-- 
Anthony Fok Tung-Ling
ThizLinux Laboratory   <anthony@thizlinux.com> http://www.thizlinux.com/
Debian Chinese Project <foka@debian.org>       http://www.debian.org/intl/zh/
Come visit Our Lady of Victory Camp!           http://www.olvc.ab.ca/
