Re: from AbiWord to AbiSuite [was Re: A new draw on XP refactoring


Subject: Re: from AbiWord to AbiSuite [was Re: A new draw on XP refactoring
From: Andrew Dunbar (hippietrail@yahoo.com)
Date: Sun Feb 03 2002 - 10:35:19 CST


--- Mike Nordell <tamlin@algonet.se> wrote:
> Martin Sevior wrote:
>
> > 2. 16 bit unsigned char => 32 bit unsigned char to allow 100% unicode
> > compliance.
>
> I take it you mean UTF-32 compliance here, meaning AbiWord is to only use
> UTF-32 (unicode.org) or UCS-4 (ISO/IEC 10646-1:2000)?

I've already discussed this with dom a couple of times.
What we need to do is support the full 32-bit Unicode
character set, but we shouldn't use UTF-32 to do it:
that would waste vast amounts of memory, since
characters outside the 16-bit range are very rare.
Instead we need to switch to UTF-8 internally for
everything. This is the right answer for several
reasons, all of which have been covered in depth on
several mailing lists about Unicode issues and should
be easy to find with Google.
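
To make the memory argument concrete, here is a minimal sketch that counts the bytes a few strings occupy in UTF-8, UTF-16 and UTF-32. It is an illustration in Python, not AbiWord code; the sample strings and the helper name `encoded_sizes` are made up for the example.

```python
# Hypothetical sample strings, chosen only to show the size trade-off.
SAMPLES = {
    "English": "The quick brown fox",
    "Greek": "Γειά σου Κόσμε",
    "Japanese": "こんにちは世界",
    "Beyond 16 bits": "\U0001D11E",  # MUSICAL SYMBOL G CLEF
}


def encoded_sizes(text):
    """Return the byte count of text in each encoding form (no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


if __name__ == "__main__":
    for label, text in SAMPLES.items():
        print(label, len(text), "code points:", encoded_sizes(text))
```

UTF-32 always costs four bytes per character. UTF-8 costs one byte for ASCII and up to four for characters beyond the 16-bit range, although for some scripts (the Japanese sample, for instance) it comes out larger than UTF-16, so the saving depends on the text.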

> May I ask exactly _how_ we're supposed to display e.g. UCS-4 on Win32? :-)

That's a different issue. Win32 can display 32-bit
Unicode characters using its 16-bit encoding with
surrogate pairs (strictly UTF-16 rather than UCS-2).
Our string classes will handle the conversion like any
other encoding conversion. Some versions of Windows
may require a registry change or DLL update before
32-bit Unicode will display. Without it, strings won't
get stomped but will show the occasional "unknown
character" glyph.
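
For reference, the surrogate conversion itself is only a little arithmetic. The sketch below is a Python illustration, not the AbiWord string classes; the function names are hypothetical. It splits a code point above U+FFFF into a high and low surrogate and joins them back.

```python
def to_surrogate_pair(code_point):
    """Split a code point in U+10000..U+10FFFF into (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point does not need a surrogate pair")
    offset = code_point - 0x10000          # 20 significant bits remain
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


def from_surrogate_pair(high, low):
    """Recombine a (high, low) surrogate pair into a single code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


if __name__ == "__main__":
    high, low = to_surrogate_pair(0x1D11E)
    print(hex(high), hex(low))
```

Characters at or below U+FFFF pass through as a single 16-bit unit, so only the rare characters above that range pay for the extra unit.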

> You don't happen to know of a freely available 32-bit TrueType font (less
> than 12 GB in size)? :-)

This is yet another separate issue. Nothing says a
document has to use a single font for all of its
languages.

Andrew Dunbar.

=====
http://linguaphile.sourceforge.net http://www.abisource.com




This archive was generated by hypermail 2b25 : Sun Feb 03 2002 - 10:35:22 CST