Re: AbiSource, Ispell, Aspell, take 2


Subject: Re: AbiSource, Ispell, Aspell, take 2
From: Kevin Atkinson (kevinatk@home.com)
Date: Mon Mar 06 2000 - 04:17:14 CST


Havoc Pennington wrote:
>
> (list will probably bounce my mail, feel free to forward...)
>
> Kevin Atkinson <kevinatk@home.com> writes:
> > <word> can be any one of const char *, const unsigned short *, or
> > const unsigned int *. Strings of const char * are expected to use
> > iso8859-1 or some other 8-bit (256-character) set, as determined by the
> > current language in use. Strings of const unsigned short * and const
> > unsigned int * are expected to be in Unicode.
> >
>
> If you want to support multiple encodings you can't really do it by
> overloading types this way; for example UTF8 (which GTK+ and GNOME
> will use) is a multibyte Unicode encoding that comes in a char*, not an
> unsigned short*/int*.
>
> You will need an enum or something.
>
> However it's probably easier to just do what GTK/GNOME will do, and
> require everything to get converted to UTF8 on its way in to the
> program, then all internal APIs use UTF8.

I will still require unsigned short * and unsigned int * strings to be in
Unicode, but will allow char * strings to be in almost any encoding, as set
by the config class. char * will still default to iso8859-1 or some other
8-bit (256-character) set, as determined by the current language in use,
but it can be set to UTF8.
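
As an illustration of the arrangement described above, here is a minimal
C++ sketch. Every name in it (CharEncoding, Config, to_code_points) is
hypothetical and is not taken from aspell or its actual interface. It also
assumes, purely for the example, that input is normalized to Unicode code
points and that 16-bit input contains no surrogate pairs.

```cpp
#include <cassert>
#include <string>

// Hypothetical names throughout; this is NOT aspell's real API.
enum class CharEncoding { Latin1, Utf8 };

// Stand-in for the "config class": records how char * input is encoded.
struct Config {
    CharEncoding char_encoding = CharEncoding::Latin1;  // 8-bit default
};

// char * overload: interpretation depends on the configured encoding.
std::u32string to_code_points(const Config& cfg, const char* word) {
    assert(word != nullptr);
    const unsigned char* p = reinterpret_cast<const unsigned char*>(word);
    std::u32string out;
    if (cfg.char_encoding == CharEncoding::Latin1) {
        // iso8859-1 bytes map one-to-one onto U+0000..U+00FF.
        for (; *p; ++p) out.push_back(*p);
        return out;
    }
    // Minimal UTF-8 decoder: assumes well-formed input, does no validation.
    while (*p) {
        char32_t cp;
        int extra;  // number of continuation bytes that follow
        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else                          { cp = *p & 0x07; extra = 3; }
        ++p;
        while (extra-- > 0 && *p) {
            cp = (cp << 6) | (*p & 0x3F);
            ++p;
        }
        out.push_back(cp);
    }
    return out;
}

// unsigned short * overload: always Unicode (surrogates not handled here).
std::u32string to_code_points(const Config&, const unsigned short* word) {
    assert(word != nullptr);
    std::u32string out;
    for (; *word; ++word) out.push_back(*word);
    return out;
}

// unsigned int * overload: always Unicode, one code point per element.
std::u32string to_code_points(const Config&, const unsigned int* word) {
    assert(word != nullptr);
    std::u32string out;
    for (; *word; ++word) out.push_back(static_cast<char32_t>(*word));
    return out;
}
```

With a default Config, to_code_points(cfg, "caf\xE9") treats the bytes as
iso8859-1. After setting cfg.char_encoding = CharEncoding::Utf8, the same
word has to be passed as "caf\xC3\xA9" to produce the same code points. The
pointer type alone selects the wide overloads, so only char * needs the
config lookup.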

UTF8 is NOT a nice encoding for storing things internally, as it is not
constant width. Furthermore, my spell checker, aspell, is 8 bit and will
be 8 bit for some time to come. Before telling me all about Unicode,
please see http://aspell.sourceforge.net/international/.

-- 
Kevin Atkinson
kevinatk@home.com
http://metalab.unc.edu/kevina/


