Re[2]: to support CJK(Chinese Japanese Korea)


Subject: Re[2]: to support CJK(Chinese Japanese Korea)
From: hashao (hashao@telebot.com)
Date: Tue Feb 08 2000 - 06:22:35 CST


Hello Henrik,

Monday, February 07, 2000, 12:21:09 AM, you wrote:

>> These two tasks are very simple. Others, such as spell check, printing, and
>> MS Word import, should be modified also.

HB> Spell check is a problem. ispell (the base of the spell checker) is very
HB> 8-bit (even 7-bit) orientated. If anyone has an example of ispell used for
HB> 16bit/multibyte languages, it would be nice for me to get it to study how
HB> it's done.

Since Chinese characters are more atomic, a spelling check like the one
for 8-bit symbolic characters does not make much sense. Meaning can be
made out of a single character or a phrase of several characters.
Moreover, sometimes the meaning of a phrase has little relationship to
the characters that make it up. So you can hardly judge automatically
whether a phrase or a sequence of characters is right or wrong.

Besides, we will have trouble singling out a phrase from a sentence,
since there is no whitespace inside a Chinese sentence. Without some
lexical analysis, some AI work, it is not easy to tell whether a sentence
like 'abcd' is actually 'a bcd' or 'a b c d' or 'ab c d'...
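
To illustrate how quickly this ambiguity grows, here is a minimal Python
sketch that lists every possible way to split an unspaced string into
words. The function name segmentations is made up for this example, and
the Latin letters only stand in for Chinese characters. It does no lexical
analysis at all; it only shows how many candidates such an analysis would
have to choose from.

```python
def segmentations(text):
    """Yield every split of text into contiguous, non-empty pieces.

    Hypothetical helper for illustration only: each of the len(text) - 1
    gaps between characters is either a word boundary or not, so there
    are 2 ** (len(text) - 1) candidates.
    """
    n = len(text)
    if n == 0:
        return
    for mask in range(1 << (n - 1)):
        pieces = []
        start = 0
        for i in range(1, n):
            # Bit i-1 of mask says whether to cut just before text[i].
            if mask & (1 << (i - 1)):
                pieces.append(text[start:i])
                start = i
        pieces.append(text[start:])
        yield pieces


if __name__ == "__main__":
    for pieces in segmentations("abcd"):
        print(" ".join(pieces))
```

Even a four-character sentence gives eight candidates, and the count
doubles with every extra character, so something has to rank them.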

Anyway, a temporary solution could be to disable spell checking when
characters in the Unihan range are encountered. Otherwise, a
whole-sentence-long word in Chinese could confuse ispell badly. (I am not
sure whether disabling it is desirable for Japanese or Korean, though.)
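
A rough sketch of that check, in Python rather than in the word
processor's own code: the names UNIHAN_RANGES, is_unihan and
should_spell_check are hypothetical, and the ranges are my assumption of
what "the unihan range" should cover (the main CJK Unified Ideographs
block, Extension A and the compatibility ideographs). A real
implementation might want a different set of ranges.

```python
# Assumed code point ranges for Han ideographs (inclusive).
UNIHAN_RANGES = (
    (0x3400, 0x4DBF),  # CJK Unified Ideographs Extension A
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0xF900, 0xFAFF),  # CJK Compatibility Ideographs
)


def is_unihan(ch):
    """Return True if the single character ch lies in an assumed range."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in UNIHAN_RANGES)


def should_spell_check(word):
    """Skip any word that contains at least one Han ideograph."""
    return not any(is_unihan(ch) for ch in word)


if __name__ == "__main__":
    for word in ("hello", "\u4e2d\u6587", "ab\u4e2d"):
        print(repr(word), should_spell_check(word))
```

Note that Japanese kana and Korean hangul fall outside these ranges, so
such words would still be passed to the spell checker.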

Best regards,
 hashao mailto:hashao@telebot.com


