Re: commit: UTF-8 recognition patch (2nd attempt)


Subject: Re: commit: UTF-8 recognition patch (2nd attempt)
From: Vlad Harchev (hvv@hippo.ru)
Date: Tue Apr 10 2001 - 11:19:31 CDT


On Tue, 10 Apr 2001, Andrew Dunbar wrote:

 Hi Andrew,

> It looks like my email software trashed the formatting. Here it is
> again as an attachment. By the way, I just tested this patch with the
> native Windows 2000 code page plain text and UTF-8 with the
> following locales: English, Japanese, Greek, Hungarian, Turkish,
> Chinese (China), Chinese (Hong Kong), Korean, Thai, and Arabic.

 How did you test it? Have you already written a patch to select the encoding
of the file, or are you selecting a different "default language" in the Windows
Control Panel? Just curious.
 
> All worked fine except for Korean which seemed to be
> due to something else.
>
> I've noticed these unrelated Windows i18n problems:
>
> * The Korean locale is either being confused with Chinese
> or using the wrong Korean encoding, as any Hangul text I
> load is displayed as 100% Hanja. Does this happen on Unix?

 Chinese people say Chinese support works perfectly under Unix.

> * CJK files need to display with a font that supports
> them. I have to change the font manually each time.
>
> * Exotic locales such as Hindi and Georgian cause
> asserts in libiconv, though files loaded as UTF-8 in
> either locale display fine. This seems to be due to
> Windows supporting them as Unicode-only locales.
 
 Maybe libiconv just doesn't know the encodings used in these locales under the
name cpXXXX.
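
 A quick way to check this guess is to probe the converter with each code page
name and see which ones it rejects. The sketch below is only an illustration:
it uses Python's codec registry as a stand-in for libiconv (with libiconv
itself the analogous probe would be iconv_open), and the function name and the
made-up name "cp99999" are hypothetical.

```python
import codecs


def known_code_pages(names):
    """Split codec names into (known, unknown) lists.

    "Known" here means known to Python's codec registry, which is only
    a stand-in for libiconv's own table of encoding names.
    """
    known, unknown = [], []
    for name in names:
        try:
            codecs.lookup(name)  # raises LookupError for unknown names
            known.append(name)
        except LookupError:
            unknown.append(name)
    return known, unknown


# cp1252 (Western European) and cp932 (Japanese) are real Windows code
# pages; "cp99999" is a made-up name used only to show the failure path.
known, unknown = known_code_pages(["cp1252", "cp932", "cp99999"])
print("known:", known)
print("unknown:", unknown)
```

 A name landing in the "unknown" list would point at a naming or coverage gap
in the converter rather than at the loading code itself.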

> * Complex writing systems like Hindi and Thai have a
> lot of problems with editing. Some problems are
> similar to those with Right to Left languages.
>
> Andrew.

 PS: I personally can't help with anything Windows-related.

 Best regards,
  -Vlad


