Re: New charset support ?


Subject: Re: New charset support ?
From: Vlad Harchev (hvv@hippo.ru)
Date: Tue Apr 03 2001 - 04:07:52 CDT


On Tue, 3 Apr 2001, Hubert Figuiere wrote:

 Hi,

> In order to support all RTF charsets, I need to be able to convert from
> Codepage 437, 850 and from MacRoman charset to UNICODE. As I understand,
> currently only the charsets in wv/iconv are supported. How shall I proceed
> to have more ?
>
> Those charsets are in iconv.

 No, wv/iconv shouldn't even be compiled in - if it is, it's a bug (it shadows
iconv() from libiconv or libc's iconv).
 
 In theory, AbiWord already supports any charset when importing RTFs - see the
RTF importer. It uses XAP_EncodingManager::charsetFromCodepage to get the name
of the encoding for the codepage number the RTF text is in, and converts all
characters to Unicode using iconv. If iconv() doesn't know an encoding under
a name of the form CPXXXX for some codepage XXXX, then appropriate entries
should be added to XAP_EncodingManager.cpp:MSCodepagename_to_charset_name_map.
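
 To illustrate the idea, here is a minimal sketch of such a lookup with a
fallback to the CPXXXX form. It is not the actual AbiWord code: the table
kCodepageAliases and the helper charsetNameForCodepage are hypothetical names,
and the single entry (codepage 10000 mapped to "MACINTOSH" for MacRoman) is an
assumption about which name the local iconv accepts.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the map in XAP_EncodingManager.cpp: it translates
// a "CPxxxx" name, built from the RTF codepage number, into a charset name
// that iconv is assumed to know. The entry below is an illustrative
// assumption, not copied from AbiWord.
static const std::map<std::string, std::string> kCodepageAliases = {
    {"CP10000", "MACINTOSH"},  // MacRoman; assumes iconv knows "MACINTOSH"
};

// Hypothetical helper (not the real charsetFromCodepage): returns the alias
// if one is registered, otherwise the plain "CPxxxx" name.
std::string charsetNameForCodepage(int codepage) {
    const std::string cpName = "CP" + std::to_string(codepage);
    const auto it = kCodepageAliases.find(cpName);
    return it != kCodepageAliases.end() ? it->second : cpName;
}
```

 With this sketch, charsetNameForCodepage(437) falls through to "CP437", so
a codepage needs a table entry only when iconv does not recognize its CPXXXX
name.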

 Importing RTFs whose encoding differs from the locale's encoding works fine,
at least for Russian and Chinese.

> Hub
>

 Feel free to ask for any help.

 Best regards,
  -Vlad


