Re: Character set for plain text files


Subject: Re: Character set for plain text files
From: Vlad Harchev (hvv@hippo.ru)
Date: Sun Apr 08 2001 - 08:11:12 CDT


On Sun, 8 Apr 2001, Andrew Dunbar wrote:

 Hi,

 I'm sorry for not replying to your previous message sooner.
 I can be considered the i18n engineer of the AW project: XAP_EncodingManager
and all the code that uses it were written by me.

> I thought AbiWord was supposed to work with encodings
> and locales other than ISO 8859-1 English but this
> seems not to be the case at least for plain text files.
>
> In xap_EncodingManager.cpp I find this:
>
> const char* XAP_EncodingManager::getNativeEncodingName() const
> {
> return "ISO-8859-1"; /* this will definitely work*/
> }
>
> const char* XAP_EncodingManager::getLanguageISOName() const
> {
> return "en";
> };
>
> So even after my Locale info is loaded by the prefs system all .txt
> files are loaded as ISO 8859-1.
>
> What's the proper way to fix this?

 There is no need to fix it. The functions you quoted are in the /xap/
directory, i.e. they are cross-platform fallbacks. The details they deal with
(locale detection) are not cross-platform, so there is platform-specific code
in subdirectories named after the platforms (e.g. in the subdirectory named
'unix' for unix platforms; the file is XAP_UnixEncodingManager.cpp). AFAIR unix
is the only platform that has an XAP_EncodingManager-derived class providing
full functionality, and it would be nice to bring the other platforms on par
with it.
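
 To illustrate the fallback/override arrangement described above, here is a
rough sketch. It is not the actual AbiWord source: the class names
EncodingManagerSketch and UnixEncodingManagerSketch are made up for the
example, and the use of setlocale() plus nl_langinfo(CODESET) is only an
assumption about how a POSIX build could detect the locale's charset.

```cpp
#include <clocale>
#include <string>
#include <langinfo.h>   // POSIX only: nl_langinfo(), CODESET

// Hypothetical stand-in for the cross-platform class: it knows nothing
// about the platform, so it returns values that are always safe.
class EncodingManagerSketch {
public:
    virtual ~EncodingManagerSketch() {}
    virtual const char* getNativeEncodingName() const { return "ISO-8859-1"; }
    virtual const char* getLanguageISOName() const { return "en"; }
};

// Hypothetical unix subclass: asks the C library for the locale's charset
// and falls back to the base-class answer if nothing usable comes back.
class UnixEncodingManagerSketch : public EncodingManagerSketch {
public:
    UnixEncodingManagerSketch() {
        std::setlocale(LC_ALL, "");              // adopt the user's locale
        const char* codeset = nl_langinfo(CODESET);
        if (codeset && *codeset)
            m_encoding = codeset;
        else
            m_encoding = EncodingManagerSketch::getNativeEncodingName();
    }
    const char* getNativeEncodingName() const override {
        return m_encoding.c_str();
    }
private:
    std::string m_encoding;   // charset name detected at construction time
};
```

 Code that only holds a pointer or reference to the base class gets the
platform's answer whenever the platform-specific object was instantiated, and
the ISO-8859-1 fallback otherwise.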

 As for your original message (the ability for the user to select the encoding
of the file): I think it may be a good idea (though in the unix world it's easy
to achieve this by using the iconv utility directly). As for the implementation:
you should use the text importer class residing in wp/impex/xp/ie_imp_Text.cpp,
and then, if the user selected an encoding other than the current locale's one,
call
        m_mbtowc.setInCharset(charsetname);
before looping over all the characters in the file. I think just one more
argument should be added to the text importer class's constructor, specifying
the encoding of the text, with NULL as the default value. That's it! The major
work required is implementing a platform-specific GUI for selecting a charset
from a list (i.e. an implementation is needed for each platform :( ).
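
 The constructor change suggested above could look roughly like the sketch
below. Only the member name m_mbtowc and the setInCharset() call come from
this message; MbtowcSketch and TextImporterSketch are hypothetical stand-ins,
not the real classes in ie_imp_Text.cpp, and passing the native charset into
the constructor is an assumption made to keep the example self-contained.

```cpp
#include <cstring>
#include <string>

// Hypothetical stand-in for the multibyte-to-wide converter. It only
// records which input charset it has been told to use.
class MbtowcSketch {
public:
    explicit MbtowcSketch(const char* nativeCharset)
        : m_charset(nativeCharset) {}
    void setInCharset(const char* charset) { m_charset = charset; }
    const std::string& inCharset() const { return m_charset; }
private:
    std::string m_charset;
};

// Hypothetical text importer with the proposed extra argument.
// nativeCharset must be non-NULL; encoding == NULL means "use the
// current locale's charset".
class TextImporterSketch {
public:
    explicit TextImporterSketch(const char* nativeCharset,
                                const char* encoding = NULL)
        : m_mbtowc(nativeCharset)
    {
        // Switch the converter only when the user picked something
        // different from the locale's charset.
        if (encoding && std::strcmp(encoding, nativeCharset) != 0)
            m_mbtowc.setInCharset(encoding);
    }
    const std::string& inputCharset() const { return m_mbtowc.inCharset(); }
private:
    MbtowcSketch m_mbtowc;
};
```

 With this shape, existing callers keep working unchanged because of the
default value, and only the code behind the new charset dialog would pass a
second argument such as "KOI8-R".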
 
> Andrew.

 Best regards,
  -Vlad


