Re: request for help from CJK hackers


Subject: Re: request for help from CJK hackers
From: Vlad Harchev (hvv@hippo.ru)
Date: Fri Nov 10 2000 - 05:47:35 CST


On Fri, 10 Nov 2000, ha shao wrote:

> On Fri, Nov 10, 2000 at 11:38:00AM +0400, hvv@hippo.ru wrote:
> > On Fri, 10 Nov 2000, ha shao wrote:
> >
> > > And I think the importer should also deal with the
> > > \lang \langfe \langnp \langfenp described in
> > > http://msdn.microsoft.com/library/specs/rtfspec_16.htm#rtfspec_21
> > > Those tags are used intensively in word2k.
> >
> > They are useful only as a hint for which dictionary to use, IMO. Currently
> > AW supports only one language per document, so support for these
> > constructs is useless without support from AW's infrastructure.
> >
>
> The problem is that you never know what MS Word will
> produce. I attached a .rtf file produced by Word 2k
> under the English version of win2k with the Chinese locale installed.
> Don't be surprised that it sets both \deflang1033 and \deflangfe1033.
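
 For anyone who wants to check which of these header keywords a given file
carries, here is a minimal sketch of pulling the numeric parameter out of an
RTF control word. rtfControlParam is a hypothetical helper, not AbiWord or wv
code, and it deliberately ignores escaped backslashes and group nesting.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of AbiWord): find the first occurrence of the
// RTF control word `word` (given without the backslash) and store its numeric
// parameter in `out`. Returns false if the word is absent or has no parameter.
bool rtfControlParam(const std::string& rtf, const std::string& word, long& out)
{
    const std::string needle = "\\" + word;
    std::size_t pos = 0;
    while ((pos = rtf.find(needle, pos)) != std::string::npos) {
        std::size_t i = pos + needle.size();
        // Skip longer control words that merely start with `word`,
        // e.g. "\deflangfe" when looking for "\deflang".
        if (i < rtf.size() && std::isalpha(static_cast<unsigned char>(rtf[i]))) {
            pos = i;
            continue;
        }
        const std::size_t start = i;
        if (i < rtf.size() && rtf[i] == '-') ++i;   // optional sign
        const std::size_t digits = i;
        while (i < rtf.size() && std::isdigit(static_cast<unsigned char>(rtf[i]))) ++i;
        if (i == digits) return false;              // control word without a parameter
        out = std::strtol(rtf.substr(start, i - start).c_str(), 0, 10);
        return true;
    }
    return false;
}
```

 Called on a header fragment containing \ansicpg1251\deflang1033\deflangfe1033,
it reports 1251 for "ansicpg" and 1033 for both "deflang" and "deflangfe".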

 Hmm, yes, it has \ansicpg1251 at the beginning :(
 Anyway, we can add support for \langfe and \deflangfe: when we encounter
them, we just call

m_mbtowc.setInCharset(XAP_EncodingManager::instance->
        someMethodThatReturnsCharsetForLanguageCODE(param));
where someMethodThatReturnsCharsetForLanguageCODE is just

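const char* XAP_EncodingManager::someMethodThatReturnsCharsetForLanguageCODE(int LID)
{
 /* Map the Windows language ID to a code page name via wv. */
 const char* cpname = wvLIDToCodePageConverter(LID);
 UT_Bool is_default;
 const char* ret =
  search_map(MSCodepagename_to_charset_name_map,cpname,&is_default);
 /* No entry in the map: fall back to the code page name itself. */
 return is_default ? cpname : ret;
}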
 I would call it XAP_EncodingManager::charsetFromWinLanguageID(int id).
 But the biggest problem is to understand what \langfe and \deflangfe mean
and which one to trust. If you have the time and desire to research this and
produce a patch, feel free to do it. It would be a very useful addition to AW.
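
 To make the two-step lookup concrete, here is a self-contained sketch. Both
tables and the function lidToCodePage are illustrative stand-ins (for
wvLIDToCodePageConverter and MSCodepagename_to_charset_name_map respectively),
not the real wv or AW data; the actual code page name format and map contents
may differ.

```cpp
#include <cstring>

// Illustrative stand-in tables; NOT the actual wv or AbiWord data.
struct LidEntry  { int lid; const char* codepage; };
struct NameEntry { const char* codepage; const char* charset; };

// Stand-in for wvLIDToCodePageConverter(): Windows language ID -> code page name.
const char* lidToCodePage(int lid)
{
    static const LidEntry table[] = {
        {1028, "CP950"},   // Chinese (Traditional, Taiwan)
        {1033, "CP1252"},  // English (United States)
        {1041, "CP932"},   // Japanese
        {1042, "CP949"},   // Korean
        {1049, "CP1251"},  // Russian
        {2052, "CP936"},   // Chinese (Simplified, PRC)
    };
    for (const LidEntry& e : table)
        if (e.lid == lid) return e.codepage;
    return "CP1252";       // assumed fallback for unknown language IDs
}

// Language ID -> charset name: use an override from the name map if there is
// one, otherwise fall back to the code page name itself.
const char* charsetFromWinLanguageID(int lid)
{
    static const NameEntry names[] = {
        {"CP936", "GBK"},   // hypothetical override entries
        {"CP950", "BIG5"},
    };
    const char* cpname = lidToCodePage(lid);
    for (const NameEntry& e : names)
        if (std::strcmp(e.codepage, cpname) == 0) return e.charset;
    return cpname;
}
```

 With these tables, language ID 2052 maps to "GBK" through the override, while
1049 has no override and yields "CP1251" directly.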

> That file makes AbiWord crash on import. It also demonstrates that
> \ansicpg alone will not be enough.

 With my patch posted ten minutes ago, AW doesn't crash on it. Please check how
the content is parsed.
 
> --
> best regard
> ha_shao
>
 Best regards,
  -Vlad


