Re: request for help from CJK hackers


Subject: Re: request for help from CJK hackers
From: Vlad Harchev (hvv@hippo.ru)
Date: Thu Nov 09 2000 - 04:55:32 CST


On Thu, 9 Nov 2000, Vlad Harchev wrote:

> On Thu, 9 Nov 2000, Chih-Wei Huang wrote:
> > [...]
> > 1. Some Chinese characters are eaten or misinterpreted.
> > After a quick analysis, I found that when the highest bit of the second byte
> > of a Big5 character is 0, the character is exported incorrectly.
> > For example, the character 'ĪJ' 0xa44b
> > is saved as \'a4J. Even if I hack it into \'a4\'4b in the RTF,
> > the import is still incorrect.
>
> Hmm - that seems to be a weird problem. Could you please apply the fix to the RTF
> importer posted a couple of minutes ago? I can't think of a reason why it

 Oops. I meant the fix for the RTF exporter.

> works incorrectly for you.
>
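
 A minimal sketch (in Python, not AbiWord's actual exporter code) of one
 conservative way to write double-byte text into RTF: every byte of a
 double-byte character is emitted as \'hh, including a second byte whose
 highest bit is 0. The function name and the lead-byte range 0x81-0xFE are
 assumptions made for illustration.

```python
def rtf_escape_bytes(data: bytes) -> str:
    """Escape already-encoded (e.g. Big5) text for an RTF body.

    Assumption: any byte >= 0x81 followed by another byte is the lead
    byte of a double-byte character.
    """
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b >= 0x81 and i + 1 < len(data):
            # Escape both bytes, even when the trail byte is plain ASCII.
            out.append("\\'%02x\\'%02x" % (b, data[i + 1]))
            i += 2
        elif b in b"\\{}":
            # RTF control characters must be backslash-escaped.
            out.append("\\" + chr(b))
            i += 1
        elif b < 0x80:
            out.append(chr(b))
            i += 1
        else:
            # Stray high byte with nothing after it: escape it alone.
            out.append("\\'%02x" % b)
            i += 1
    return "".join(out)
```

 With this, the bytes 0xa4 0x4b come out as \'a4\'4b rather than \'a4
 followed by a bare ASCII letter.
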
> > 2. The exported RTF cannot be read by MSWord 2000.
> > None of the Chinese characters are displayed.
>
> Try the following:
> Substitute all \fcharset0 with \fcharset134 (if your text is in GB2312), or
> change the argument of \fcharset to the one Word 2000 uses.
> Also, please save a small RTF file exported by AW and an RTF file saved by
> Word2k and send them to me.

 Also ensure that the CJK chars are in the proper encoding.
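
 The substitution can be sketched as a small Python script. The helper name
 and the table are made up for illustration; 134 is the GB2312 value
 mentioned above, and 136 is, to my knowledge, the value used for Big5.
 Note that this rewrites every font table entry, including fonts used only
 for Latin text.

```python
import re

# \fcharsetN values (hypothetical lookup table for this sketch).
FCHARSET = {"gb2312": 134, "big5": 136}

def set_fcharset(rtf: str, encoding: str) -> str:
    """Replace every \\fcharset0 in an RTF string with the CJK charset."""
    n = FCHARSET[encoding]
    # (?!\d) stops \fcharset0 from matching inside a longer number.
    return re.sub(r"\\fcharset0(?!\d)",
                  lambda m: "\\fcharset%d" % n,
                  rtf)
```

 For example, set_fcharset(text, "big5") would be the call for a Big5
 document, where text holds the contents of the exported file.
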

>
> > 3. I created an RTF file with MSWord 2000 and opened it in AW;
> > AW crashed immediately:
> >
> > ** ERROR **: file ie_imp_RTF.cpp: line 492 (UT_Bool
> > IE_Imp_RTF::PopRTFState()): assertion failed: (pState != NULL)
> > aborting...
> > Aborted
>
> It seems AW encountered an extra "}" in the RTF (not that the RTF is bad, but AW
> didn't count { and } properly). Please find out where this happens. I guess it
> happens when parsing \fonttbl, i.e. at the very beginning.
>
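
 To check that the file itself is balanced, and to see what a correct count
 looks like, here is a rough Python sketch. It is not taken from
 ie_imp_RTF.cpp; the function name is made up. It skips backslash-escaped
 characters such as \{ and \}, and it assumes the file contains no raw
 binary data, which could hold stray braces.

```python
def find_unbalanced_brace(rtf: str):
    """Return the offset of the first '}' with no matching '{', else None."""
    depth = 0
    i = 0
    while i < len(rtf):
        c = rtf[i]
        if c == "\\" and i + 1 < len(rtf):
            # A backslash escapes the next character (\{, \}, \\) or
            # starts a control word; either way that character is not
            # a group delimiter, so skip both.
            i += 2
            continue
        if c == "{":
            depth += 1
        elif c == "}":
            if depth == 0:
                return i
            depth -= 1
        i += 1
    return None
```

 If this returns None for the Word2k file, the RTF is fine and the extra
 "}" comes from the importer's own counting.
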
> > 4. Copy & paste still didn't work...
>
> Should be fixed once the fix posted a couple of minutes ago is applied.
>
> Just curious - how does AW import Word2k's .doc files?

 I meant .doc files with CJK chars.

 Best regards,
  -Vlad


