Re: MSWord DOC


Subject: Re: MSWord DOC
From: Chih-Wei Huang (cwhuang@linux.org.tw)
Date: Sat Nov 11 2000 - 09:55:17 CST


Vlad Harchev wrote:
>
> > OK! Yes, it works!
> > Now importing an MSWord2k doc no longer crashes.
> Did CJK letters get imported correctly?

Oh, yes, sure! I forgot to say that...:)
 
> IMO this is not an AW bug but a bug in the wv importer. Word2k (and Word97) use
> a completely different format from Word 6.0 (aka Word95), which is the format
> WordPad saves in. AW uses wv for reading .doc files. There are a lot of
> converters (from .doc to .html, .latex, .rtf) shipped with wv (e.g. wvLatex).
> Could you try them with this file and tell me whether they handle the Chinese
> content properly?
>
> When I imported your file and saved it in .abw format, there was the
> following data in it:
> ꒤ꓥwordꓩꓥ
> Could you tell me what ꒤ is? Is it a raw Big5 character code (i.e.,
> not Unicode, but Big5)?

Yes! They are Big5 codes, exactly the Big5 characters I saved.
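
To illustrate the check being discussed, here is a minimal Python sketch. It assumes each odd-looking character in the .abw data is a two-byte Big5 code that was stored as if it were a single Unicode code point. Under that assumption, splitting the code point back into two bytes and decoding them as Big5 should give readable Chinese. The name `recover_big5` is hypothetical and is not part of AW or wv.

```python
def recover_big5(text: str) -> str:
    """Reinterpret code points that are really two-byte Big5 codes."""
    out = []
    for ch in text:
        cp = ord(ch)
        # Big5 lead bytes are in 0x81-0xFE, so only code points whose
        # high byte is in that range are candidates.
        if 0x8100 <= cp <= 0xFEFF:
            raw = bytes([cp >> 8, cp & 0xFF])
            try:
                out.append(raw.decode("big5"))
                continue
            except UnicodeDecodeError:
                pass  # not a valid Big5 pair, keep the character unchanged
        out.append(ch)
    return "".join(out)


# The string quoted from the .abw file, written with escapes:
print(recover_big5("\ua4a4\ua4e5word\ua4e9\ua4e5"))
```

If the output is readable Chinese around "word", the data are Big5 codes and not real Unicode text.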

> If so, there is a good chance that the language code is not saved properly
> in the Word 6.0 format (or that there is a "CJK flag", for example). So, if
> these chars really are Big5 codes, I think we have the following options:
> 1) add a very dirty hack that, under CJK environments, treats an unspecified
> language code as the language code of the current locale's charset.
> 2) add CJK support to wv. Dom Lachowich on the AW mailing list is the current
> maintainer of wv, so feel free to talk with him (or to debug wv and send him
> patches).
> The second option is much preferable :)

Yes. However, I know nothing about wv.
So what should I do now?
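
For reference, option 1 above might look roughly like the following Python sketch. It is not wv or AW code. The function `pick_charset`, the list of CJK codec names, and the `cp1252` fallback are all assumptions made for illustration. The idea is that when the document gives no usable charset, the importer falls back to the current locale's charset, but only if that locale is a CJK one.

```python
import codecs
import locale

# Normalized Python codec names for some common CJK charsets
# (an illustrative list, not exhaustive).
CJK_CODECS = {"big5", "big5hkscs", "gb2312", "gbk", "gb18030",
              "euc_jp", "shift_jis", "euc_kr"}


def pick_charset(doc_charset=None, locale_charset=None, fallback="cp1252"):
    """Choose the charset used to decode 8-bit text from a document.

    doc_charset: charset derived from the document's language code,
                 or None when the document does not specify one.
    locale_charset: charset of the current locale; looked up if None.
    fallback: assumed default for non-CJK locales (hypothetical choice).
    """
    if doc_charset:
        return doc_charset  # the document said what it is, trust it
    if locale_charset is None:
        locale_charset = locale.getpreferredencoding(False)
    try:
        name = codecs.lookup(locale_charset).name
    except LookupError:
        return fallback
    # The "dirty hack": unspecified language + CJK locale => locale charset.
    return name if name in CJK_CODECS else fallback


raw = b"\xa4\xa4\xa4\xe5"  # two Big5 characters as raw bytes
print(raw.decode(pick_charset(None, "Big5")))
```

The obvious weakness is that a Big5 document opened under a non-CJK locale would still be decoded wrongly, which is why fixing wv itself is the better route.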
 
> BTW: .doc files with Russian text saved by WordPad import fine into AW. So it
> seems this problem exists only for CJK .doc files.
>
> Feel free to post this message to AW mailing list.

-- 
   ~     Chih-Wei Huang (cwhuang)
  'v'    E-Mail       : cwhuang@linux.org.tw
 // \\   CLDP Project : http://www.linux.org.tw/CLDP/ (Coordinator)
/(   )\  CLE  Project : http://cle.linux.org.tw/CLE/  (Developer)
 ^`~'^   HomePage     : http://www.cwhuang.idv.tw/


