Re: announce of patch to support CJK in AW


Subject: Re: announce of patch to support CJK in AW
From: Vlad Harchev (hvv@hippo.ru)
Date: Sun Oct 29 2000 - 02:38:17 CST


On Sun, 29 Oct 2000, hj wrote:

 Hi,

 Has anybody tried it under a CJK locale? (I haven't.)
 Are there any problems with it?

> It would be best if all non-English locales saved Unicode in *.abw rather
> than multibyte strings (mbs). Then, in the future, we'll be able to display
> characters from several languages in one document.
>
> ----- Original Message -----
> From: Vlad Harchev <hvv@hippo.ru>
> To: <abiword-dev@abisource.com>
> Cc: Belcon <rainfall@yeah.net>; hj <huangj@citiz.net>; Chih-Wei Huang
> <cwhuang@linux.org.tw>
> Sent: October 28, 2000 3:33
> Subject: announce of patch to support CJK in AW
>
>
> > Here is the location of the patch:
> > http://www.hippo.ru/~hvv/abiword/aw-cjk.diff.gz
> >
> > This patch can be cleanly applied over vanilla 0.7.11 patched with the
> > following patches (I hope they are still there):
> > ftp://seviorpc.ph.unimelb.edu.au/pub/abi-oct24-cvs.patch.gz
> > ftp://seviorpc.ph.unimelb.edu.au/pub/wv-oct24-cvs.patch.gz
> >
> > What's there:
> > 100% of HJ's patch logic is there. The code and logic were greatly cleaned
> > up compared to the original patch and should compile (I didn't test) on any
> > platform (HJ's patch made use of glib in xp code). Also, HJ's logic is
> > disabled if the current locale is not a CJK one. That was extensively
> > tested with Dom's German document and (in full, all aspects) with Russian.
> >
> > The added functionality:
> > * UT_Wctomb and UT_Mbtowc are now used for converting between various
> > charsets. They use iconv internally (so they now work, are portable, and
> > also allow choosing the input/output encoding). Every use of iconv
> > (except in wv) correctly swaps the bytes of UCS (the correct order is
> > detected at runtime).
> >
> > * Thanks to HJ, AW emits only the necessary fonts to the .ps when printing.
> > This reduced the size of the .ps file generated by AW by a factor of 5 for
> > one font-rich document of mine (a title for a paper).
> >
> > * Fixed bug with spellchecker ("replace" button didn't work).
> >
> > * AW now looks for more files besides
> > ${prefix}/AbiSuite/AbiWord/system.profile: it also looks for
> > ${prefix}/AbiSuite/AbiWord/system.profile-${SUFFIX}, for the following
> > values of ${SUFFIX}:
> > 'language', 'charset', 'language-Country', 'language-Country.charset'
> > This allows shipping language-specific defaults (e.g. metric system or
> > the name of the spellchecker dictionary).
> >
> > * As for fonts, AW now tries to load fonts from the following
> > subdirectories of ${prefix}/AbiSuite/fonts:
> > 'language', 'charset', 'language-Country', 'language-Country.charset'
> > This should solve CJK people's problems (before this patch, AW was
> > looking only in the 'charset' subdirectory).
> > Fonts in 'fonts.dir' format should be placed in them. Under CJK locales,
> > the file with the list of fonts is also named 'fonts.dir', but it has the
> > same extended format as HJ's 'fonts.hj'. Keep this in mind when trying it.
> >
> > * If GNOME_XML2 is undefined, an UnknownEncodingHandler is set on the XML
> > parser in ie_imp_XML.cpp
> >
> > * Some translators' names were added to CREDITS.TXT
> >
> > * Just to underline: support for "wrap-at-any-CJK-letter" logic of layout
> > is already there too (thanks to HJ).
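For anyone who hasn't looked at HJ's layout code: below is a rough sketch of what "wrap at any CJK letter" can mean. It is not the code from the patch. The function names and the code-point ranges are assumptions picked for illustration, and a real layout engine also has to handle punctuation that must not start or end a line.

```python
def is_cjk(ch):
    """Rough CJK test by code point (illustrative ranges, not exhaustive)."""
    cp = ord(ch)
    return (
        0x4E00 <= cp <= 0x9FFF      # CJK Unified Ideographs
        or 0x3040 <= cp <= 0x30FF   # Hiragana and Katakana
        or 0xAC00 <= cp <= 0xD7A3   # Hangul syllables
    )


def break_opportunities(text):
    """Return every index i where a line may be broken before text[i].

    Non-CJK text breaks only after a space; a CJK letter allows a
    break on either side of it.
    """
    breaks = []
    for i in range(1, len(text)):
        prev, cur = text[i - 1], text[i]
        if cur == " ":
            continue  # never start a line with a space
        if prev == " " or is_cjk(prev) or is_cjk(cur):
            breaks.append(i)
    return breaks


if __name__ == "__main__":
    print(break_opportunities("ab cd"))
    print(break_opportunities("中文ab"))
```

With these rules, Latin words stay whole while a run of CJK letters can be split anywhere, which is why the logic only matters under CJK locales.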
> >
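To make the system.profile lookup from the list above concrete, here is a minimal sketch that builds the candidate file names for one locale. The helper names (locale_suffixes, profile_candidates), the example prefix and the example locale values are hypothetical. The announcement does not say in which order the files are read or which one wins, so the order below simply follows the order of the ${SUFFIX} list.

```python
def locale_suffixes(language, country, charset):
    """The four ${SUFFIX} values, in the order the announcement lists them."""
    return [
        language,
        charset,
        f"{language}-{country}",
        f"{language}-{country}.{charset}",
    ]


def profile_candidates(prefix, language, country, charset):
    """Base system.profile plus one candidate per suffix.

    The lookup order used by AbiWord itself is an assumption here.
    """
    base = f"{prefix}/AbiSuite/AbiWord/system.profile"
    suffixes = locale_suffixes(language, country, charset)
    return [base] + [f"{base}-{suffix}" for suffix in suffixes]


if __name__ == "__main__":
    # Example values only; the exact spelling AW expects is not stated.
    for path in profile_candidates("/usr/local", "zh", "TW", "BIG5"):
        print(path)
```

The same four names are what the patch uses for the font subdirectories under ${prefix}/AbiSuite/fonts, so locale_suffixes would serve there too.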
> > I think that this patch can be committed since it doesn't break anything
> > under non-CJK locales (at least if you did 'make clean' after applying).
> >
> > I ask CJK people to test the following, in order of priority:
> > * Input of CJK letters in various charsets. It should work. Make doubly sure
> > that gtk+ is installed correctly, that fonts are in the font path, etc.
> > Currently you have to have all CJK fonts AW uses available in the font path
> > before the AbiWord wrapper starts (it's not yet updated to look into all
> > the subdirectories AW now looks in).
> > * Cutting and pasting immediately. If this doesn't work, then test the RTF
> > importers and exporters (AW uses them internally for cut/paste). Very minor
> > tweaks would be needed to make it work if it doesn't. If cutting/pasting
> > works, try exporting/importing RTF and testing it with other apps (e.g.
> > Word).
> > * Cutting/pasting to/from other apps and saving/reading plain text files.
> > It should just work.
> > * Saving and loading of the native AW file format. Should work. If import of
> > .abw doesn't work, then try removing "encoding=FOO" from its 1st line.
> > * Printing. Since 100% of HJ's logic is there, it should just work.
> > * Export to HTML. Should work. If it doesn't, tell me what changes are needed
> > to make browsers understand it (keep in mind that the XHTML importer should
> > be able to read the produced file).
> > * Checking that export of CJK texts to LaTeX works if the correct prologue is
> > added to the exported document. That prologue should be added to the tables
> > in xap_EncodingManager.cpp
> > * Import of CJK .doc files (most probably it won't work due to wv's
> > single-byte encoding limitations). wv should be hacked to allow importing
> > .doc files.
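For the .abw workaround in the list above (removing "encoding=FOO" from the first line), here is a small sketch of doing it mechanically instead of by hand. The function name strip_xml_encoding is hypothetical, and it assumes the file starts with an ordinary XML declaration on its first line; anything else is passed through unchanged.

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration,
# e.g. ' encoding="FOO"' or " encoding='FOO'".
_ENCODING_ATTR = re.compile(r'''\s+encoding\s*=\s*(["'])[^"']*\1''')


def strip_xml_encoding(text):
    """Remove encoding=... from the XML declaration on the first line."""
    first, newline, rest = text.partition("\n")
    if first.lstrip().startswith("<?xml"):
        first = _ENCODING_ATTR.sub("", first, count=1)
    return first + newline + rest


if __name__ == "__main__":
    sample = '<?xml version="1.0" encoding="FOO"?>\n<abiword/>\n'
    print(strip_xml_encoding(sample), end="")
```

Only the declaration is edited; the bytes of the document body are left as they are, so this is a test aid for the importer and not a charset conversion.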
> >
> > What won't work with CJK text:
> > * export to WML, DocBook. I just don't know how to specify charset name in
> > these formats.
> > * import of XHTML (html importer assumes UTF8) and DocBook.
> > * export to Word. It doesn't work for Latin1 yet, so forget it.
> > * No code specific to platforms other than unix was touched, so CJK support
> > is in the same state as before on platforms other than unix.
> > * Spellchecking of CJK texts. Does it even make sense? English words inside
> > CJK text can be spellchecked.
> >
> > Donations/fees are appreciated.
> > If anybody needs it, I can provide commercial support and extension of this
> > work.
> >
> > Enjoy.
> >
> > I'm going to bed, so I won't be able to read/post/hack in the next 11 hours.
> >
> > Best regards,
> > -Vlad
> >
> >
>

 Best regards,
  -Vlad


