commit: Re: RTF importer - Asian font names (PATCHES)

From: <msevior_at_physics.unimelb.edu.au>
Date: Wed Jun 08 2005 - 16:45:24 CEST

>

CVS: ----------------------------------------------------------------------
CVS: Enter Log. Lines beginning with `CVS:' are removed automatically
CVS:
CVS: Committing in .
CVS:
CVS: Modified Files:
CVS: src/wp/impexp/xp/ie_imp_RTF.cpp src/wp/impexp/xp/ie_imp_RTF.h
CVS: ----------------------------------------------------------------------
Roland Kay's RTF font name fixes. (RTF-AsianFontNames.patch)

With a small fix by me (the patch used == where an assignment was intended).

Martin

>
> OK. Things have got a bit complicated so I'm attaching all
> the necessary patches to this email. They should apply in
> any order but on my system I apply them in this order:
>
> 1, RTF-AsianFontNames.patch (new)
> 2, RTF-warnings-2.patch (same as previous post)
> 3, XML-Props.patch (same as previous post)
>
>
> No. 1 is the finalised RTF Asian font names patch. This
> reads the escaped hex multi-byte font names used in Asia and
> stores them as UTF-8 in the document. Thus, Chinese users get
> to see the font names in Chinese characters if this was how
> they were encoded in the document. Also, with appropriate
> font installation or font aliasing, all valid documents can
> be displayed without all the characters turning into
> circles.
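The escaped hex form mentioned above is the RTF \'hh notation, where each escape stands for one byte of the font name. The following is a minimal sketch of collecting such a name into a plain byte string. It is not the code from ie_imp_RTF.cpp: hexValue and readRawFontName are hypothetical names chosen for illustration, and the input is assumed to be the text of a single font table entry up to its closing ';'.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not AbiWord's actual code): value of one hex digit,
// or -1 if c is not a hex digit.
static int hexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Collects a font name from an RTF font table entry into a plain byte string.
// Each \'hh escape becomes the single byte 0xhh; other characters are copied
// unchanged. The result is still in the document's native charset, not UTF-8.
std::string readRawFontName(const std::string& rtf) {
    std::string raw;
    for (std::size_t i = 0; i < rtf.size(); ++i) {
        if (rtf[i] == ';') break;  // ';' terminates the font name
        if (rtf[i] == '\\' && i + 3 < rtf.size() && rtf[i + 1] == '\'') {
            int hi = hexValue(rtf[i + 2]);
            int lo = hexValue(rtf[i + 3]);
            if (hi >= 0 && lo >= 0) {
                raw.push_back(static_cast<char>(hi * 16 + lo));
                i += 3;  // skip the rest of the escape
                continue;
            }
        }
        raw.push_back(rtf[i]);
    }
    return raw;
}
```

For example, the entry text \'cb\'ce\'cc\'e5; yields the four bytes CB CE CC E5, which is the GB encoding of the two-character font name 宋体 (SimSun). Converting those bytes to UTF-8 is a separate step that needs a charset converter and is not shown here.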
>
> I've gone back to using UT_String and not UT_UTF8String
> because using UT_UTF8String was corrupting the
> Chinese font names. The reason is as follows:
>
> In China MSWord exports RTF with the font names encoded in
> the GB charset. I read these in one character at a time into
> a UT_String. Once the entire font name has been read I
> convert the string to UTF8 and hand it to
> RTFFontTableItem(). Thus, the UT_String never holds UTF8. In
> fact, it holds the font name in the native character set.
> Trying to append the (8 bit) GB characters to a UT_UTF8String
> causes them to become corrupted.
>
> It makes much more sense to me to read the entire string in
> and convert it in one go, rather than trying to convert it
> one character at a time.
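To illustrate why converting one byte at a time goes wrong for a multi-byte charset, here is a small sketch. It assumes one particular failure mode, namely that each appended 8-bit value is treated as its own code point (U+0000 to U+00FF) and encoded to UTF-8 separately. Whether UT_UTF8String behaved exactly this way is not stated in the message, and convertBytewise is a hypothetical name used only for this example.

```cpp
#include <string>

// Assumed failure mode, for illustration only: every 8-bit value is treated
// as a separate code point in the range U+0000..U+00FF and encoded to UTF-8
// by itself, so the two bytes of one GB character are never seen together.
std::string convertBytewise(const std::string& raw) {
    std::string out;
    for (unsigned char b : raw) {
        if (b < 0x80) {
            out.push_back(static_cast<char>(b));  // ASCII passes through
        } else {
            // Two-byte UTF-8 sequence for U+0080..U+00FF
            out.push_back(static_cast<char>(0xC0 | (b >> 6)));
            out.push_back(static_cast<char>(0x80 | (b & 0x3F)));
        }
    }
    return out;
}
```

Under that assumption, the GB bytes CB CE CC E5 (the font name 宋体) come out as C3 8B C3 8E C3 8C C3 A5, four accented Latin letters, whereas the correct UTF-8 for the name is E5 AE 8B E4 BD 93. A converter that is handed the whole byte string at once can pair CB with CE and CC with E5 and so produce the right two characters.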
>
>
> No. 2 fixes some warnings in the RTF importer.
> (see http://www.abisource.com/mailinglists/abiword-dev/2005/Jun/0018.html)
>
> No. 3 stops the XML validator from corrupting UTF-8 encoded data.
> (see http://www.abisource.com/mailinglists/abiword-dev/2005/Jun/0023.html)
>
>
> Best wishes,
>
> R.
>
>
>
> PS: For any Asian users interested, I've attached a copy of my
> /etc/fonts/local.conf file. It may not be the most elegant
> solution, but it seems to work.
>
Received on Wed Jun 8 16:48:52 2005
