RTF importer - Asian font names (PATCHES)

From: Roland Kay <roland.kay_at_ox.compsoc.net>
Date: Sun Jun 05 2005 - 11:31:25 CEST

OK. Things have got a bit complicated so I'm attaching all
the necessary patches to this email. They should apply in
any order but on my system I apply them in this order:

1, RTF-AsianFontNames.patch (new)
2, RTF-warnings-2.patch (same as previous post)
3, XML-Props.patch (same as previous post)

No. 1 is the finalised RTF Asian font names patch. This
reads the escaped hex multi-byte font names used in Asia and
stores them as UTF-8 in the document. Thus, Chinese users get
to see the font names in Chinese characters if this was how
they were encoded in the document. Also, with appropriate
font installation or font aliasing, all valid documents can
be displayed without all the characters turning into

I've gone back to using UT_String and not UT_UTF8String
because the use of UT_UTF8String was corrupting the
Chinese font names. The reason is as follows:

In China MSWord exports RTF with the font names encoded in
the GB charset. I read these in one character at a time into
a UT_String. Once the entire font name has been read, I
convert the string to UTF-8 and hand it to
RTFFontTableItem(). Thus, the UT_String never holds UTF-8. In
fact, it holds the font name in the native character set.
Trying to append the (8-bit) GB characters to a UT_UTF8String
causes them to become corrupted.

It makes much more sense to me to read the entire string in
and convert it in one go, rather than trying to convert it
one character at a time.
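
To illustrate the point above, here is a minimal sketch in Python (not
the AbiWord C++ code, and not taken from the patch). It assumes the font
name arrives as RTF \'xx hex escapes and that the charset is GB2312; the
function names are hypothetical. It collects the raw bytes first and
converts once, then shows what happens if each 8-bit value is instead
treated as a character of its own.

```python
import re

# Matches either an RTF hex escape (\'xx) or any single literal character.
_TOKEN = re.compile(r"\\'([0-9a-fA-F]{2})|(.)", re.S)


def read_font_name_bytes(raw):
    """Collect the font name as raw bytes in its native charset."""
    buf = bytearray()
    for m in _TOKEN.finditer(raw):
        if m.group(1) is not None:
            buf.append(int(m.group(1), 16))  # escaped byte, e.g. \'cb
        else:
            buf.append(ord(m.group(2)))      # assumed to be plain ASCII
    return bytes(buf)


def convert_whole(name_bytes, charset="gb2312"):
    """Decode the complete byte string in one go."""
    return name_bytes.decode(charset)


def convert_per_byte(name_bytes):
    """Wrong on purpose: treats every byte as a separate character."""
    return "".join(chr(b) for b in name_bytes)


raw = r"\'cb\'ce\'cc\'e5"            # two GB2312 characters, four bytes
name_bytes = read_font_name_bytes(raw)
good = convert_whole(name_bytes)     # two Chinese characters
bad = convert_per_byte(name_bytes)   # four unrelated Latin-1 characters
print(good.encode("utf-8"), bad.encode("utf-8"))
```

The per-byte version splits each two-byte GB character in half, so no
later conversion step can recover the original name. Buffering the
native bytes and decoding once avoids that.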

No. 2 fixes some warnings in the RTF importer.
(see http://www.abisource.com/mailinglists/abiword-dev/2005/Jun/0018.html)

No. 3 stops the XML validator corrupting UTF-8 encoded data.
(see http://www.abisource.com/mailinglists/abiword-dev/2005/Jun/0023.html)

Best wishes,


PS: For any Asian users interested, I've attached a copy of my
/etc/fonts/local.conf file. It may not be the most elegant
solution, but it seems to work.

Received on Sun Jun 5 11:34:43 2005
