Hi Roland,
this sounds very exciting!
Could you maybe create screenshots of editing Chinese documents? If
the other developers agree, we could put them on our website to
emphasize abi's capabilities.
Thanks,
- Rob
On 6/3/05, Roland Kay <roland.kay@ox.compsoc.net> wrote:
>
>
> Hi Guys,
>
> Here are two more RTF importer patches. They should be
> applied in the following order:
>
> 1, RTF-AltFontName-ver2.patch
> 2, RTF-warnings-2.patch
>
> The second one is very simple. It just fixes some warnings
> generated by declared but unused variables. One of these was
> introduced by my earlier patch. The other two are in the
> code that processes the \*\abirevision keyword. The code
> looks correct to me with the two unnecessary declarations
> removed. However, it might be an idea for whoever wrote that
> bit of code just to check.
>
>
> The first patch is a bit more involved. In Asia, Microsoft
> Word exports RTF font tables like this:
>
> {\fonttbl
> {\f0\froman\fcharset0\fprq2{\*\panose ...}Times New Roman;}
> {\f17\fnil\fcharset134\fprq2{\*\panose ...}FZSongTi;}
> {\f18\fnil\fcharset134\fprq2{\*\panose ...}\'cb\'ce\'cc\'e5{\*\falt SimSun};}
> {\f19\fnil\fcharset134\fprq2{\*\panose ...}\'cb\'ce\'cc\'e5;}
> }
>
> NB: I've abbreviated the panose numbers.
>
> The third entry refers to a font whose name is made entirely
> of Chinese characters (it's actually "SongTi") encoded in
> GB2312.
>
> Without the patch the importer ignores the first "\'" and
> then mistakes the first "cb" for the font name. It then
> ignores the rest. The result is that any Chinese font like
> this gets named with a two-letter hex code, which looks
> pretty silly. Worse, since AbiWord subsequently finds no font
> "cb" on the system, all the Chinese characters come out as
> circles. The only way to view the document is then to
> "Select All" and choose a sensible font, which mucks up the
> formatting of the document.
>
> With the patch in place, the importer skips escaped hex
> sequences in the font names. If an alternative font name is
> given and the main font name is blank (either because it was
> really blank, or else because it had no ASCII characters)
> then the importer substitutes the alternative fontname for
> the real one.
>
> The result is that if the exporting application bothers to
> give an alternative font name, Chinese fonts have sensible,
> predictable names. Sadly, while MS Word gives alternative font
> names for the most common CJK fonts, it doesn't do so for
> all. Thus, in the case of a font which only has a non-ASCII
> name and no alternative, the patch substitutes
> "UnknownUnicodeFontName".
>
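The naming rule described above can be sketched in a few lines. This is a Python illustration only, not the patch itself (which is in AbiWord's C++ importer). The function name resolve_font_name and the regular expressions are hypothetical, and real RTF has cases this does not handle.

```python
import re

# Placeholder name the message says the patch substitutes.
FALLBACK_NAME = "UnknownUnicodeFontName"

HEX_ESCAPE = re.compile(r"\\'[0-9a-fA-F]{2}")       # e.g. \'cb
CONTROL_WORD = re.compile(r"\\[a-zA-Z]+-?\d* ?")    # e.g. \f18, \froman
ALT_GROUP = re.compile(r"\{\\\*\\falt ([^{}]*)\}")  # {\*\falt SimSun}
NESTED_GROUP = re.compile(r"\{[^{}]*\}")            # any innermost {...}


def resolve_font_name(entry: str) -> str:
    """Pick a usable ASCII name for one font table entry (hypothetical helper)."""
    body = entry.strip()
    if body.startswith("{") and body.endswith("}"):
        body = body[1:-1]  # drop the entry's own braces

    # Remember the alternative name, if the exporter supplied one.
    alt = ALT_GROUP.search(body)
    alt_name = alt.group(1).strip() if alt else ""

    # Remove nested groups such as {\*\panose ...}, innermost first.
    previous = None
    while previous != body:
        previous = body
        body = NESTED_GROUP.sub("", body)

    # Skip escaped hex sequences, then the control words.
    body = HEX_ESCAPE.sub("", body)
    body = CONTROL_WORD.sub("", body)
    name = body.replace(";", "").strip()

    # Main name if non-blank, else the alternative, else the placeholder.
    return name or alt_name or FALLBACK_NAME


print(resolve_font_name(
    r"{\f18\fnil\fcharset134\fprq2{\*\panose ...}\'cb\'ce\'cc\'e5{\*\falt SimSun};}"
))
```

The order of the final "or" chain is the design point: the main name wins whenever it has any ASCII content, so ordinary entries are unaffected.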
>
> A fringe benefit is that the font table parser is more robust
> than before and will correctly handle strange, but
> apparently legal, entries like:
>
> {\f20\froman Times New {\*\unknowncommand Fibble!}Roman;}
>
> If you're running AbiWord on Linux, this doesn't by itself
> solve the problem of Chinese characters being represented as
> circles, because the names of the Chinese fonts are different
> from those on Windows. Thus, abi can't find "SimSun", in the
> case of the above example, either. I guess this might not be
> a problem on a Windows machine. However, since the Chinese
> fonts now have sensible names we can create a font alias for
> SimSun. Once this is done Chinese documents can be opened and
> displayed immediately. By aliasing UnknownUnicodeFontName
> to the most common font for their region the user can then
> display most of the documents they receive even if the
> exporting app doesn't give an alternative name.
>
> On Suse, these aliases can be made by adding the following
> to /etc/fonts/local.conf and then running fonts-config.
>
> <!-- Alias SimSun to FZSongTi (and similar) so that Chinese docs
>      show up in AbiWord - R.Kay (02-06-05) -->
> <alias>
>   <family>SimSun</family>
>   <prefer>
>     <family>FZSongTi</family>
>   </prefer>
> </alias>
> <alias>
>   <family>SimHei</family>
>   <prefer>
>     <family>FZHeiTi</family>
>   </prefer>
> </alias>
> <alias>
>   <family>UnknownUnicodeFontName</family>
>   <prefer>
>     <family>FZSongTi</family>
>   </prefer>
> </alias>
>
>
> Issues outstanding:
> -------------------
>
> 1, It would be nice if abi could read the real Chinese
> font name rather than relying on the alternative
> name. Modifying the above patch to achieve this is
> trivial, and in fact I already have code to do this
> since that was my original intention.
>
> Unfortunately, it appears from the code that abi
> assumes that font names will only contain ASCII
> characters. For instance, when building lists of
> character properties the arrays seem to be of
> type XML_Char which is typedefed to char. I'm afraid
> that redefining XML_Char as UCS4 will cause problems
> throughout the program.
>
> Would allowing font names to contain arbitrary
> unicode characters be feasible in 2.6?
>
> 2, OpenOffice can identify the font as Chinese and
> automatically substitute an appropriate alternative
> without needing any font aliases. Does AbiWord have
> any such font substituting capability? Does anyone
> know how OO does this? I gather from some of the
> comments on bugzilla that this may not be abi's job.
>
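On issue 1 above: recovering the real name from the hex escapes is the easy part. Here is a minimal Python sketch, assuming the bytes are GB2312 as stated for the example font table. The helper decode_hex_escapes is hypothetical and has nothing to do with how abi stores font names internally, which is the harder problem raised above.

```python
import re

HEX_ESCAPE = re.compile(r"\\'([0-9a-fA-F]{2})")  # captures the two hex digits


def decode_hex_escapes(text: str, encoding: str = "gb2312") -> str:
    """Decode a run of RTF \\'xx escapes as bytes in the given encoding."""
    raw = bytes(int(digits, 16) for digits in HEX_ESCAPE.findall(text))
    return raw.decode(encoding)


# The escaped name from the font table example, described there as "SongTi".
name = decode_hex_escapes(r"\'cb\'ce\'cc\'e5")
print(name, [hex(ord(ch)) for ch in name])
```

The encoding is passed in rather than fixed because a general importer would have to choose it per entry, for instance from the \fcharset value.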
>
> References:
> -----------
>
> These bug reports are related to these issues:
>
> http://bugzilla.abisource.com/show_bug.cgi?id=3312
> http://bugzilla.abisource.com/show_bug.cgi?id=3954
>
> Best wishes,
>
> R.
>
>
>
>
Received on Fri Jun 3 14:09:29 2005