POW -- printing accented characters on Unix


Subject: POW -- printing accented characters on Unix
sterwill@abisource.com
Date: Wed Feb 16 2000 - 17:46:38 CST


It's time for another Project of the Week (POW)!

AbiWord has undergone extensive localization work recently, but
there are still some loose ends to tie up on the internationalization
front. On the front lines is the issue of WYSIWYG printing for all
platforms. Windows handles this task well through the use of the
same TrueType fonts for display and printing. All measurements of
these fonts are done through the Win32 API, so the printable sizes of
the font will match the sizes measured for on-screen display.

On Unix, AbiWord uses Type 1 fonts and their metrics files for
printed output, but trusts the X server to render, display, and measure
them on-screen. This leads to an inconsistency in the way some characters
are measured for printing (this notably affects non-ASCII characters in
the Latin-1 character space).

The X server handles the composition and rendering of these characters
so that the glyph widths returned to AbiWord are correct. When AbiWord
generates PostScript, it uses code from Adobe Systems to populate a
table of character widths before it flows text over the printed page.

Currently Adobe's code (called "parseAFM"; specifically, it can be found
at abi/src/af/xap/unix/xap_UnixPSParseAFM.[ch]) has problems computing
the correct widths of Latin-1 characters with accents. It assumes a
width of 250 (in AFM units, which are 1/1000 of an em) for many of
these non-ASCII characters. For proof of this problem, check out the
document at
"http://www.abisource.com/~sterwill/pow/accents.abw". Try printing it (to
your printer, or to a file for viewing with ghostview, gv, or similar)
and take a look at the results. Notice how the large and wide, accented
"A" characters are calculated to have a smaller width when printed.
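
To make the width problem concrete, here is a minimal sketch (in Python
for brevity; it is not the parseAFM code) of reading the character
metrics section of an AFM file and measuring a run of glyphs. The sample
lines and their numbers are illustrative, not copied from a real font,
and the function names are made up for this example. AFM widths are in
1/1000 of the font size, hence the division. Note the accented glyph
carrying code -1 (unencoded): whether that is what trips up our width
table is an assumption worth checking, not a confirmed diagnosis.

```python
# Illustrative AFM character-metric lines (example values only).
SAMPLE_AFM = """\
StartCharMetrics 3
C 65 ; WX 722 ; N A ; B 15 0 706 674 ;
C 32 ; WX 250 ; N space ; B 0 0 0 0 ;
C -1 ; WX 722 ; N Aacute ; B 15 0 706 890 ;
EndCharMetrics
"""


def parse_char_metrics(afm_text):
    """Return {glyph_name: (code, width)} from the CharMetrics section."""
    metrics = {}
    in_section = False
    for line in afm_text.splitlines():
        line = line.strip()
        if line.startswith("StartCharMetrics"):
            in_section = True
            continue
        if line.startswith("EndCharMetrics"):
            break
        if not in_section or not line:
            continue
        code = width = name = None
        # Each entry is a ';'-separated list of "KEY value..." fields.
        for field in line.split(";"):
            parts = field.split()
            if not parts:
                continue
            if parts[0] == "C":
                code = int(parts[1])
            elif parts[0] == "WX":
                width = float(parts[1])
            elif parts[0] == "N":
                name = parts[1]
        if name is not None and width is not None:
            metrics[name] = (code, width)
    return metrics


def string_width_points(glyph_names, metrics, point_size):
    """Sum the advance widths and scale from 1/1000 units to points."""
    total = sum(metrics[name][1] for name in glyph_names)
    return total * point_size / 1000.0


def unencoded_glyphs(metrics):
    """Glyphs with code -1 cannot be found by a table indexed by code."""
    return [name for name, (code, _) in metrics.items() if code < 0]
```

Keying the result by glyph name rather than by character code is a
deliberate choice in this sketch: it keeps the width of "Aacute" reachable
even though the font's default encoding gives it no code.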

In finding a solution to this general problem, I've done a little searching
for better implementations of parseAFM. The GNOME project has
been using a version of parseAFM modified by Raph Levien and others
to fix some outstanding bugs. This code is available as part of
the gnome-print system (from GNOME's CVS repository, in
gnome-print/libgnomeprint). I've made the GNOME versions of these
files available at http://www.abisource.com/~sterwill/pow/parseAFM.c
and http://www.abisource.com/~sterwill/pow/parseAFM.h. Integrating these
into the AbiSource tree should be relatively easy (a few changes to
the file names and one or two files that use symbols from parseAFM),
but the real task is larger.

I don't know whether these enhancements to parseAFM will help AbiWord
calculate the correct character widths when printing. Take a look
at abi/src/af/xap/unix/xap_UnixPSGraphics.cpp for the entry point
into string measurement (PS_Graphics::measureString). xap_UnixFont.cpp
and xap_UnixPSFont.cpp (in the same directory) contain the logic to
populate our PS context's width table, so this would be a great place
to do a little investigation and verification with your favorite
debugger.
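
As a rough idea of what to verify while stepping through that code, the
sketch below (Python, purely illustrative) fills a 256-entry width table
indexed by Latin-1 code and reports every character that silently fell
back to a default width. The code-to-glyph-name map is a hypothetical,
partial one written for this example, the default of 250 mirrors the
value mentioned above, and none of the names here come from the AbiWord
sources.

```python
# Hypothetical, partial map from Latin-1 codes to PostScript glyph names.
LATIN1_GLYPH_NAMES = {
    0x41: "A",
    0xC0: "Agrave",
    0xC1: "Aacute",
    0xE9: "eacute",
}


def build_width_table(name_widths, code_to_name, default_width=250):
    """Return (table, missing).

    table   -- list of 256 widths indexed by Latin-1 code
    missing -- (code, glyph_name) pairs that kept the default width
               because the metrics had no entry for that glyph name
    """
    table = [default_width] * 256
    missing = []
    for code, name in sorted(code_to_name.items()):
        if name in name_widths:
            table[code] = name_widths[name]
        else:
            missing.append((code, name))
    return table, missing
```

A non-empty "missing" list for a font whose AFM file does contain those
glyph names would point at the lookup rather than at the metrics.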

The ultimate goal of this POW is to calculate proper widths for all
Latin-1 characters before we print, given accurate metrics in the
AFM files. This might require a little composite character calculation
work, either in the PostScript GC (xap_UnixPSGraphics.cpp) or the
parseAFM code itself.
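
One possible shape for that composite calculation, sketched in Python
under stated assumptions: AFM files may carry a composites section whose
entries look like "CC name n ; PCC part dx dy ; ...", and when an accented
glyph has no width of its own we borrow the advance width of its first
listed part (normally the base letter). Treating the first part as the
base, and its width as good enough, are assumptions of this sketch; the
sample numbers and function names are illustrative only.

```python
# Illustrative composite entry (example offsets, not from a real font).
SAMPLE_COMPOSITES = """\
StartComposites 1
CC Aacute 2 ; PCC A 0 0 ; PCC acute 195 214 ;
EndComposites
"""


def parse_composites(afm_text):
    """Return {composite_name: [(part_name, dx, dy), ...]}."""
    composites = {}
    for line in afm_text.splitlines():
        fields = [f.split() for f in line.split(";")]
        fields = [f for f in fields if f]  # drop empty trailing fields
        if not fields or fields[0][0] != "CC":
            continue
        name = fields[0][1]
        parts = [
            (f[1], float(f[2]), float(f[3]))
            for f in fields[1:]
            if f[0] == "PCC"
        ]
        composites[name] = parts
    return composites


def glyph_width(name, widths, composites, default=None):
    """Width of a glyph, falling back to its composite's first part."""
    if name in widths:
        return widths[name]
    parts = composites.get(name)
    if parts and parts[0][0] in widths:
        return widths[parts[0][0]]
    return default
```

For example, with widths known only for "A" and "acute", looking up
"Aacute" through glyph_width() yields the width of "A" instead of a
fixed default.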

References for Adobe Type 1 fonts (including their glyph metrics
file format) can be found at:

 * Type 1 font specification
   http://partners.adobe.com/asn/developer/PDFS/TN/PLRM.pdf

 * AFM specification
   http://partners.adobe.com/asn/developer/PDFS/TN/5004.AFM_Spec.pdf

 * List of font-related specifications
   http://partners.adobe.com/asn/developer/technotes.html#fonts

If the GNOME version of parseAFM also returns bogus widths for these
characters, as ours currently does, consulting the GNOME printing
architecture documentation
(http://developer.gnome.org/arch/imaging/printing.html) would be a good
idea.

Solving these problems involves a bit of creativity on your part, and
will most certainly involve a bit of research into other implementations
of font and printing technologies with the X Window System.

Good luck!

PS: For more background on the whole POW / ZAP / SHAZAM concept, see the
following introduction:

  http://www.abisource.com/mailinglists/abiword-dev/99/September/0097.html

-- 
Shaw Terwilliger


