Re: some minor postscript fixes


Subject: Re: some minor postscript fixes
From: Vlad Harchev (hvv@hippo.ru)
Date: Sun Oct 29 2000 - 07:05:11 CST


On Sun, 29 Oct 2000, F J Franklin wrote:

 Hello,

> Dear All,
>
> Despite the odieresis on my screen, the postscript printout insisted on a
> degree-sign (RemapGlyphsDefault), so I thought that was a good reason to
> stop watching from the side-lines and try to Fix The Bug, so 16 hours later
> I have traced it to xap_EncodingManager and discovered a few more (mostly
> trivial) on the way.

 Are you running RH7 or another system with a new glibc (or are you on a
non-x86 architecture)? This problem appears only on new-glibc systems when my
CJK patch is not applied. Try applying it, and the problem should disappear.

> I would be happy to create a patch, but unfortunately I have never created
> a patch before so I would be grateful if someone could give me (or point me
> towards) a tutorial...
>
> Also, I should probably wait until the CJK stuff is committed?

 Could you try the CJK stuff right now?
 It autodetects the byte order of iconv's UCS2 output, so once you've applied
the patch the problem should disappear.
 It also changes the logic of remapGlyph/drawChar significantly (it works in
a cleaner way).
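
 A minimal sketch of what such byte-order autodetection could look like. It is
not taken from the CJK patch: the names detectUcs2Order and decodeUcs2 are
hypothetical, and the probe character (ASCII space, U+0020) is an assumption.
The idea is to convert one known character once, look at the two bytes iconv
produced, and use the detected order for every later conversion.

```cpp
#include <cstdint>

// Hypothetical names; this is an illustration, not the AbiWord code.
enum class Ucs2Order { BigEndian, LittleEndian, Unknown };

// b0 and b1 are the two output bytes obtained by converting a known probe
// character, assumed here to be ASCII space (U+0020).
constexpr Ucs2Order detectUcs2Order(unsigned char b0, unsigned char b1)
{
    return (b0 == 0x00 && b1 == 0x20) ? Ucs2Order::BigEndian
         : (b0 == 0x20 && b1 == 0x00) ? Ucs2Order::LittleEndian
         : Ucs2Order::Unknown;
}

// Assemble a 16-bit value from two output bytes. Anything other than
// LittleEndian is treated as big-endian, i.e. (b0 << 8) | b1.
constexpr std::uint16_t decodeUcs2(unsigned char b0, unsigned char b1,
                                   Ucs2Order order)
{
    return order == Ucs2Order::LittleEndian
        ? static_cast<std::uint16_t>((b1 << 8) | b0)
        : static_cast<std::uint16_t>((b0 << 8) | b1);
}
```

 With little-endian output the bytes for 0xa1 arrive as a1 00. Decoding them
as big-endian gives 0xa100, while decoding them with the detected order gives
0xa1.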
 
> Frank
> <fjf@alinameridon.com>
>
> Affected files:
> abi/src/af/xap/unix/xap_UnixPSGenerate.cpp
> abi/src/af/gr/xp/gr_Graphics.cpp
> abi/src/af/xap/unix/xap_UnixFont.cpp
> abi/src/af/xap/xp/xap_EncodingManager.cpp
>
> =========================================================================
>
> abi/src/af/xap/unix/xap_UnixPSGenerate.cpp
>
> attitude in xap_UnixPSGenerate to line-length limit (or is that just an EPS
> limit?) seems a little schizophrenic. Anyway:
>
> (1) UT_Bool ps_Generate::writeByte(UT_Byte byte)
> UT_Bool ps_Generate::writeBytes(UT_Byte * pBytes, UT_uint32 length)
>
> add conversion to PS escape sequence \ddd (i.e., octal) for characters
> in range 127 <= c < 256, return UT_FALSE above this although

 UT_Byte can't be > 0xff IMO (0xff is the maximum value that fits in
UT_Byte).
 As for encoding chars with values > 126 as \ddd - I object to that.
Single-byte non-Latin encodings such as the Russian ones have all their own
letters above 126, so escaping would make raw .ps documents unreadable in a
plain text viewer.

 Printers like the HP LaserJet print .ps files with raw characters > 126 fine.
I hope the PS standard also allows characters with values > 126 in .ps files.
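
 For reference, the \ddd escaping under discussion could be sketched as below.
This is an illustration only: psEscapeByte is a hypothetical helper, not the
code in ps_Generate::writeByte, and it ignores the other characters that need
escaping inside a PostScript string, such as '(', ')' and '\'.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper, not the actual ps_Generate code.
// Bytes below 127 pass through unchanged; bytes in 127..255 become a
// backslash followed by exactly three octal digits.
std::string psEscapeByte(unsigned char b)
{
    if (b < 127)
        return std::string(1, static_cast<char>(b));

    char buf[5];  // backslash + 3 octal digits + terminating NUL
    std::snprintf(buf, sizeof buf, "\\%03o", static_cast<unsigned>(b));
    return std::string(buf);
}
```

 With this helper, odieresis (0xF6 in iso8859-1) is written as the four
characters \366, which shows the readability cost for text that consists
mostly of such bytes.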

> UT_uint32 PS_Graphics::measureUnRemappedChar(const UT_UCSChar c)
> returns null-width for chars outside this range anyway...
>
> (Is this a good thing?)

> (2) UT_Bool ps_Generate::formatComment(const char * szCommentName,
> const char **argv, int argc)
> UT_Bool ps_Generate::formatComment(const char * szCommentName,
> const UT_Vector * pVec)
>
> were skipping arguments at `%%+' line-wraps
>
> =========================================================================
>
> abi/src/af/gr/xp/gr_Graphics.cpp
>
> (1) UT_UCSChar GR_Graphics::remapGlyph(const UT_UCSChar actual_,
> UT_Bool noMatterWhat)
>
> out-of-order null-check of m_pApp
>
> l.110 UT_UCSChar actual = m_pApp->getEncodingManager()->try_UToNative(actual_);
>
> l.159 if (!m_pApp)

 I was under the impression that it can't be NULL.
 My CJK patch removed that line (the logic moved to another location -
charWidth or a similar method).
 Also, m_pApp->getEncodingManager() is not a clean way of getting a
ptr to the EncodingManager instance; the clean way is
 'XAP_EncodingManager::instance'
 
> =========================================================================
>
> abi/src/af/xap/unix/xap_UnixFont.cpp
>
> (1) const FontMappingTable std_enc[] = {
>
> add "mu" = 0x00B5
>
> correct spelling of "Ocircumflex"
> correct spelling of "Idieresis"
>
> ?? I thought "trademark" was ^{\bf TM} (to use TeX syntax); in this
> context "trademark" may just be the same as "registered" = 0x00AE
>
> if "Scaron" == "germandbls", then
> ?? "scaron" may just be the same as "Scaron" = 0x00DF
> ?? "zcaron"
> ?? "Zcaron"

 I don't know...

> (2) ABIFontInfo * XAP_UnixFont::getMetricsData(void)
>
> When constructing the m_uniWidths (character widths) array:
>
> if (XAP_EncodingManager::instance->try_nativeToU(0xa1)==0xa1)
> {
> /* it's iso8859-1 or cp1252 encoding - glyphs in font are in wrong
> order - we have to map them by name.
> */
> [...]
> }
> else
> {
> /* it's non-latin1 encoding - we have to assume that order of
> glyphs in font is correct.
> */
> [...]
> }
>
> Fonts such as Nimbus Roman No9 L Regular (AbiSuite/fonts/n021003l.*)
> require the first of these methods (i.e., it's iso8859-1 or cp1252
> encoding) but the second is being called.

  On new glibc, iconv returns the values byte-swapped, i.e. 0xa100 instead of
0xa1, so the if () fails.

> The problem is with the zero (or whatever) returned from
> static UT_UCSChar try_CToU(UT_UCSChar c,iconv_t iconv_handle)

   Different iconv implementations return different values on success. They
 reliably return -1, i.e. (size_t)-1, on failure.
 
> Question: Rather than doing either/or, would it not be safer to do
> both (i.e., non-latin1 followed by iso8859-1/cp1252)?

  No, not for single-byte non-Latin languages (e.g. Russian).

> =========================================================================
>
> abi/src/af/xap/xp/xap_EncodingManager.cpp
>
> (1) static UT_UCSChar try_CToU(UT_UCSChar c,iconv_t iconv_handle)
>
> Very confused types & casting (though that's not to blame...)
>
> size_t donecnt = iconv(
> [...]
> uval = (b1<<8) | b2;
>
> assumes that donecnt == 2; replacing the latter with
>
> if (donecnt == 2) uval = (b1<<8) | b2; else uval = b1;
>
> works for me - unfortunately I know very little about the ways of
> iconv (), so is this valid?

 No, most frequently 'donecnt' will be 0: iconv() returns the number of
 non-reversible conversions, not the number of bytes written. So use
 'donecnt != (size_t)-1' as the test for a successful conversion.
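
 A hedged sketch of that check. The helper byteToUcs2 is hypothetical, not the
try_CToU from xap_EncodingManager.cpp, and it assumes a glibc-style iconv()
prototype taking char ** for the input buffer. It also assumes the "UCS-2BE"
target name is available, which is used here only to keep the byte order of
the example fixed.

```cpp
#include <iconv.h>
#include <cstddef>

// Hypothetical helper: convert one byte of a single-byte charset to UCS-2.
// Returns false if the converter cannot be opened or the conversion fails.
bool byteToUcs2(unsigned char c, const char* fromCharset, unsigned short* out)
{
    iconv_t cd = iconv_open("UCS-2BE", fromCharset);
    if (cd == (iconv_t)-1)           // iconv_open reports failure as (iconv_t)-1
        return false;

    char inbuf[1]  = { static_cast<char>(c) };
    char outbuf[2] = { 0, 0 };
    char* in   = inbuf;
    char* outp = outbuf;
    size_t inleft  = sizeof inbuf;
    size_t outleft = sizeof outbuf;

    size_t donecnt = iconv(cd, &in, &inleft, &outp, &outleft);
    iconv_close(cd);

    // iconv reports failure as (size_t)-1; on success the value is usually 0.
    // Also require that the input byte was consumed and two bytes were written.
    if (donecnt == (size_t)-1 || inleft != 0 || outleft != 0)
        return false;

    *out = static_cast<unsigned short>(
        (static_cast<unsigned char>(outbuf[0]) << 8) |
         static_cast<unsigned char>(outbuf[1]));
    return true;
}
```

 The remaining input and output counts say how many bytes were consumed and
written; the return value of iconv() does not.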

 Thanks.

 Best regards,
  -Vlad


