Re: New Unicode-->Latex characters


Subject: Re: New Unicode-->Latex characters
From: Martin Vermeer (martin.vermeer@fgi.fi)
Date: Mon Aug 14 2000 - 03:12:06 CDT


On Fri, Aug 11, 2000 at 05:22:00PM +0200, Karl Ove Hufthammer wrote:
> This patch adds a ton of characters to the Unicode to Latex conversion.
>
> --
> Karl Ove Hufthammer

Great! These were needed.

Now there are still the Greek characters (iso-8859-7) ... not such a high
priority, probably. Except for Greek people ;-)

If someone is willing to commit this (is anyone responsible for wv ?!?),
I will do my earlier patches again and resubmit them.

And someone please remove wvLaTeX.sed from CVS. And commit wvCleanLaTeX.xml,
if it passes muster.

One comment on your change to 0x2013: according to the standard it is indeed
an en-dash, not a soft hyphen. HOWEVER! In the MS Word documents that I have
access to, 0x2013 represents both symbols. I.e. we have "MS-Unicode", not
the real thing.

Doing as your patch proposes prints all soft hyphens, including those in the
middle of a line, as en-dashes. As there are many more soft hyphens than
en-dashes in my texts, I chose to go the other way. But the point is that both
of these approaches are broken. In the original Word document, there must be
something distinguishing these (otherwise Word couldn't handle both properly),
but by the time we get to text.c, the distinction has vanished.
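
To make the trade-off concrete, here is a minimal sketch of the two mappings
side by side. It is not the actual code in text.c: the function name, the enum
and the policy switch are all made up for illustration, and I am assuming the
usual LaTeX spellings ("--" for an en-dash, "\-" for a discretionary hyphen).

```c
#include <stddef.h>

/* Hypothetical policy switch: how to render 0x2013 as found in Word files. */
enum dash_policy {
    DASH_AS_EN_DASH,     /* follow the standard: U+2013 is an en-dash     */
    DASH_AS_SOFT_HYPHEN  /* "MS-Unicode" reading: treat it as soft hyphen */
};

/*
 * Illustrative only; not wv's real conversion routine.
 * Returns the LaTeX text for a code point, or NULL when this sketch
 * has no mapping for it.
 */
const char *sketch_unicode_to_latex(unsigned int cp, enum dash_policy policy)
{
    switch (cp) {
    case 0x00AD:                 /* the real Unicode soft hyphen */
        return "\\-";
    case 0x2013:                 /* the ambiguous one */
        return policy == DASH_AS_EN_DASH ? "--" : "\\-";
    case 0x2014:                 /* em-dash */
        return "---";
    default:
        return NULL;
    }
}
```

Whichever policy is chosen, the function only ever sees the bare code point,
so it cannot tell the two uses of 0x2013 apart. That is the sense in which
both approaches are broken.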

I asked Caolan about this, but he couldn't help either.

Your approach is safer in that all problem characters at least remain
visible.

Good Ideas Please (tm)

Martin

-- 
Martin Vermeer mv@fgi.fi   Phone +358 9 295 55 215   Fax +358 9 295 55 200
Finnish Geodetic Institute    Geodeetinrinne 2    FIN-02430 Masala FINLAND
:wq


