Re: Cyrillic in AW documents


Subject: Re: Cyrillic in AW documents
From: Andrew Dunbar (hippietrail@yahoo.com)
Date: Mon Sep 10 2001 - 12:50:38 CDT


 --- Tomas Frydrych <tomas@frydrych.uklinux.net> wrote:
> Hi Andrew,
>
> > Hi Tomas. This makes good sense. However if I recall
> > correctly, XML documents, including .abw documents
> > should then declare in the header which encoding they
> > are using. In the sample document we declared nothing.
> > Can somebody please look into this?
>
> I have done some tests, and this is what the situation looks
> like on my machine:
>
> - win version of AW always saves in utf-8 without declaring the
>   encoding in the xml header (which is fine, that is the xml
>   default). When given a document coded in a different encoding
>   (tested with iso-8859-8), it loads it correctly.
>
> - Linux version of AW honours the encoding part of LANG, and
>   even if the explicit encoding is utf-8, it will declare it in
>   the xml header. It also correctly loads documents with a
>   different encoding than that of the current locale, including
>   those using utf-8 without an explicit declaration in the xml
>   header.
>
> As far as I can see, the docs produced on different platforms
> are mutually compatible. So the only 'problem' I detect here is
> that the two versions behave slightly differently, i.e., the
> win version does not use the encoding of the present locale
> when saving docs. I would prefer if it did, but this is not a
> big deal.
>
> The bottom line is that I am not sure what is causing the
> problem with the Cyrillic docs. Is it possible that this has to
> do with the way iconv (or the xml parser) handles the encoding
> names (case sensitivity, dashes vs. underscores, etc.) on the
> different platforms?

Hi again Tomas. Off the top of my head I have the
feeling that *nix and Windows use different XML
libraries at the moment. One uses expat and I forget
the other. Though on at least one platform, you can
build with the other library as well. My hunch is
that these two libraries handle the encoding
declaration slightly differently.
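
A rough way to probe this hunch outside AbiWord is sketched below in
Python, whose xml.etree.ElementTree parser is built on expat. This is
only an illustration, not AbiWord code: the sample word, the
parse_text helper and the choice of KOI8-R are all assumptions made
for the example. It feeds the parser the same Cyrillic text with and
without an encoding declaration, then tries a few spellings of the
encoding name to see which ones this particular parser accepts.

```python
import xml.etree.ElementTree as ET

# Sample Cyrillic word ("Privet"), written with escapes so the script
# itself is plain ASCII. Hypothetical test data, not from the thread.
WORD = "\u041f\u0440\u0438\u0432\u0435\u0442"


def parse_text(data):
    """Return the root element's text, or None if the parser rejects the bytes."""
    try:
        return ET.fromstring(data).text
    except (ET.ParseError, LookupError, ValueError):
        return None


# 1. UTF-8 bytes with no declaration: the XML default encoding applies.
utf8_undeclared = ("<p>" + WORD + "</p>").encode("utf-8")

# 2. KOI8-R bytes with a matching declaration in the XML header.
koi8_declared = (
    '<?xml version="1.0" encoding="KOI8-R"?><p>' + WORD + "</p>"
).encode("koi8-r")

# 3. KOI8-R bytes with no declaration: the parser assumes UTF-8,
#    and these bytes are not valid UTF-8.
koi8_undeclared = ("<p>" + WORD + "</p>").encode("koi8-r")

# 4. Same KOI8-R bytes, but with different spellings of the label.
#    Whether each spelling is accepted depends on the parser in use.
label_results = {}
for label in ("KOI8-R", "koi8-r", "koi8_r"):
    doc = (
        '<?xml version="1.0" encoding="%s"?><p>%s</p>' % (label, WORD)
    ).encode("koi8-r")
    label_results[label] = parse_text(doc) == WORD

if __name__ == "__main__":
    print("utf-8, undeclared :", parse_text(utf8_undeclared) == WORD)
    print("koi8-r, declared  :", parse_text(koi8_declared) == WORD)
    print("koi8-r, undeclared:", parse_text(koi8_undeclared) == WORD)
    for label, ok in label_results.items():
        print("label %-7s accepted: %s" % (label, ok))
```

Running the same set of test documents through each XML library that
AbiWord can be built with would show whether they disagree about a
declaration, which is the kind of difference suspected above.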

Andrew Dunbar.

=====
http://linguaphile.sourceforge.net




This archive was generated by hypermail 2b25 : Mon Sep 10 2001 - 12:50:42 CDT