Subject: Re: i18n megapatch to AW
From: Thomas Briggs (tom@sane.com)
Date: Wed Oct 04 2000 - 08:36:12 CDT


   I can't get this patch to apply on Windows, though I think it's patch's
fault and not the diff's fault. (It does apply on Linux with only one small
problem, for whatever that's worth.) Is anybody else out there using Windows
able to get this to apply correctly (using a different version of cygwin
maybe)? I'd rather not have to commit this to CVS just to find out what
changes need to be made for Windows.

   -Tom

----- Original Message -----
From: "Vlad Harchev" <hvv@hippo.ru>
To: <abiword-dev@abisource.com>
Sent: Wednesday, October 04, 2000 4:09 AM
Subject: i18n megapatch to AW

> Hi,
>
> Sorry for announcing this so late.
> The patch itself is here: http://www.hippo.ru/~hvv/abiword/awrus-patch.gz
> It's 200K uncompressed!
>
> I've set up a "Russian AW" page and announced it on Sunday night on all
> the Russian Linux news sites that matter. The main page has had 1200
> visitors so far. Sorry for announcing the patch here so late.
>
> Sorry, I didn't try to compile it on Windows. There may also be problems
> using the patched version of AW on non-x86 platforms (with a different
> byte order or word length) and on non-glibc systems.
>
> What's done:
> In short, everything an international user (e.g. a Russian one) can dream
> of: mandatory fixes, plus luxuries such as reworking dialogs so that they
> don't use Fixed containers. Nothing remains to be done from an
> international user's POV. There are also some small fixes/improvements
> for various things (only a couple, AFAIR), and translations to Russian.
>
> List of i18n-specific changes:
>
> 1) Added the ability to input keys with keysyms > 256, converting keysym
> values to Unicode.
>
> 2) Remapping of characters from Unicode to the X locale's charset in
> remapGlyph, for drawing and printing them. At startup, AW looks for
> locale-specific fonts in the subdirectory named after the current
> locale's charset. Locale-specific fonts can override standard fonts (if
> they have the same font name, e.g. "helvetica", but the XLFD's
> registry-encoding equals the current locale's charset name). Fixed the
> 'makewrapper.sh' and 't[gb]z_install.sh' scripts - they now check for the
> existence of a subdirectory with locale-specific fonts and, if it's
> found, also add it to the X server's font path.
>
> 3) Fixed printing. Only single-byte characters are supported.
>
> 4) Fixed cutting and pasting - pasting to/from other apps works well now.
>
> 5) Nothing. Just to eat number "5" - text moved to other item :)
>
> 6) Corrected importing of RTFs. The following constructs
> {\f1\froman\fcharset2{\*\fname Symbol;}MT Symbol;}
> in \fonttbl are now supported (i.e. the canonical name of the font inside
> {}). These constructs are produced by at least the Russian edition of
> Win95. They were crashing AW (RTFstate stack underflow). Also, "helvetica"
> is substituted with "helvetic" to avoid problems.
>
> 7) RTF import: added recoding of characters of the form "\'e1" from the
> Windows codepage to Unicode.
> With 6) and 7) I was able to import every RTF I could find or produce
> with WordPad from W95 and with Word2000 (with various output options).
>
> 8) On export to RTF, the font name "helvetica" is used unconditionally.
> Due to 6) this doesn't introduce any problems. This allows Russian texts
> to be read without flaws in WordPad (on Russian Windows, "helvetic" is a
> non-russified font, while "helvetica" of course is russified).
> Also, on export to the "default RTF format", all Unicode symbols with a
> value >127 are exported in the form \uc1\uXXXX\'HH if the character has a
> code 0xHH in the Windows charset being exported to, falling back to
> \uc0\uXXXX if it doesn't - thus allowing old apps to read these files
> without problems.
>
> 9) Added an "RTF for old apps" format. Some broken programs don't
> understand the \uc1\uXXXX\'HH form (e.g. Ted, StarOffice 5.2), so the
> \'HH form is used (if the character has a code 0xHH in the Windows charset
> being exported to; nothing is exported if it doesn't). I understand that
> sed can be used for converting files from "plain RTF" to the "RTF for old
> apps" format, but nevertheless.
>
> 10) When saving to .abw, a "charset=" attribute is added to the 1st tag
> of the XML and all characters are saved in the native encoding if it's a
> one-byte encoding (i.e. raw bytes are output instead of &#XXXX; or so) -
> provided, of course, that the character exists in the native encoding.
> Importing files in this format is also supported (I've tested with expat
> only, but AFAIK libxml does this out of the box).
>
> 11) Same for exporting to HTML: characters are output in the native
> encoding, and the name of the native encoding is also saved properly in
> the HTML file. So Netscape can display Russian in such HTML files now.
>
> 12) When exporting to .latex, characters are also converted to the native
> encoding and output as raw bytes. The proper \usepackage[...]{inputenc}
> and \usepackage[..]{babel} are inserted into the .latex file. Exported
> LaTeX documents with Russian now work out of the box.
>
> 13) Added support for converting all translations of UI elements (menu
> items, toolbar, stringset) from an arbitrary encoding to the native
> encoding. For menu item and toolbar item labelsets, added the macro
> BeginSetEnc, which takes the same parameters as BeginSet plus the
> encoding name as the last parameter. This allows the same set of
> translations (supplied in any encoding) to be used on all platforms and
> in any locale, even if they use different charsets (as with Russian:
> cp1251 is used for it on Windows, and koi8-r is mostly used on Unix).
>
> 14) Added support for spellchecking (by fixing the current ispell code).
> It was trying to use the charset name "UCS-2-INTERNAL" or so, which is
> unknown to Linux's glibc, so I added a workaround: "UCS2" is used with
> glibc. Also, when converting between the dictionary's charset and UCS-2
> (in either direction), the UCS2 symbols have to be byteswapped to get
> unsigned shorts (at least on x86) - done that. [Note: we should check
> whether this is needed on arches with a different byte order, or on
> systems that don't use glibc's iconv (and also whether "UCS2" is known
> to iconv on these systems).]
> Also, slightly extended the way the dictionary's charset is guessed: if
> there is a file with the dictionary's name with "-encoding" appended
> (e.g. "russian.hash-encoding" for "russian.hash"), it's opened and its
> content is treated as the name of the dictionary's charset (this is much
> more flexible than hardcoding charset names for some known languages).
>
> 15) Proper implementations for UT_is{lower,upper,alpha} and *_tolower.
>
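
The RTF character forms described in items 7) to 9) above can be sketched
roughly as follows. This is a minimal illustration in Python, not the code
from the patch: the function names rtf_escape and decode_hex_escapes, the
old_apps flag and the cp1251 default are assumptions made for the example,
and only BMP characters are handled.

```python
import re


def rtf_escape(text, codepage="cp1251", old_apps=False):
    """Escape text for RTF (hypothetical helper, BMP characters only).

    Default form:  \\uc1\\uN\\'HH when the character exists in `codepage`,
                   \\uc0\\uN otherwise.
    old_apps form: \\'HH only; characters missing from `codepage` are dropped.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp > 0xFFFF:
            raise ValueError("only BMP characters are handled in this sketch")
        if cp < 128:
            # Backslash and braces are RTF syntax and must be escaped.
            out.append("\\" + ch if ch in "\\{}" else ch)
            continue
        try:
            encoded = ch.encode(codepage)
        except UnicodeEncodeError:
            encoded = b""
        # Single-byte fallback in the target Windows charset, if there is one.
        fallback = "\\'%02x" % encoded[0] if len(encoded) == 1 else None
        if old_apps:
            out.append(fallback or "")
            continue
        # RTF's \u control word takes a signed 16-bit decimal number.
        n = cp if cp < 0x8000 else cp - 0x10000
        if fallback:
            out.append("\\uc1\\u%d%s" % (n, fallback))
        else:
            # Trailing space delimits the control word from following text.
            out.append("\\uc0\\u%d " % n)
    return "".join(out)


def decode_hex_escapes(rtf_text, codepage="cp1251"):
    """Recode \\'HH escapes from a Windows codepage to Unicode (import side)."""
    def repl(match):
        byte = bytes([int(match.group(1), 16)])
        return byte.decode(codepage, errors="replace")
    return re.sub(r"\\'([0-9a-fA-F]{2})", repl, rtf_text)
```

With cp1251, the Cyrillic letter U+0431 comes out as \uc1\u1073\'e1 in the
default form and as \'e1 in the "old apps" form; a Greek alpha, which cp1251
lacks, becomes \uc0\u945 in the default form and is dropped in the other.
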
> Other changes:
> 1) Translations to Russian provided (including icons for the toolbar).
>
> 2) The "columns" and "font" dialogs were reworked not to use hardcoded
> widget positions and dimensions. I gave up on fixing the "Paragraph"
> dialog - the only one that still needs fixing - since it looks
> reasonable in Russian. The "Insert date/time" dialog was reworked to
> expand the list of all formats to the width of the widest format. Fixed
> the "columns" dialog: a non-translated string was used for "line
> between"; it is now acquired from the StringSet.
>
> 3) My patch for automatically recoloring the "black color" of BW and
> three-color toolbar icons to the color used by gtk for drawing text is
> also included.
>
> The most recent version of my patch can always be downloaded from the URL
> I've given.
> I will announce any changes I make to the patch.
> I don't have any time to test the Windows version this week.
>
> I think it's the right time to start committing this patch (after
> checking on other platforms). I don't know of any bugs with this patch on
> Unix (but AW probably won't compile on Windows unless slightly modified).
> Most of the changes needed for Win32 and other platforms will consist of
> replacing "iconv_open", "iconv_close" and "iconv" with the names
> available on that platform. Other than that, nothing should prevent AW
> from compiling on other platforms. No other changes are needed to use the
> patched AW with Latin languages.
> Feel free to contact me.
>
> Best regards,
> -Vlad
>
>
>
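
For the open question in item 14) about byte order, here is a small Python
sketch of the two ideas described there: reading the dictionary's charset
from a "<name>-encoding" file, and producing UCS-2 values as unsigned 16-bit
integers. It is only an illustration under stated assumptions, not the
patched ispell code: the function names and the "latin-1" default are
hypothetical. Asking the converter for an explicit byte order that matches
the host is one way to avoid a separate byteswap step.

```python
import sys


def dictionary_charset(hash_path, default="latin-1"):
    """Return the charset named in '<hash_path>-encoding', if that file exists.

    The default used when the file is missing is an assumption of this sketch.
    """
    try:
        with open(hash_path + "-encoding", "r", encoding="ascii") as f:
            name = f.read().strip()
    except FileNotFoundError:
        return default
    return name or default


def to_ucs2_units(word_bytes, charset):
    """Convert a dictionary word to a list of 16-bit code units.

    The bytes are encoded in the host's byte order explicitly, so reading
    them back as native unsigned shorts needs no byteswap on any arch.
    Assumes the word contains BMP characters only (true for one-byte
    dictionary charsets such as koi8-r).
    """
    text = word_bytes.decode(charset)
    native = "utf-16-le" if sys.byteorder == "little" else "utf-16-be"
    raw = text.encode(native)
    return [int.from_bytes(raw[i:i + 2], sys.byteorder)
            for i in range(0, len(raw), 2)]
```

For a "russian.hash" dictionary with a "russian.hash-encoding" file
containing koi8-r, dictionary_charset returns "koi8-r", and to_ucs2_units
then yields the Unicode values of each letter regardless of the host's
byte order.
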


