Re: localization formats proposal


Subject: Re: localization formats proposal
From: Karl Ove Hufthammer (huftis@bigfoot.com)
Date: Fri Jul 13 2001 - 10:08:28 CDT


su 08 jul 2001 17:07:10, Rui Miguel Seabra
<rms@greymalkin.yi.org>:

> Notice I am using _ for choosing the shortcut character in
> the menus,

A good idea. Having to use &amp; was really annoying (though with
an XML editor, this wouldn't be a problem).

> Original file:
>
> <AbiLocale app="AbiWord" ver="1.0" language="en-US"
> fallback="true" enc="iso=8859-1">

The character encoding used in an XML file is specified in the
'encoding' attribute of the XML declaration. Therefore, the 'enc'
attribute is both unnecessary and wrong.
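
A minimal sketch of this point, in Python purely for illustration
(the 'EXAMPLE_ID' string and its text are made up): the parser picks
up the encoding from the XML declaration alone, so no 'enc'
attribute is needed.

```python
import xml.etree.ElementTree as ET

# The encoding is stated once, in the XML declaration. There is no
# 'enc' attribute on the root element. 'EXAMPLE_ID' is a made-up id.
doc = (
    '<?xml version="1.0" encoding="iso-8859-1"?>\n'
    '<AbiLocale app="AbiWord" ver="1.0" language="en-US">\n'
    '<strings><string id="EXAMPLE_ID">blåbær</string></strings>\n'
    '</AbiLocale>\n'
).encode("iso-8859-1")

# The parser reads the declaration and decodes the bytes itself.
root = ET.fromstring(doc)
text = root.find("strings/string").text
```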

The 'fallback' attribute should be a comma-separated list of
locales.
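
Roughly what I have in mind, sketched in Python (the function names
and the catalog layout are only assumptions for illustration): the
'fallback' value is split on commas, and a string is looked up in
the locale itself first, then in each fallback in order.

```python
def parse_fallback(value):
    """Split a comma-separated 'fallback' attribute into locale names."""
    return [part.strip() for part in value.split(",") if part.strip()]


def lookup(string_id, locale, fallback_value, catalogs):
    """Return the first translation found, trying 'locale' and then
    each fallback in the order listed. 'catalogs' is a hypothetical
    mapping of locale name -> {string id -> text}."""
    for name in [locale] + parse_fallback(fallback_value):
        strings = catalogs.get(name, {})
        if string_id in strings:
            return strings[string_id]
    return None


catalogs = {
    "nn-NO": {"DLG_Apply": "_Bruk"},
    "en-US": {"DLG_Apply": "_Apply", "DLG_Break_Insert": "Insert Break"},
}
found = lookup("DLG_Break_Insert", "nn-NO", "nb-NO, en-US", catalogs)
```

Here 'nn-NO' has no 'DLG_Break_Insert', 'nb-NO' is not installed at
all, so the lookup ends up at 'en-US'.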

Is the 'app' attribute necessary? If so, we should change
'AbiLocale' to 'Locale'.

And 'language' should be 'locale'. Do not confuse locales and
languages.

> <strings>
> <string id="DLG_Apply">Apply</string>
> <string id="DLG_Break_Insert">Insert Break</string>
> </strings>

The original text *must* be present. If not, you have to manually
look through hundreds of strings to see if one of them has changed.
And they do change. My suggestion:

<string id="DLG_Apply">
  <original>_Apply</original>
  <translated>_Bruk</translated>
</string>

> <tb id="AP_TOOLBAR_ID_FILE_NEW" value="New">
> <icon>tb_new.xpm</icon>
> <!-- this is not needed here since it's empty:
> <tooltip></tooltip> -->
> <status>Create a new document</status>
> </tb>

I would prefer:

<tb id="AP_TOOLBAR_ID_FILE_NEW">
  <icon>tb_new.xpm</icon>
  <original-label>New</original-label>
  <translated-label>Ny</translated-label>
  <original-status>Create a new document</original-status>
  <translated-status>Opprett eit nytt dokument</translated-status>
</tb>

Perhaps we'll need a 'display' attribute too, for icons
that are not needed in some locales. (Note: *locales*, not
languages. E.g., bidi may still be needed in the 'en-US' locale,
though you'll never use it in the 'en-US' language.)
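
A small sketch of how such an attribute could be honoured, in Python
for illustration. Everything beyond the 'tb' element and its 'id' is
an assumption: the 'toolbar' wrapper, the 'yes'/'no' values, the
default of 'yes', and the second button are all made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical toolbar fragment; the second button is invented and
# is hidden in this locale through display="no".
toolbar_xml = """
<toolbar>
  <tb id="AP_TOOLBAR_ID_FILE_NEW">
    <icon>tb_new.xpm</icon>
  </tb>
  <tb id="AP_TOOLBAR_ID_EXAMPLE" display="no">
    <icon>tb_example.xpm</icon>
  </tb>
</toolbar>
""".strip()


def visible_buttons(root):
    """Return the ids of buttons not switched off for this locale.
    A missing 'display' attribute is treated as 'yes'."""
    return [
        tb.get("id")
        for tb in root.iter("tb")
        if tb.get("display", "yes") != "no"
    ]


visible = visible_buttons(ET.fromstring(toolbar_xml))
```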

(And something similar for menus.)

This may seem overly verbose, but it's very easy to update the
translations this way, using XSLT. By 'update', I mean adding
new strings from the 'en-US' file to the localized file(s), and
marking changed text as 'fuzzy'. We can also generate (X)HTML
reports from the (updated) locale file showing which strings need
translation/updating.
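
To make the update step concrete, here is a rough sketch of the
logic. It is written in Python rather than XSLT, only because that is
easier to show in a mail. The 'fuzzy' attribute name, the 'strings'
wrapper and the 'TB_New' id are assumptions, and I assume the
'en-US' file carries its text in 'original' elements like every
other locale.

```python
import xml.etree.ElementTree as ET


def update_locale(source_root, target_root):
    """Copy new strings from the source tree into the localized tree
    and mark strings whose original text has changed as fuzzy."""
    existing = {s.get("id"): s for s in target_root.iter("string")}
    for src in source_root.iter("string"):
        sid = src.get("id")
        src_text = src.findtext("original", default="")
        tgt = existing.get(sid)
        if tgt is None:
            # New string: add it with an empty translation.
            tgt = ET.SubElement(target_root, "string", id=sid)
            ET.SubElement(tgt, "original").text = src_text
            ET.SubElement(tgt, "translated").text = ""
            continue
        orig = tgt.find("original")
        if orig is None:
            orig = ET.SubElement(tgt, "original")
        if (orig.text or "") != src_text:
            # Original changed: keep the old translation, flag it.
            orig.text = src_text
            tgt.set("fuzzy", "true")
    return target_root


def needs_work(root):
    """Ids of strings that are fuzzy or still untranslated."""
    return [
        s.get("id")
        for s in root.iter("string")
        if s.get("fuzzy") == "true" or not s.findtext("translated")
    ]


source = ET.fromstring(
    "<strings>"
    '<string id="DLG_Apply"><original>_Apply</original></string>'
    '<string id="TB_New"><original>New document</original></string>'
    '<string id="DLG_Break_Insert"><original>Insert Break</original></string>'
    "</strings>"
)
target = ET.fromstring(
    "<strings>"
    '<string id="DLG_Apply"><original>_Apply</original>'
    "<translated>_Bruk</translated></string>"
    '<string id="TB_New"><original>New</original>'
    "<translated>Ny</translated></string>"
    "</strings>"
)
report = needs_work(update_locale(source, target))
```

The 'report' list is what the (X)HTML report would be built from:
the changed 'TB_New' string and the newly added, still untranslated
'DLG_Break_Insert'.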

One important thing is that this way, there would be *no*
technical difference between the 'en-US' locale and other locales.
IME, having one locale be the 'default' locale (using built-in
strings or something similar) is a disadvantage.

> On the developers side, this would mean a slightly slower
> startup of abiword,

I read one article where the author had benchmarked the
performance difference between using a preparsed XML file in
binary format and parsing it on the fly. The surprising result was
that the latter was actually faster. (I can probably find the
article in question by digging through my overgrown bookmark
collection, if anybody's interested.) Though this may not always
be the case, I believe the cost of parsing the files at startup is
truly minimal.
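
Anyone who wants a rough feel for the parsing cost can measure it
with a few lines like these. This is a Python sketch, not the
benchmark from the article and not AbiWord's actual parser; the
string count of 2000 and the generated ids are arbitrary assumptions.

```python
import time
import xml.etree.ElementTree as ET


def build_locale_xml(count):
    """Generate a synthetic locale file with 'count' strings."""
    parts = ["<strings>"]
    for i in range(count):
        parts.append(
            '<string id="S%d"><original>Text %d</original>'
            "<translated>T %d</translated></string>" % (i, i, i)
        )
    parts.append("</strings>")
    return "".join(parts)


def time_parse(xml_text):
    """Parse the text once; return (seconds taken, number of strings)."""
    start = time.perf_counter()
    root = ET.fromstring(xml_text)
    elapsed = time.perf_counter() - start
    return elapsed, len(root)


elapsed, count = time_parse(build_locale_xml(2000))
print("parsed %d strings in %.4f s" % (count, elapsed))
```

The number will of course vary with the machine and the parser, so
treat it as an order-of-magnitude check only.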

> but since menus and toolbars won't be
> builtin anymore (THEY ALL ARE),

And we can distribute truly localized builds (with no 'en-US'
strings). Any localized icons need to be included in all builds,
though.

Hmm, perhaps we can have 'locale packs', where all the locale info
is stored in one (external) file. Then the installer can download
the locales the user's most interested in.

-- 
Karl Ove Hufthammer


