Re: how should we localize locale names?


Subject: Re: how should we localize locale names?
From: Tim Allen (tim@proximity.com.au)
Date: Thu Mar 08 2001 - 03:40:30 CST


>In scanning the recent code mods for the new lang property, I noticed
>that we're using the stringset mechanism for user-visible locale names.

You beat me to it, Paul; as soon as I saw the dialog screenshot I started
wondering about exactly this.

>Three comments:

>1. If this is desirable, ...
>-------------------------
>... then we should at least make translators' lives saner by using more
>meaningful names -- eg, LANG_EN_US instead of LANG_3.

Well, that would be a step forward, but much better is the suggestion
below.

>2. Is this in fact desirable?
>------------------------------

>We're already at 25 locales and growing, which suggests that we'd need
>N-squared localizations of these locale names. For example, with just
>three locales, you'd need 9 user-visible strings:

>While it's fun to play games with Babelfish, this certainly seems like it
>could get out of hand pretty rapidly.

Absolutely!! This is _not_ the way to go... I'll even go so far as to say
that I will not provide id-ID translations for the names of all of
AbiWord's supported languages, whatever Owen Stenseth's Perl script says.
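
To put rough numbers on the N-squared point, here is a minimal Python
sketch. It assumes one string per (UI locale, named locale) pair; the
function names are mine and purely illustrative, not anything in the
AbiWord tree.

```python
def strings_needed_cross_translated(num_locales: int) -> int:
    """Every locale's name translated into every UI locale: N * N strings."""
    return num_locales * num_locales


def strings_needed_native_only(num_locales: int) -> int:
    """Each locale named once, in its own language: N strings."""
    return num_locales


if __name__ == "__main__":
    # 3 locales is Paul's example; 25 is the current locale count.
    for n in (3, 25):
        print(n, strings_needed_cross_translated(n), strings_needed_native_only(n))
```

With 25 locales that is 625 strings to maintain under cross-translation,
against 25 if each language is only ever named in itself.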

>3. If this isn't desirable, ...
>----------------------------
>... then one alternative I've repeatedly suggested is to just report the
>localized name for that language *in* that language. Thus, you'd have:

> English -- United States (en-US)
> Français -- France (fr-FR)
> Deutsch -- Deutschland (de-DE)

_Definitely_ the way to go, IMHO.

>The theory is that it's OK to have localized text here, because if you
>don't recognize the name of the language *in* that language, you're
>unlikely to be using it in your documents.

Which sounds like a pretty good theory to me.

>Plus which, it makes it far more likely that translations will be
>up-to-date. I doubt that Sioux or Catalan even *have* names for each
>other's languages, and defaulting both to en-US seems unpleasant.

Also a good point, but hardly necessary, as the previous argument was more
than adequate to demolish any possible opposition :).

>bottom line
>-----------

>I'd like to propose that we add a special case to the existing strings
>file format to include the *name* of that locale *in* that
>language. Then, when initializing the XAP_Language class, we can scan
>those strings files and just pluck out this one description.

The general idea is definitely good. But like Dom, I have slight
reservations about adding an extra wrinkle to the way our string sets get
used. This would imply that every language string set would be at least
partially loaded. I still like the idea of moving to gettext at some point
in the medium-term future; I actually would like to implement that myself,
except for minor impediments like work, moving house, a wife and son etc
etc etc :-). Adding more cruft to the existing model would seem to make
such a transition more difficult. Is there some other paradigm we can use?
We certainly don't want to have to do anything as silly as temporarily
switching locales to resolve the language name, then switching back.

Maybe we want a list of non-translatable strings somewhere, defined in
such a way that it's very easy for a new translator to add the name of the
new language to the list.
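
Something like the following is what I have in mind, sketched in Python
only for brevity. The table name, the function and the entries are all
hypothetical; nothing like this exists in the tree yet, and the real
thing would presumably live next to XAP_Language.

```python
# Hypothetical table of language names, each written in its own language.
# These entries are never translated; a new translator appends one line.
NATIVE_LOCALE_NAMES = {
    "en-US": "English -- United States",
    "fr-FR": "Français -- France",
    "de-DE": "Deutsch -- Deutschland",
}


def locale_label(tag: str) -> str:
    """Return the dialog label for a locale tag, or the bare tag if unknown."""
    name = NATIVE_LOCALE_NAMES.get(tag)
    return f"{name} ({tag})" if name else tag


if __name__ == "__main__":
    for tag in ("en-US", "de-DE", "xx-XX"):
        print(locale_label(tag))
```

The point of the design is that the list is a single flat file, so no
string set has to be loaded, even partially, just to learn a language's
name.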

>My hope is that there shouldn't be much of a charset issue here, since
>the strings import mechanisms (via expat at least) are already
>encoding-savvy. However, we may need to do some platform-specific
>trickery to get the resulting UCS2 strings to sort and display properly
>on the resulting menus and/or dialogs.

Dom's other point is also sensible: you may not have any fonts on your
system capable of displaying the names of, e.g., Nihongo and Mandarin in
their native character sets. I suppose the previous argument holds, i.e.
if you don't have the right fonts then chances are you're not planning to
use those languages in your document. But it would be nice to show off the
languages we support, and ugly if the language choice dialog shows random
gibberish. To manage that we need not language-localised language names,
but character-set-localised names (in practice, Romanised names would do,
I think, e.g. "Nihongo" for Japanese), plus some way of detecting that we
can't display the native names, so that we can use the Romanised names
instead.
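
A rough sketch of that fallback, again in Python for brevity. Everything
here is an assumption for illustration: the table, the names, and above
all can_display, which stands in for whatever platform-specific
font-coverage check we would really need.

```python
from typing import Callable, NamedTuple


class LanguageName(NamedTuple):
    native: str     # name in the language's own script
    romanised: str  # Latin-script fallback


# Illustrative entries only.
LANGUAGE_NAMES = {
    "ja-JP": LanguageName("日本語", "Nihongo"),
    "de-DE": LanguageName("Deutsch", "Deutsch"),
}


def display_name(tag: str, can_display: Callable[[str], bool]) -> str:
    """Prefer the native name; fall back if any character can't be drawn."""
    names = LANGUAGE_NAMES[tag]
    if all(can_display(ch) for ch in names.native):
        return names.native
    return names.romanised


def latin1_only(ch: str) -> bool:
    """Stand-in for a real font check: pretend only Latin-1 glyphs exist."""
    return ord(ch) < 0x100


if __name__ == "__main__":
    print(display_name("ja-JP", latin1_only))
    print(display_name("de-DE", latin1_only))
```

The hard part, of course, is writing a real can_display for each
platform; the sketch only shows where it would plug in.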

This doesn't seem to fit all that nicely with the existing localisation
paradigm, nor with any likely paradigm that would be supported by gettext.
Pity. More thought required, I think.

>How does that sound?

>Paul,
>wild-eyed dreamer

Dreamer? I thought your points seemed reasonably down-to-earth and
pragmatic :-).

Tim

-- 
-----------------------------------------------
Tim Allen          tim@proximity.com.au
Proximity Pty Ltd  http://www.proximity.com.au/
  http://www4.tpg.com.au/users/rita_tim/


