Re: how should we localize locale names?


Subject: Re: how should we localize locale names?
From: Vlad Harchev (hvv@hippo.ru)
Date: Thu Mar 08 2001 - 03:23:18 CST


On Wed, 7 Mar 2001, Paul Rohr wrote:

> In scanning the recent code mods for the new lang property, I noticed that
> we're using the stringset mechanism for user-visible locale names.
>
> Three comments:
>
> 1. If this is desirable, ...
> -------------------------
> ... then we should at least make translators lives saner by using more
> meaningful names -- eg, LANG_EN_US instead of LANG_3.
>
> 2. Is this in fact desirable?
> ------------------------------
> We're already at 25 locales and growing, which suggests that we'd need
> N-squared localizations of these locale names. For example, with just three
> locales, you'd need 9 user-visible strings:
>
> English -- United States (en-US)
> French -- France (fr-FR)
> German -- Germany (du-DE)
>
> Allemand -- Allemagne (du-DE)
> Anglais -- Etats-Unis (en-US)
> Français -- France (fr-FR)
>
> Englisch -- Vereinigter Staaten (en-US)
> Französisch -- Frankreichs (fr-FR)
> Deutsch -- Deutschland (du-DE)
>
> While it's fun to play games with Babelfish, this certainly seems like it
> could get out of hand pretty rapidly.
>
> 3. If this isn't desirable, ...
> ----------------------------
> ... then one alternative I've repeatedly suggested is to just report the
> localized name for that language *in* that language. Thus, you'd have:
>
> English -- United States (en-US)
> Français -- France (fr-FR)
> Deutsch -- Deutschland (du-DE)
>
> The theory is that it's OK to have localized text here, because if you don't
> recognize the name of the language *in* that language, you're unlikely to be
> using it in your documents.
>
> Plus which, it makes it far more likely that translations will be
> up-to-date. I doubt that Sioux or Catalan even *have* names for each
> other's languages, and defaulting both to en-US seems unpleasant.
>
> bottom line
> -----------
> I'd like to propose that we add a special case to the existing strings file
> format to include the *name* of that locale *in* that language. Then, when
> initializing the XAP_Language class, we can scan those strings files and
> just pluck out this one description.
>
> My hope is that there shouldn't be much of a charset issue here, since the
> strings import mechanisms (via expat at least) are already encoding-savvy.
> However, we may need to do some platform-specific trickery to get the
> resulting UCS2 strings to sort and display properly on the resulting menus
> and/or dialogs.

 As for this - it's already working AFAIR (I coded that) - all strings (EXCEPT
the ones in menus) are converted to the current charset when displayed. I will
prepare a patch (before the next release) so that MENU strings are automatically
converted to the system charset too (this is required for Russian, which has 4 (!)
encodings, of which 2 are equally widespread).
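
 To illustrate why conversion to the system charset matters, here is a minimal
Python sketch (it is not AbiWord code). The list of four encodings is an
assumption about which Russian charsets are meant, and the helper name
to_system_charset is hypothetical.

```python
# Illustrative only: the same Russian string maps to different bytes in each
# charset. The list of encodings is an assumption; it is not taken from AbiWord.
RUSSIAN_ENCODINGS = ["koi8_r", "cp1251", "iso8859_5", "cp866"]


def to_system_charset(text: str, charset: str) -> bytes:
    """Convert a Unicode UI string to the given charset.

    Characters the charset cannot represent are replaced with '?'.
    """
    return text.encode(charset, errors="replace")


label = "Файл"  # Russian for "File"
encoded = {name: to_system_charset(label, name) for name in RUSSIAN_ENCODINGS}

# Print the raw bytes per charset; every row differs.
for name, raw in encoded.items():
    print(f"{name:10} {raw.hex()}")
```

 A menu string stored in one of these charsets and displayed under another
would show up as garbage, which is why the conversion has to target whatever
charset the system is actually using.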

 Also, I'm sorry for going a bit off-topic in this thread: recently somebody
complained that we don't have any way to associate an ispell dictionary name with
a particular locale (e.g. use spanish.hash under a Spanish locale). That's not so -
this has been implemented since 0.7.12 - see the system.profile* files we ship.
For example, system.profile-ru as currently shipped already contains
the following to activate russian.hash as the default ispell dictionary:

<SystemDefaults
    SpellCheckWordList="russian.hash"
    RulerUnits="cm"
/>

When reading system.profile at startup, AW also looks for files named
system.profile-SUFFIX, where SUFFIX is any of:
        language code (e.g. ru)
        locale name (e.g. ru-RU)
        encoding name (e.g. KOI8-R)
So all locale-specific or language-specific settings can be put in such
special system.profile-* files.
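
 The naming scheme above can be sketched roughly as follows. This is a Python
illustration only, not the AbiWord implementation: the function names are
hypothetical, and the order in which the files are read (and whether later
files override earlier ones) is an assumption.

```python
import os


def profile_candidates(language, locale, encoding, base="system.profile"):
    """Return the base profile name followed by its suffixed variants.

    The order (language, locale, encoding) simply mirrors the list above;
    the real lookup order is an assumption here.
    """
    names = [base]
    for suffix in (language, locale, encoding):
        if suffix:  # skip suffixes that are unknown or empty
            names.append(f"{base}-{suffix}")
    return names


def existing_profiles(directory, candidates):
    """Keep only the candidate files that actually exist in `directory`."""
    paths = (os.path.join(directory, name) for name in candidates)
    return [path for path in paths if os.path.isfile(path)]


print(profile_candidates("ru", "ru-RU", "KOI8-R"))
# ['system.profile', 'system.profile-ru', 'system.profile-ru-RU', 'system.profile-KOI8-R']
```

 With a scheme like this, a dictionary setting that applies to every Russian
user would go into system.profile-ru, while anything specific to one encoding
would go into system.profile-KOI8-R.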

 Best regards,
  -Vlad


