From: Paul Rohr (paul@abisource.com)
Date: Mon Apr 29 2002 - 20:04:40 EDT
One offshoot of the recent i18n/Pango discussion is that it finally dawned
on me just how powerful our *existing* Unicode support in 1.0 already is --
without BiDi or Pango.
Provided that users can locate appropriate fonts, that is.
It might be helpful to segregate the languages we support into the following
broad categories:
1. easy
2. easy, with the right font
3. bidi
4. complex shaping required (including combining characters)
As the World.abw test document demonstrates, there are a *lot* of languages
which fall into the first two categories.
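To make the four buckets a bit more concrete, here is a rough Python sketch that guesses a category for a text sample from the Unicode character database. This is only a heuristic under stated assumptions, not anything AbiWord does today: the function name `classify` is made up, the "anything above U+00FF needs the right font" cutoff is my own simplification, and real shaping needs (Arabic joining, for instance) go well beyond what this checks.

```python
import unicodedata


def classify(text):
    """Guess which of the four categories a text sample falls into.

    Heuristic only (assumptions, not AbiWord behavior):
      - any combining or spacing mark  -> "complex"
      - any strong right-to-left char  -> "bidi"
      - any code point above U+00FF    -> "easy, with the right font"
      - otherwise                      -> "easy"
    """
    needs_bidi = False
    needs_font = False
    for ch in text:
        # Nonzero combining class, or a mark category, suggests shaping work.
        if unicodedata.combining(ch) or unicodedata.category(ch) in ("Mn", "Mc"):
            return "complex"
        # "R" (e.g. Hebrew) and "AL" (Arabic letters) are strong RTL types.
        if unicodedata.bidirectional(ch) in ("R", "AL"):
            needs_bidi = True
        elif ord(ch) > 0xFF:
            # Assumed cutoff: outside Latin-1, so a suitable font is needed.
            needs_font = True
    if needs_bidi:
        return "bidi"
    if needs_font:
        return "easy, with the right font"
    return "easy"
```

Running something like this over each sample in World.abw would give a first, very approximate sorting that the i18n folks could then correct by hand.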
the "just fonts" languages
--------------------------
Not only are there thirty-some Latin-1 languages which definitely fall into
the first category (most fonts support them), but some of the small,
general-purpose Unicode fonts being deployed add "just enough" glyphs to
support an even broader range of languages.
http://www.abisource.com/mailinglists/abiword-dev/02/Apr/1036.html
Indeed, after doing some more digging, I've found that we can support
content in many more languages by just locating a font that includes enough
glyphs in the appropriate Unicode range.
http://www.alanwood.net/unicode/fonts.html
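The "enough glyphs" test is easy to sketch in Python, assuming we already have the set of code points a font covers. How to extract that set from a font file is deliberately left out here, and both function names below are hypothetical.

```python
def missing_codepoints(sample, covered):
    """Return the sorted code points in `sample` that `covered` lacks.

    `covered` is assumed to be an iterable of integer code points that
    some font provides glyphs for; whitespace in the sample is ignored.
    """
    needed = {ord(ch) for ch in sample if not ch.isspace()}
    return sorted(needed - set(covered))


def font_covers(sample, covered):
    """True if every non-whitespace character in `sample` is covered."""
    return not missing_codepoints(sample, covered)
```

Note that this only answers the "just fonts" question. A font can cover every code point in a sample and still render it badly if the language also needs BiDi or shaping.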
For example, the government of Nunavut has recently created Unicode fonts
for Inuktitut:
http://www.assembly.nu.ca/unicode/fonts/
http://www.assembly.nu.ca/unicode/fonts/beginner.html
I can't read them, of course, but they sure look pretty. :-)
the "harder" languages
----------------------
Of course, there *are* languages for which we'll need more than just fonts.
For example, Tomas has hand-coded a lot of support for bidi languages, a
category which includes:
ar (Arabic), fa (Persian), he (Hebrew), ur (Urdu), yi (Yiddish)
Now we're investigating Pango since, in addition to BiDi support, it should
(eventually) encapsulate knowledge about the more complex typographic needs
of languages which don't have discrete Unicode codepoints for all of the
glyphs needed. Andrew keeps mentioning Vietnamese (vi-VN), and I know that
many South and Southeast Asian languages need this too, but how extensive is
the rest of this category?
the question
------------
OK, i18n experts ... is this a useful, clean distinction? If not, please
let me know what I've garbled here.
bottom line
-----------
I'm thrilled that we've got dedicated folks working on solving the "harder"
language problems. However, I'd love to see some folks do more research on
improving our support for "just fonts" languages as follows:
- come up with a complete list of such languages
- come up with a list of the fonts needed to support each of them
Note that this is essentially a web research task, not a coding task. The
ultimate goal would be to learn enough so that we could write a quick
website entry for each language, telling users:
- who's responsible for the translation
- where to find dictionaries (if any)
- where to find fonts
- etc.
For example, two sample entries might be:
Indonesian (id-ID)
------------------
translators: Tim Allen, ...
dictionary: (n/a)
fonts: ...
sample: (the UTF-8 gobbledygook from World.abw)
picture: (screenshot of the same)
Inuktitut (iu-CA)
-----------------
translators: (n/a)
dictionary: (n/a)
fonts: http://www.assembly.nu.ca/unicode/fonts/
sample: (the UTF-8 gobbledygook from World.abw)
picture: (screenshot of the same)
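If the research results were kept as structured data, entries like these could be generated rather than hand-edited. A minimal Python sketch, assuming a made-up `LanguageEntry` record that holds only the translators, dictionary and fonts fields (the sample and picture lines are left out for brevity):

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LanguageEntry:
    """One per-language website entry (hypothetical structure)."""
    name: str
    tag: str
    translators: List[str] = field(default_factory=list)
    dictionary: Optional[str] = None
    fonts: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the entry in the plain-text layout used above."""
        title = f"{self.name} ({self.tag})"

        def value(items: List[str]) -> str:
            # Empty lists are shown as "(n/a)", as in the sample entries.
            return ", ".join(items) if items else "(n/a)"

        lines = [
            title,
            "-" * len(title),
            f"translators: {value(self.translators)}",
            f"dictionary: {self.dictionary or '(n/a)'}",
            f"fonts: {value(self.fonts)}",
        ]
        return "\n".join(lines)
```

For instance, `LanguageEntry("Inuktitut", "iu-CA", fonts=["http://www.assembly.nu.ca/unicode/fonts/"]).render()` would produce the first five lines of the Inuktitut entry.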
Best of all, this could increase our language support for the 1.0.* series
of products, while waiting for all the hard coding work to get done for the
set of other languages which actually *do* need BiDi and/or Pango.
Does this sound interesting? Is anyone interested in coordinating such an
effort? It seems like a large task to write up as a uPOW.
Paul