Re: i18n of abiword -- combining characters (Gujarati)


From: Pierre Abbat (phma@oltronics.net)
Date: Sat Jan 15 2000 - 20:37:48 CST


I just pulled up the Gujarati Unicode chart
(http://charts.unicode.org/Unicode.charts/normal/U0A80.html). As you mentioned
for Thai, not all of the glyphs are there.

The vowels are listed twice: first as isolated letters (U+0A85-U+0A94) and then
as combining forms (U+0ABE-U+0ACC). This suggests that the chart is trying to
list glyphs, not letters. However, the combining forms of R are not listed, only
the standalone form (U+0AB0). The combining forms of R are far more common than
the virama (U+0ACD), of which there may be only one in the entire Bible. (The
virama was used in Sanskrit to indicate that a word ends in a consonant, which
is rare in Gujarati.) There is no code for K+SS (which has its own totally
different glyph, as I mentioned earlier) or for J+NY (likewise).
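
For anyone who wants to check the chart against a Unicode database, here is a
rough Python sketch that lists the code points mentioned above with their names
and general categories. The helper name describe() and the constants are my own
inventions for illustration. The last two strings rest on an assumption not
stated in the chart itself: that the Unicode model spells conjuncts such as
K+SS as consonant + virama + consonant and leaves the special glyph to the
font.

```python
import unicodedata


def describe(first, last):
    """Return (code point, general category, name) for each assigned code point."""
    rows = []
    for cp in range(first, last + 1):
        ch = chr(cp)
        name = unicodedata.name(ch, None)  # None when the code point is unassigned
        if name is not None:
            rows.append((f"U+{cp:04X}", unicodedata.category(ch), name))
    return rows


# Ranges and single code points named in the message.
INDEPENDENT_VOWELS = describe(0x0A85, 0x0A94)
VOWEL_SIGNS = describe(0x0ABE, 0x0ACC)
RA = describe(0x0AB0, 0x0AB0)
VIRAMA = describe(0x0ACD, 0x0ACD)

# Assumption: conjuncts are encoded as sequences, not as single code points.
K_SSA = "\u0a95\u0acd\u0ab7"  # KA + VIRAMA + SSA
J_NYA = "\u0a9c\u0acd\u0a9e"  # JA + VIRAMA + NYA

if __name__ == "__main__":
    for row in INDEPENDENT_VOWELS + VOWEL_SIGNS + RA + VIRAMA:
        print(*row)
    for seq in (K_SSA, J_NYA):
        print([unicodedata.name(ch) for ch in seq])
```

The general category is the useful column here: the isolated vowels come out as
letters (Lo) and the combining forms as marks (Mn or Mc), which is one way to
see that the two lists are not the same kind of thing.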

If you try to alphabetize with this chart, you will be led astray. In Gujarati
alphabetical order, anusvara (U+0A82) and visarga (U+0A83) come between the
vowels and the consonants, not before the vowels as they do in Unicode. (This
ignores a rule that changes the n in e.g. dashanshamanthi "1/10 (some form)"
from anusvara to a nasal consonant.)
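
To make the ordering difference concrete, here is a toy Python sort key. It is
only a sketch under loud assumptions: the name toy_key() is hypothetical, it
moves nothing but anusvara and visarga (placing them just after the last
isolated vowel, U+0A94, and before the first consonant, U+0A95), and it ignores
the combining vowel forms and the nasal-consonant rule entirely. It is not a
real Gujarati collation.

```python
ANUSVARA = "\u0a82"
VISARGA = "\u0a83"
LAST_VOWEL = 0x0A94  # GUJARATI LETTER AU, the last isolated vowel in the chart


def toy_key(text):
    """Sort key that places anusvara and visarga between vowels and consonants.

    Assumption: every other character keeps its code point order.
    """
    key = []
    for ch in text:
        if ch == ANUSVARA:
            key.append((LAST_VOWEL, 1))  # just after the last vowel
        elif ch == VISARGA:
            key.append((LAST_VOWEL, 2))  # after anusvara, before U+0A95 KA
        else:
            key.append((ord(ch), 0))
    return tuple(key)


# KA, anusvara, A, visarga as standalone items.
letters = ["\u0a95", ANUSVARA, "\u0a85", VISARGA]
by_code_point = sorted(letters)
by_toy_key = sorted(letters, key=toy_key)

if __name__ == "__main__":
    print([f"U+{ord(ch):04X}" for ch in by_code_point])
    print([f"U+{ord(ch):04X}" for ch in by_toy_key])
```

Sorting by raw code point puts the two signs first; the toy key puts them after
the vowel A and before the consonant KA.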

phma


