Subject: Re: i18n of abiword -- combining characters
From: Pierre Abbat (phma@oltronics.net)
Date: Sat Jan 15 2000 - 17:59:15 CST


It's even worse in Gujarati and other Nagari scripts. Take the word "murti"
(idol). The i is short, so it is written before the t. The r is written above
the t, as if it were an exponent. It looks like this (the dandas, or vertical
strokes, belong to the vowels, but appear only on consonants that have them,
as the m and t do):

                    r
            iii r
           i ii r
 m u i i
 m u i i
 m u i tttti
mmmmmmu i t i
 m u i t i
      u i t i
      u i t i
     uu
    u u
     u u
        u
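
To make the reordering concrete, here is a rough Python sketch. It assumes the
text is stored in Unicode in spoken order (consonant first, vowel sign after),
and that the display code does the shuffling. The function visual_order and the
"REPH" label are made up for illustration, and I have spelled the word with the
short u sign, following the transliteration above. A real shaping engine does
far more than this.

```python
import unicodedata

RA = "\N{GUJARATI LETTER RA}"
VIRAMA = "\N{GUJARATI SIGN VIRAMA}"
# Vowel signs drawn to the left of the consonant they follow in speech.
PRE_BASE = {"\N{GUJARATI VOWEL SIGN I}"}

# "murti" in spoken (storage) order: m, u, r, virama, t, i
MURTI = (
    "\N{GUJARATI LETTER MA}"
    "\N{GUJARATI VOWEL SIGN U}"
    + RA + VIRAMA +
    "\N{GUJARATI LETTER TA}"
    "\N{GUJARATI VOWEL SIGN I}"
)


def visual_order(cluster):
    """Toy left-to-right drawing order for a single consonant cluster."""
    chars = list(cluster)
    reph = False
    # An initial r + virama is drawn as a mark above the cluster.
    if len(chars) > 2 and chars[0] == RA and chars[1] == VIRAMA:
        reph = True
        chars = chars[2:]
    pre = [c for c in chars if c in PRE_BASE]
    rest = [c for c in chars if c not in PRE_BASE]
    labels = [unicodedata.name(c) for c in pre + rest]
    if reph:
        labels.append("REPH (RA drawn above)")
    return labels


# The second cluster of the word is r + virama + t + i.
print(visual_order(MURTI[2:]))
```

The point is only that the typist and the file never see the visual order; the
i comes after the t in storage and is moved in front when drawn.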

Then there are the ligatures. In the word "atma" (soul), the t loses its danda
and is joined to the m. The first and last letters are both a, yet they
look different.

  aaa a a m a a
     a a a m a a
     a a a a tttt m a a
  aaa aaaa a t mmmmmma a
    a a a a t m a a
     aaa a a t a a
           a a t a a
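
Here is a sketch of how an input layer might handle this, assuming Unicode
storage, where the standalone vowel and the vowel sign are separate code points
and a virama between two consonants asks for the joined form. The key names
("A", "t", "m") and the function encode are hypothetical, not anything AbiWord
has. It handles only the letters needed for this one word.

```python
import unicodedata

# Hypothetical key-to-letter tables, just large enough for "atma".
CONSONANTS = {
    "t": "\N{GUJARATI LETTER TA}",
    "m": "\N{GUJARATI LETTER MA}",
}
INDEPENDENT = {"A": "\N{GUJARATI LETTER AA}"}      # vowel standing alone
DEPENDENT = {"A": "\N{GUJARATI VOWEL SIGN AA}"}    # vowel after a consonant
VIRAMA = "\N{GUJARATI SIGN VIRAMA}"


def encode(keys):
    """Turn a key sequence into code points, choosing the vowel form."""
    out = []
    prev_consonant = False
    for key in keys:
        if key in CONSONANTS:
            # Two consonants in a row: the first loses its inherent vowel.
            if prev_consonant:
                out.append(VIRAMA)
            out.append(CONSONANTS[key])
            prev_consonant = True
        elif key in INDEPENDENT:
            table = DEPENDENT if prev_consonant else INDEPENDENT
            out.append(table[key])
            prev_consonant = False
        else:
            raise KeyError(key)
    return "".join(out)


for ch in encode("AtmA"):
    print(unicodedata.name(ch))
```

The typist presses the same "A" key twice and the same "t" key as always; the
program decides which shape each one takes.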

And the r, previously mentioned, has three forms: the candra, or moon, in the
exponent position; its full form, identical with the numeral 2; and a diagonal
line with positive slope, used when the r follows another consonant.

That's fine, as long as the letters have dandas. But some don't. They have
their idiosyncratic ways of combining, which just have to be memorized. The
four letters "rkshi" (the retroflex sh) combine to make one character which
looks totally different from the four letters individually. And even a few CV
combinations are unique: ru and, in Gujarati, ji.
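
The combinations that just have to be memorized could live in a lookup table
that the display code consults, longest match first. This is a sketch under
that assumption; the glyph names ("k_ssa", "ru", "ji") and the function shape
are invented for illustration, and the table is nowhere near complete.

```python
import unicodedata

KA = "\N{GUJARATI LETTER KA}"
SSA = "\N{GUJARATI LETTER SSA}"      # the retroflex sh
RA = "\N{GUJARATI LETTER RA}"
JA = "\N{GUJARATI LETTER JA}"
MA = "\N{GUJARATI LETTER MA}"
VIRAMA = "\N{GUJARATI SIGN VIRAMA}"
VOWEL_U = "\N{GUJARATI VOWEL SIGN U}"
VOWEL_I = "\N{GUJARATI VOWEL SIGN I}"

# Hypothetical table of irregular shapes, keyed by stored sequence.
LIGATURES = {
    (KA, VIRAMA, SSA): "k_ssa",
    (RA, VOWEL_U): "ru",
    (JA, VOWEL_I): "ji",
}
LONGEST = max(len(key) for key in LIGATURES)


def shape(text):
    """Greedy longest-match substitution of irregular combinations."""
    glyphs = []
    i = 0
    while i < len(text):
        for n in range(LONGEST, 1, -1):
            key = tuple(text[i:i + n])
            if key in LIGATURES:
                glyphs.append(LIGATURES[key])
                i += len(key)
                break
        else:
            # No special form: fall back to the plain letter.
            glyphs.append(unicodedata.name(text[i]))
            i += 1
    return glyphs


print(shape(RA + VOWEL_U + MA))
```

Everything not in the table falls through to the regular rules, so only the
exceptions need to be listed.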

If I were typing Gujarati, I'd want the computer to take care of the whole
thing. It's hard enough learning a 50-character keyboard without having to
remember which shift key to use when the r precedes a consonant.

phma


