Re: i18n of abiword -- combining characters


Subject: Re: i18n of abiword -- combining characters
From: Pierre Abbat (phma@oltronics.net)
Date: Sun Jan 16 2000 - 19:03:19 CST


I just pulled out Michael Coulson's Sanskrit book and counted 106 non-trivial
ligatures (ligatures not formable by removing danda from previously defined
consonants or ligatures and changing r preceded by another consonant to a
slash). I'll probably get a different number next time I count them.

Positions 01-7f are occupied by the existing Unicode characters. (71-7f in
Devanagari are blank, but these are occupied in Bengali by currency-related
symbols, so I can't use them.) 36 of the high-bit positions (which will print
as Bengali if you add the Devanagari offset, so we'll have to figure out how to
handle them) are the bare consonants. That leaves 92 of the 128 high-bit
positions, which isn't enough room for 106 ligatures.
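
To make the arithmetic concrete, here is a rough sketch (not anything from
AbiWord). The names to_unicode and lands_in_bengali are made up for
illustration, and it assumes all 128 high-bit positions would otherwise be
free for consonants and ligatures.

```python
DEVANAGARI_BASE = 0x0900  # Unicode Devanagari block is U+0900..U+097F
BENGALI_START = 0x0980    # Unicode Bengali block is U+0980..U+09FF


def to_unicode(code):
    """Map a hypothetical 8-bit position to a code point by adding the block offset."""
    if not 0 <= code <= 0xFF:
        raise ValueError("expected an 8-bit position")
    return DEVANAGARI_BASE + code


def lands_in_bengali(code):
    """True if adding the Devanagari offset pushes the position into the Bengali block."""
    return to_unicode(code) >= BENGALI_START


# Capacity check, using the counts quoted in this message.
HIGH_BIT_SLOTS = 0x100 - 0x80   # positions 80-ff
BARE_CONSONANTS = 36
LIGATURES_NEEDED = 106

free_slots = HIGH_BIT_SLOTS - BARE_CONSONANTS
shortfall = LIGATURES_NEEDED - free_slots

print(f"position 15 -> U+{to_unicode(0x15):04X}")
print(f"position 80 lands in Bengali: {lands_in_bengali(0x80)}")
print(f"free high-bit slots: {free_slots}, shortfall: {shortfall}")
```

Under those assumptions the high half is short by 14 positions, before even
deciding what to do about the Bengali overlap.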

Many of the ligatures are peculiar to Sanskrit. Sanskrit words can end in a
variety of consonants that never end a word in modern Indian languages, and
these ligatures occur when the last letter of one word is joined to the first
letter of the next, which isn't done in modern Indian languages. But Unicode
includes, besides letters like vocalic L which are used only in Sanskrit, a
series of letters with dots under them, which I suppose are used to represent
Arabic or Urdu or other foreign sounds. I haven't included those in my bare
consonants.
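
On the dotted letters, a small sketch using Python's standard unicodedata
module: the precomposed dotted consonants at U+0958 to U+095F each decompose
canonically into a plain consonant plus the nukta sign U+093C. The helper name
split_dotted is made up, and whether a font encoding should treat them this
way is an assumption, not something settled here.

```python
import unicodedata

NUKTA = "\u093c"  # DEVANAGARI SIGN NUKTA, the combining dot below


def split_dotted(ch):
    """Return (base consonant, nukta) if ch decomposes that way under NFD, else None."""
    decomposed = unicodedata.normalize("NFD", ch)
    if len(decomposed) == 2 and decomposed[1] == NUKTA:
        return decomposed[0], decomposed[1]
    return None


# The eight precomposed dotted consonants in the Devanagari block.
dotted = [chr(cp) for cp in range(0x0958, 0x0960)]

for ch in dotted:
    base, _ = split_dotted(ch)
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} = U+{ord(base):04X} + U+093C")
```

If they are handled as base consonant plus a combining dot, they would not
need positions of their own among the bare consonants.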

How do I handle all these letters?

phma


