Re: i18n of abiword -- combining characters


Subject: Re: i18n of abiword -- combining characters
From: Kevin Vajk (kvajk@ricochet.net)
Date: Sat Jan 15 2000 - 15:10:01 CST


On Sat, 15 Jan 2000, Paul Rohr wrote:

> Some languages may indeed have code points for all the composite glyphs
> needed. However, as far as I can tell, this is *not* true for Thai.

It's even worse than it looks from this chart. :(

In Thai, you can think of a vowel as being associated with the
consonant it is pronounced after. So, if the word is "nahm",
the "ah" vowel is associated with the "n" consonant. But the
vowel is not necessarily written after the consonant; it may be
written above or below it, as you can see from the chart.

But it may also be written *before* the consonant, as is the
case for 0E43. So, as a specific example, 0E43 is pronounced
"ai", and 0E19 is pronounced "n", so 0E43-0E19 is pronounced
"nai". Make sense?
It might be worthwhile to store it as 0E19,0E43, but know
to display it in the reverse order. I don't know. :(
I mean, if I wanted to search my document for words like
"n*", I'd want "nai" to show up, I think.
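
To make the store-one-way, display-the-other idea concrete, here is a
rough sketch in Python. Everything in it is an assumption for
illustration only: the name to_display_order is made up, the set of
"written before" vowels is taken to be 0E40 through 0E44, and each such
vowel is assumed to sit immediately after a single consonant in the
stored text. It is not how AbiWord or any existing library does it.

```python
# Thai vowels written *before* the consonant they are pronounced after.
# Assumption: the five code points 0E40..0E44 cover this set.
LEADING_VOWELS = {"\u0e40", "\u0e41", "\u0e42", "\u0e43", "\u0e44"}


def to_display_order(stored: str) -> str:
    """Hypothetical helper: move each leading vowel in front of the
    character stored just before it (assumed to be its consonant)."""
    out = []
    for ch in stored:
        if ch in LEADING_VOWELS and out:
            # Place the vowel before the previously stored character.
            out.insert(len(out) - 1, ch)
        else:
            out.append(ch)
    return "".join(out)


NO_NU = "\u0e19"            # the "n" consonant
SARA_AI_MAIMUAN = "\u0e43"  # the "ai" vowel, written before its consonant

stored = NO_NU + SARA_AI_MAIMUAN      # stored as 0E19,0E43
displayed = to_display_order(stored)  # shown as 0E43,0E19

# A search for words starting with "n" matches the stored form,
# but would miss the word if only the display order were kept.
matches_stored = stored.startswith(NO_NU)
matches_displayed = displayed.startswith(NO_NU)
```

With this arrangement a search for "n*" finds "nai" in the stored
text, while the same search against the display order does not.
The sketch ignores marks written above or below the consonant and
consonant clusters, both of which would need more care.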

There are also distinct vowels not represented in this chart,
formed by combining some of the other vowel symbols around
a consonant.
For example, 0E4D+0E32=0E33, right? (This vowel is written
after its consonant.) But there's also 0E40+0E30, with the
consonant *between* them. On Thai vowel charts in schools,
this is a unique vowel, considered to be written "around"
the consonant, but it's not shown in this table.
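
One way to check claims like these is to ask the Unicode character
database directly. The sketch below uses Python's standard unicodedata
module; the variable names and the choice of 0E19 as the sample
consonant are just for illustration. It looks up the decomposition
recorded for 0E33 (the database lists 0E4D followed by 0E32) and shows
that the "around" vowel 0E40 + consonant + 0E30 stays as three separate
code points under normalization, i.e. it has no code point of its own.

```python
import unicodedata

SARA_AM = "\u0e33"
SARA_E = "\u0e40"   # written before the consonant
SARA_A = "\u0e30"   # written after the consonant
NO_NU = "\u0e19"    # sample consonant, chosen only for illustration

# Decomposition mapping recorded for 0E33 in the character database.
sara_am_mapping = unicodedata.decomposition(SARA_AM)

# Compatibility decomposition splits 0E33 into its two parts.
sara_am_parts = unicodedata.normalize("NFKD", SARA_AM)

# The "around" vowel: 0E40, then the consonant, then 0E30.
around = SARA_E + NO_NU + SARA_A

# Canonical composition has nothing to combine here, so the
# sequence keeps all three code points.
around_nfc = unicodedata.normalize("NFC", around)

for ch in around_nfc:
    print(f"{ord(ch):04X} {unicodedata.name(ch)}")
```

So an editor that wants to treat the "around" vowel as one unit has to
recognise the sequence itself; normalization will not hand it over as
a single character.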

By the way, Cambodian is almost exactly the same as Thai
in this respect. I expect there are others, too.

- Kevin Vajk
  <kvajk@ricochet.net>


