Re: CJK patch (was Re: pango)

From: Theppitak Karoonboonyanan <theppitak_at_gmail.com>
Date: Mon Mar 21 2005 - 10:36:50 CET

On Mon, 21 Mar 2005 08:16:54 +0000, Tomas Frydrych
<tomasfrydrych@yahoo.co.uk> wrote:
>
> > Is this likely to be an issue for you? Are the line breaking rules for
> > Thai substantially different from those for English? If they are, would you be
> > able to briefly describe them? It would also help if you could send me a test
> > document with some representative Thai text and suggest some fonts that will
> > enable me to read it.
>
> Thai is too complex for us to support fully without Pango.

Well, not really, provided that appropriate APIs are defined.
We have a collection of Thai implementation code ready to plug in at
http://libthai.sourceforge.net/, including text rendering, text input,
and word breaking.

However, Pango can help a lot with its core engine and APIs, as well as its
sub-engines for various scripts. (A third-party Thai language engine for
Pango, with word break support, is also available from libthai at the site
above.)

Thai is written without spaces between words, so line breaking first has
to find the word boundaries. Rule-based approaches cannot do this with
sufficient accuracy, as there are too many ambiguities.
Most implementations use dictionary-based approaches, which are almost
100% accurate. Much NLP research is still being done to get as
close to 100% as possible.
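
To illustrate the ambiguity, here is a minimal sketch in Python of
dictionary-based word breaking. It is not libthai's API or algorithm: the
function names, the four-word toy dictionary and the sample string are all
assumptions made for this example. It simply lists every way to split a
string into dictionary words, with the break offsets each split allows.

```python
# Toy dictionary (an assumption for this sketch; a real one holds
# tens of thousands of entries).
TOY_DICTIONARY = {"ตา", "กลม", "ตาก", "ลม"}

# A classic ambiguous string: it can be read as "ตา กลม" or "ตาก ลม".
AMBIGUOUS = "ตากลม"


def segmentations(text, dictionary):
    """Return every way to split text into words found in dictionary."""
    if not text:
        return [[]]  # one way to split the empty string: no words
    results = []
    for end in range(1, len(text) + 1):
        word = text[:end]
        if word in dictionary:
            # Keep this word and segment whatever follows it.
            for rest in segmentations(text[end:], dictionary):
                results.append([word] + rest)
    return results


def break_positions(words):
    """Return the character offsets where a line break may fall."""
    positions = []
    offset = 0
    for word in words[:-1]:  # no break opportunity after the last word
        offset += len(word)
        positions.append(offset)
    return positions


if __name__ == "__main__":
    for words in segmentations(AMBIGUOUS, TOY_DICTIONARY):
        print(" | ".join(words), break_positions(words))
```

With this toy dictionary the string has two valid splits, one allowing a
break after the second character and one after the third. The dictionary
alone does not say which is right, so a real implementation needs some
strategy for choosing among the candidates.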

libthai uses the dictionary approach. Is it possible to plug it into the CJK
rules?

Regards,
-Thep.
Received on Mon Mar 21 10:37:30 2005
