Re: CJK line breaking

From: Roland Kay <roland.kay_at_ox.compsoc.net>
Date: Mon Mar 07 2005 - 15:43:42 CET

Attached is a prototype patch that implements Chinese line breaking, hopefully
without breaking western line breaking. It is active whatever locale the user is
running in. This allows people in London to write in Chinese, for example.
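
To give a rough idea of what "Chinese line breaking" means here, the following is a minimal stand-alone sketch, not the code from the patch. The helper names (isCjkIdeograph, isCjkClosingPunct, canBreakBetween) are made up for illustration, and the rule set is an assumption: a break is allowed next to a CJK ideograph, never before a few common closing punctuation marks, and otherwise only after a space.

```cpp
// Hypothetical helper, not taken from the patch: true for code points in the
// CJK Unified Ideographs block (U+4E00..U+9FFF).
bool isCjkIdeograph(char32_t c)
{
    return c >= 0x4E00 && c <= 0x9FFF;
}

// Hypothetical helper: a few closing punctuation marks that should not
// start a line (ideographic comma, ideographic full stop, fullwidth comma).
bool isCjkClosingPunct(char32_t c)
{
    return c == 0x3001 || c == 0x3002 || c == 0xFF0C;
}

// Sketch of a locale-independent break test between two adjacent characters.
// Assumed rules only; the real patch may decide differently.
bool canBreakBetween(char32_t prev, char32_t next)
{
    if (isCjkClosingPunct(next))
        return false;   // keep closing punctuation with the preceding text
    if (prev == U' ')
        return true;    // western rule: a break is allowed after a space
    // CJK rule: a break is allowed on either side of an ideograph.
    return isCjkIdeograph(prev) || isCjkIdeograph(next);
}
```

Because the test looks only at the characters themselves, it does not depend on the user's locale, which is the property described above.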

There are a couple of issues.

1,

Previously, the part of the following statement after "&&" was commented
out. I had to re-enable it because otherwise, with the Chinese line breaking, Abi
would try to split runs after the last character of the document. This led
to a fatal assert. I don't know what the consequences of enabling this line
might be, though.

fp_TextRun.cpp:567
------------------
 if (bForce || iNext == (UT_sint32)i || bCanBreak
     && ((i + offset) != (getBlockOffset() + getLength() - 1))) // KAY: Enabled
 {
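
One thing to keep in mind when reading this condition: in C++ "&&" binds more tightly than "||", so the re-enabled check only guards the bCanBreak case. The stand-alone model below uses made-up parameter names (atNext stands for iNext == i, isLastChar for the offset comparison) and writes the grouping out explicitly. It is only a sketch of how the expression parses, not a claim about what the intended behaviour is.

```cpp
// Stand-alone model of the split condition; the parameter names are
// illustrative and do not appear in fp_TextRun.cpp.
//   bForce     : the caller forces a split
//   atNext     : models "iNext == (UT_sint32)i"
//   bCanBreak  : the break test allows a break here
//   isLastChar : models "(i + offset) == getBlockOffset() + getLength() - 1"
bool shouldSplit(bool bForce, bool atNext, bool bCanBreak, bool isLastChar)
{
    // Parenthesised the way the compiler groups the original expression:
    // the last-character guard applies to bCanBreak only.
    return bForce || atNext || (bCanBreak && !isLastChar);
}
```

So with this grouping, a forced split, or a split where iNext == i, is still taken at the last character; only the bCanBreak path is suppressed there.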

2,

Loading the attached document now triggers the following assert twice:

 **** (1) Assert ****
 **** (1) iMaxWidth <= getPage()->getWidth() at fp_Line.cpp:2855 ****
 **** (1) Continue ? (y/n) [y] :

although if I respond with "y" the document loads with no apparent problems.
Unless the new line breaking rules are exposing existing bugs, I don't really
see why returning different true/false values from the GR_Graphics::canBreak
function should lead to this error. Alternatively, it might be caused by one of
the small changes outside of that function, e.g. item 1 above, or the fact that
I commented out the calls to text.setUpperLimit as Tomas suggested.

The document loads OK in an unmodified version of Abi, but there the characters
are in different positions, so it's not a very good comparison.

3,

I don't know anything about Korean or Japanese, so at the moment the Korean
and Japanese line breaking is "a la Chinese".

Please take a look and tell me what you think. I'm sorry if the code is a little
messy. I've tried to keep it fairly clean.

Best wishes,

R.
Received on Mon Mar 7 15:47:21 2005
