From: Tomas Frydrych (tomas@frydrych.uklinux.net)
Date: Tue Apr 29 2003 - 15:29:38 EDT
This is to bring everyone up to speed on the bidi functionality.
A. STABLE
Bidi in stable is badly broken (has been since 1.0.3; my fault
entirely), and I have no intention of fixing it; the bidi code is too
different from that used in HEAD and I simply do not have the time
for this (in addition, the bidi functionality was never working well
enough in STABLE for it to be broadly usable). Consequently, I will
change the status of all bidi bugs filed against stable to 'wont fix',
asking the reporters to use HEAD instead. A note to this effect
should be added to the download page, which should contain links to
the precompiled win32 daily snapshots of HEAD.
B. HEAD
1. General Progress
Good progress has been made on the development code, and the major
bugs associated with the new (more efficient) code introduced after
1.0.x have now been fixed (I think and hope); essentially, 2.0 will
be a usable bidi wordprocessor.
2. Arabic Support
The basic Arabic shaping engine is, thanks to help from a Lebanese
user, working reasonably well (basic shaping, two character
ligatures, combining diacritics). This should make AbiWord usable for
basic Arabic wordprocessing (and hopefully draw more feedback, and
perhaps developers in the future). Apart from essential bug fixing, I
do not intend to develop the shaping capabilities any further -- this
is really a job for a third party library after the 2.0 release.
3. Import/Export
a. Text Imp/Exp
All our importers can now handle Unicode-based explicit overrides
(although some fixes are still on my list); our text exporters do so
as well. Correct export/import of the paragraph direction is not
handled yet (next on my list).
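
For testers preparing plain-text samples, it may help to see what the
explicit directional characters look like in a text stream. The sketch
below is Python, not AbiWord code; the table lists the code points the
Unicode standard assigns to LRM/RLM/LRE/RLE/PDF/LRO/RLO, and the helper
names (describe, strip_bidi_controls) are made up for this illustration.

```python
# Explicit directional characters defined by the Unicode standard.
# Illustration only; the helper names below are hypothetical.
BIDI_CONTROLS = {
    "LRM": "\u200e",  # LEFT-TO-RIGHT MARK
    "RLM": "\u200f",  # RIGHT-TO-LEFT MARK
    "LRE": "\u202a",  # LEFT-TO-RIGHT EMBEDDING
    "RLE": "\u202b",  # RIGHT-TO-LEFT EMBEDDING
    "PDF": "\u202c",  # POP DIRECTIONAL FORMATTING (closes LRE/RLE/LRO/RLO)
    "LRO": "\u202d",  # LEFT-TO-RIGHT OVERRIDE
    "RLO": "\u202e",  # RIGHT-TO-LEFT OVERRIDE
}


def describe(text):
    """Return (offset, short name) for each explicit directional character."""
    by_char = {ch: name for name, ch in BIDI_CONTROLS.items()}
    return [(i, by_char[ch]) for i, ch in enumerate(text) if ch in by_char]


def strip_bidi_controls(text):
    """Return text with all explicit directional characters removed."""
    controls = set(BIDI_CONTROLS.values())
    return "".join(ch for ch in text if ch not in controls)


if __name__ == "__main__":
    # "def" is wrapped in a right-to-left override, closed by PDF.
    sample = "abc \u202edef\u202c ghi"
    print(describe(sample))
    print(repr(strip_bidi_controls(sample)))
```

Running describe() on a short sample before attaching it to a bug report
is a quick way to confirm which of these characters the file contains.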
b. MS Word
The MS Word importer is getting better, but there are still issues,
particularly with importing numbers, which need to be fixed before
the 2.0 release (next on my list).
c. RTF
I am not sure about the state of the RTF imp/exporter, and this needs
to be working well for the 2.0 release.
C. Unicode Compliance
1. The intention is to follow the Unicode bidi algorithm as closely
as possible. With the recently added support for the LRO/RLO/LRM/RLM
characters we are nearly there, being somewhere between what the
Unicode specs call 'implicit bidirectionality' and 'full
bidirectionality'; we are separated from the latter by (2) and (3)
below.
2. We do not support LRE/RLE characters, and I do not currently
expect this to be in place for 2.0, as this is nowhere near as
important as the other outstanding issues, and depends on (3) below.
3. The Unicode algorithm prescribes that the embedding levels have to
be resolved on an entire paragraph, prior to line breaking. The lines
are then broken and individual lines are reordered according to the
embedding levels. At present we do line breaking first and then
reorder individual lines. This approach is more efficient, since when
text is modified we only need to process the affected lines, not the
entire paragraph. The only situation in which this sequence would
produce different results, AFAIK, is when the text contains
LRO/LRE/RLO/RLE characters. Since we do not use LRO/RLO internally,
this will only become an issue when implementing the LRE/RLE support.
I think the Unicode processing order could be implemented with only a
small performance loss, but will not know until I try it. It would,
however, require a number of changes to block/line/run classes that
are likely to introduce a new set of bugs -- I do not want to do this
prior to 2.0, to avoid the situation we have with the present STABLE
(as far as bidi is concerned, that is).
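
To make the processing order described in (3) concrete, here is a rough
Python sketch of the Unicode sequence: levels are resolved for the whole
paragraph, the paragraph is cut into lines, and each line is then
reordered from its slice of the levels. This is not the AbiWord
implementation. The embedding levels are simply passed in (resolving
them is the hard part and is not shown), the line-break offsets are
given rather than computed, and the function names are invented for the
illustration.

```python
def visual_order(levels):
    """Return logical indices in display order for one line.

    Follows the reordering step of the Unicode bidi algorithm: from the
    highest level down to the lowest odd level, reverse every contiguous
    run of characters whose level is at or above the current level.
    """
    order = list(range(len(levels)))
    if not levels:
        return order
    highest = max(levels)
    # If there are no odd levels, nothing needs reversing.
    lowest_odd = min((lv for lv in levels if lv % 2), default=highest + 1)
    for level in range(highest, lowest_odd - 1, -1):
        i = 0
        while i < len(order):
            if levels[order[i]] >= level:
                j = i
                while j < len(order) and levels[order[j]] >= level:
                    j += 1
                order[i:j] = reversed(order[i:j])
                i = j
            else:
                i += 1
    return order


def reorder_lines(text, levels, breaks):
    """Reorder each line using levels resolved on the whole paragraph.

    `breaks` holds the end offset of each line (hypothetical input; a
    real layout engine would compute these from the available width).
    """
    lines = []
    start = 0
    for end in breaks:
        order = visual_order(levels[start:end])
        lines.append("".join(text[start + k] for k in order))
        start = end
    return lines


if __name__ == "__main__":
    # Upper case stands in for right-to-left characters (level 1).
    print(reorder_lines("abCDEf", [0, 0, 1, 1, 1, 0], [4, 6]))
```

The point of the sketch is that the levels array is fixed before any
line is cut; in the line-first order we use at present, the levels
themselves are computed per line.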
D. Pre-2.0 work
Essentially, the bulk of what urgently needs to be done prior to 2.0
is fixing up the imp/exp code, so that at the least text imp/exp, MS
Word imp and RTF imp/exp are rock solid. For this I need help from
bidi users in testing and bug-filing, so please do not be shy in
filing new bugs (please add 'bidi' to the keywords, and make the
sample docs as short as possible).
Tomas