From: Daniel Glassey (danglassey-abi_at_ntlworld.com)
Date: Mon Nov 10 2003 - 15:57:38 EST
Hi Tomas, working on a reply but just forwarding this to list for now
because you sent it from the address you aren't subscribed from.
Thanks,
Daniel
-----Forwarded Message-----
From: Tomas Frydrych
Cc: abiword-dev_at_abisource.com
Subject: rendering API
Date: Sun, 09 Nov 2003 10:50:08 +0000
Hi Daniel,
> Define a 'decent' API ;)
Indeed, that has to be the first step before we start talking about
using Pango, Graphite, Uniscribe etc.
We need an abstract XP rendering API such that any communication
between our layout classes and a rendering engine (RE) will be
carried through this API, without exception. The big question is how
to best encapsulate the RE in the light of how we process text. This
is an outline of the AW processing that falls into the rendering
category:
A. Current processing overview
=======================
I. Piece Table: contains the raw Unicode text. The important aspect
of the PT for rendering is that the text is stored in a non-sequential
manner. The base class for the RE should provide an API for PT
access; e.g., a shaper will typically require access to a string, so
we have to create the string from the PT data, something like
RE::getUCS4StringFromDocPos(ucs4 * str, uint length, ....);
RE::getUtf8StringFromDocPos(utf8*, ...); etc.
However, converting the PT contents to sequential strings is
time-consuming and needs to be kept to a minimum. It would be ideal
for us if the external RE did not expect input in a sequential
string, but rather allowed the user to provide a text iterator
hook instead, something like
ucs4 getNthUCS4Char(uint n);
which the external RE would use to iterate over our PT.
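
A minimal sketch of what such an iterator hook could look like. The
PieceTable struct, the CharHook type and countSpaces() are hypothetical
stand-ins invented for illustration; they are not AbiWord classes or
the API of any real RE. The only point is that the engine side pulls
characters through a callback and never sees a contiguous string.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a piece table: the text lives in separate
// fragments rather than in one sequential buffer.
struct PieceTable {
    std::vector<std::u32string> pieces;

    // Total number of characters across all pieces.
    std::size_t length() const {
        std::size_t n = 0;
        for (const auto& p : pieces) n += p.size();
        return n;
    }

    // Return the n-th UCS-4 character by walking the pieces in order.
    // Returns 0 when n is past the end of the text.
    char32_t getNthUCS4Char(std::size_t n) const {
        for (const auto& p : pieces) {
            if (n < p.size()) return p[n];
            n -= p.size();
        }
        return 0;
    }
};

// Hypothetical engine-side hook type: the engine receives a callback
// instead of a string.
using CharHook = std::function<char32_t(std::size_t)>;

// Stand-in for real engine work: counts spaces by pulling characters
// one at a time through the hook.
std::size_t countSpaces(const CharHook& hook, std::size_t length) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < length; ++i) {
        if (hook(i) == U' ') ++count;
    }
    return count;
}
```

Note that this naive lookup walks the pieces on every call, so a real
hook would probably want to remember the last piece visited.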
II. Block Class: the block class splits text into runs and breaks it
into lines. As long as we are talking about the RE only doing shaping
and providing line-breaking info, the block as it stands does not
need access to the RE (the line-breaking info is passed down to the
block from the text runs).
III. Line class: from the rendering point of view, the line class is
responsible for BIDI reordering of its runs. For now, we do this by
directly accessing FriBidi from within the line class; this too
should be encapsulated into the RE API. However, we do not reorder
actual text, only the sequence of text runs, keeping runs
unidirectional. We want something like
RE::bidiReorder(const fp_Line * pLine);
Again, it would be ideal if the external RE did not require a text
string, but would allow us to provide an iterator over character
types, something like
FribidiCharType getNthCharType(uint n);
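
To illustrate what reordering runs rather than text means, here is a
sketch that takes one embedding level per run (in logical order) and
returns the visual order of the runs, following the reversal rule L2
of the Unicode Bidirectional Algorithm. The function name
visualRunOrder() is made up for this example, and it assumes the
per-run levels have already been resolved elsewhere (e.g., by
FriBidi); it is not the existing fp_Line code.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Given one embedding level per run (logical order), return the run
// indices in visual order. From the highest level down to the lowest
// odd level, every maximal sequence of runs at that level or higher
// is reversed.
std::vector<std::size_t> visualRunOrder(const std::vector<int>& levels) {
    std::vector<std::size_t> order(levels.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    if (levels.empty()) return order;

    const int highest = *std::max_element(levels.begin(), levels.end());
    int lowestOdd = -1;
    for (int l : levels) {
        if (l % 2 == 1 && (lowestOdd == -1 || l < lowestOdd)) lowestOdd = l;
    }
    if (lowestOdd == -1) return order;  // no RTL runs, nothing to do

    for (int level = highest; level >= lowestOdd; --level) {
        std::size_t i = 0;
        while (i < order.size()) {
            if (levels[order[i]] < level) {
                ++i;
                continue;
            }
            // Find the end of the sequence at `level` or higher.
            std::size_t j = i;
            while (j < order.size() && levels[order[j]] >= level) ++j;
            std::reverse(order.begin() + i, order.begin() + j);
            i = j;
        }
    }
    return order;
}
```

The text inside each run is untouched; only the sequence of runs
changes, which is why each run has to stay unidirectional.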
IV. Text Run class
a. finds suitable line break points within itself
b. is responsible for shaping (glyph replacement, ligatures ...)
c. is responsible for drawing
regarding (b) the Text Run instance caches the shaped string for
future use (i.e., to avoid unnecessary shaping)
So the run class would need something like
RE::findBreakPoint(const ucs4 *);
Again, we might find it useful if instead of a ucs4 string we could
provide an iterator.
RE::shape(...)
RE::draw()
This is how it currently works; we may want to, or have to, adjust the
above processing sequence to make using an external renderer easier.
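
As a rough sketch of the kind of abstract interface the run class
would talk to, assuming the three operations named above. The class
names RenderingEngine and PassThroughEngine, the parameter lists and
the return types are all assumptions made for this example; the real
signatures are exactly what still needs to be designed.

```cpp
#include <cstddef>
#include <string>

// Hypothetical abstract RE interface; layout code would only ever see
// this base class.
class RenderingEngine {
public:
    virtual ~RenderingEngine() = default;

    // Index of the last break opportunity at or before `limit`, or
    // std::u32string::npos if there is none.
    virtual std::size_t findBreakPoint(const std::u32string& text,
                                       std::size_t limit) const = 0;

    // Return the shaped form of `text`.
    virtual std::u32string shape(const std::u32string& text) const = 0;

    // Draw already-shaped text at the given position.
    virtual void draw(const std::u32string& shaped, int x, int y) = 0;
};

// Trivial illustrative engine: breaks at spaces, performs no shaping,
// and only counts draw calls instead of rendering anything.
class PassThroughEngine : public RenderingEngine {
public:
    int drawCalls = 0;

    std::size_t findBreakPoint(const std::u32string& text,
                               std::size_t limit) const override {
        // rfind searches backwards starting at `limit`.
        return text.rfind(U' ', limit);
    }

    std::u32string shape(const std::u32string& text) const override {
        return text;  // identity: no glyph replacement or ligatures
    }

    void draw(const std::u32string&, int, int) override {
        ++drawCalls;
    }
};
```

A Pango, Graphite or Uniscribe backend would then be one more
subclass, provided the engine lets these stages be called separately.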
B. Fallouts of the current processing
==============================
I. Unicode compliance.
------------------------------
a. There is a Unicode-compliance issue with doing BIDI reordering on
lines: the Unicode algorithm does reordering on paragraphs. That is
fine for reordering static text, but adds substantial overhead in a
word processor, particularly when the text is stored in a
non-sequential PT. However, if we continue doing reordering of runs
rather than text, it would be possible to move the BIDI processing
from fp_Line into fl_BlockLayout. Because of the processing overhead
I would only do this if it is either necessary to interface with
external REs, or if it can be shown that reordering lines produces a
different sequence than reordering blocks.
b. The Unicode shaping algorithm assumes line breaking is done
_before_ shaping, but this creates real problems in a WYSIWYG
application because the unshaped text cannot be measured for width;
we currently break after shaping.
II. Shaping limitations
-------------------------------
The shape & cache approach we currently use in fp_TextRun means that
we can only shape using glyphs with explicit Unicode codepoints.
There are languages, such as Syriac, where there is only one code
point per letter and the shaping depends on font technology
(OpenType, etc.), and for these it does not work. A partial solution
would be to cache glyph indices instead of chars and have graphics
methods for drawing with indices, but it would only make sense if we
interface with a shaper that can handle such advanced font
technologies.
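
A sketch of the glyph-index variant of shape & cache, under stated
assumptions: GlyphCachingRun, GlyphIndex and the Shaper callback are
hypothetical names for this example, not fp_TextRun or any real
shaper API. The run stores whatever indices the shaper returns and
reshapes only after the text has changed.

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical glyph index type; a real font backend defines its own.
using GlyphIndex = std::uint32_t;

// Hypothetical shaper callback: maps text to font glyph indices.
using Shaper = std::function<std::vector<GlyphIndex>(const std::u32string&)>;

// A run that caches glyph indices rather than shaped characters.
class GlyphCachingRun {
public:
    explicit GlyphCachingRun(Shaper shaper) : m_shaper(std::move(shaper)) {}

    // Changing the text invalidates the cached glyphs.
    void setText(std::u32string text) {
        m_text = std::move(text);
        m_dirty = true;
    }

    // Shape lazily; repeated calls reuse the cache until the text
    // changes again.
    const std::vector<GlyphIndex>& glyphs() {
        if (m_dirty) {
            m_glyphs = m_shaper(m_text);
            m_dirty = false;
        }
        return m_glyphs;
    }

private:
    Shaper m_shaper;
    std::u32string m_text;
    std::vector<GlyphIndex> m_glyphs;
    bool m_dirty = true;
};
```

The graphics class would then need a drawing method that accepts
glyph indices, since the cached data no longer corresponds to Unicode
code points.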
III. Interfacing with a RE
--------------------------------
As should be clear from the above, a RE provides some
functionality encapsulated in our graphics classes (drawing), and
some in our layout classes (breaking, shaping). If we are to retain
the current processing sequence, we need to be able to tap into the
break, itemize, shape and render components of the RE individually
from our XP code. While I reckon it would be easy to write a suitable
abstract encapsulation of the required API, it might be difficult to
actually implement it efficiently with a given RE (e.g., if the
shaper assumes intermediate data in a complex proprietary form, we
will need to continually convert our internal data into its format
and back). Also, the RE might not allow us to do only shaping, etc.,
because its internal algorithm is such that the various stages cannot
be separated (I got the impression that this could be the case with
the Graphite engine's little-by-little processing).
The other option is to change the internal processing to make it
easier to interface with a RE (and preferably get the RE designers to
meet us half way, e.g., with the iterator hooks suggested above);
after trying the other approach with Pango, I think this is the only
realistic way. The real question is to what extent this could be done
without a specific shaper in mind; this takes us back to the
portability question.
Tomas