Subject: Thoughts on text styles, Show Paragraphs, and internal representation
From: Jesper Skov (jskov@redhat.com)
Date: Sat Jun 24 2000 - 15:07:26 CDT
Hi there
This mail is due to some concerns I've had for some time now over the
internal representation of the formatter (Runs/Blocks).
Caveats
~~~~~~~
o I don't know enough of how the fmt and ptbl stuff interacts. There
may be bad assumptions of how things work/can be made to work.
o I very much believe in arguing by providing working patches,
but I live a life restricted to a mere 24 hours a day, so I need to
know there's at least a reasonable chance something will work (and
be accepted) before I want to spend time on it. So this is "just"
talk for now. RFC, more precisely. I'll appreciate constructive
comments, but stuff that just shoots this down (with arguments) is
_also_ useful :)
Motivation/background
~~~~~~~~~~~~~~~~~~~~~
OK, first my motivation so y'all know what's driving this. I believe
several issues are affected by this, which makes it more than a
trivial hack (and quite possibly beyond me):
o Code robustness - the work I did recently on the cursor location
code convinced me that the current mix of Runs which can contain
the point (cursor location) and Runs which cannot is a bad thing.
My changes may have fixed some problems, but it was very hard to
get the code to work as well as it does now (still not perfect,
actually). This means the code is fragile, and is bound to break
next time someone sneezes anywhere near it.
o Paul Rohr brought up the remaining implementation details of an old
POW, also pointing out some of the problems with zero-length Runs.
http://www.abisource.com/mailinglists/abiword-dev/00/June/0130.html
o Show Paragraphs (show codes?) [I'm not much of a WP user myself, so
this observation may not be correct]: if I enable show codes I
would expect to see a marker between Runs of different text
styles. E.g., <bold> or <font 20>. Currently, this information
lives in the Runs, not as explicit elements in a block, and thus
cannot easily be displayed. I feel that may become a problem (or
maybe I'm just a control freak, I don't know).
This introduces two limitations (that I can think of):
1) When typing we always inherit the text style from the text to
the left. So if entering text at the border between two styles,
and you want the style to the right, you need to manually change
it before you start to type.
I would love to be able to enable show codes, move the cursor
right over the formatting code(s) and start typing with the
style at the right. [I believe this was possible in WordPerfect
5, which is the last time I used a WP ;]
2) There is no way to distinguish between style changes inserted
automatically and those inserted explicitly by the user. The
latter type is when the user changes font size, for
example. The former could be when 'Insert Symbol' changes
font. This can cause problems, as described in Bug 903 (see
http://www.abisource.com/bugzilla/show_bug.cgi?id=903 )
If there were explicit formatting style codes in separate Runs
in the internal representation, a flag could tell whether the
cursor (and thus what's typed) should inherit the Run's style or
go back to a previous Run for that information.
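To make the flag idea in 2) a bit more concrete, here is a minimal sketch. All names (FmtRun, inheritOnTyping, styleForTyping) are made up for illustration and do not exist in the AbiWord sources; it only assumes that each formatting code records whether the user asked for it explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical type, for illustration only (not an AbiWord class).
struct FmtRun {
    std::string style;     // e.g. "font:Times" or "font:Symbol"
    bool inheritOnTyping;  // true: explicit user change,
                           // false: inserted automatically (e.g. Insert Symbol)
};

// Starting at the FMT Run with index `at`, walk backwards to the nearest
// FMT Run whose style newly typed text should inherit. Falls back to
// `defaultStyle` if only automatic style changes precede the cursor.
std::string styleForTyping(const std::vector<FmtRun>& fmts, std::size_t at,
                           const std::string& defaultStyle = "plain") {
    assert(at < fmts.size());
    for (std::size_t i = at + 1; i-- > 0;) {
        if (fmts[i].inheritOnTyping) {
            return fmts[i].style;
        }
    }
    return defaultStyle;
}
```

With this, text typed right after an automatically inserted Symbol-font code would pick up the last explicit style before it, which is roughly the behavior Bug 903 asks for.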
OK, that was fairly structured, I hope. Still reading? Now I'm going
to pour my brain out of one ear (the left): implementation thoughts,
random comments and whatnot. The ride may get a bit rough...
Random thought #1
~~~~~~~~~~~~~~~~~
I'd like a preference to switch between showing codes as magic
graphical characters which hardcore WP users may understand, and
descriptive strings which the rest of the world understands
(e.g. <newline>, <linebreak>, <EOD>, ...).
Avoiding zero-length content Runs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<see http://www.abisource.com/mailinglists/abiword-dev/00/June/0198.html>
All break Runs can contain point. Computed cursor position depends on
IP offset and hide/show codes mode:
hide codes: nothing to render, impossible to select, but still
possible to place cursor just before the break (i.e., at
the offset of the Run).
show codes: renders whatever is configured for that break. Can be
selected (and thus deleted). Cursor can move past
rendered graphics/text. In other words, the break is
editable like any other text (and the world is a better
place for some of us :)
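The two modes above could be sketched roughly as follows. This is only an illustration under my own assumptions: BreakRun, CodeMode and the helper functions are hypothetical names, and I assume a break occupies exactly one document position.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical names, for illustration only (not AbiWord classes).
enum class CodeMode { Hide, Show };

struct BreakRun {
    std::size_t offset;   // document offset of the break
    std::string shownAs;  // what is rendered in show mode, e.g. "<linebreak>"
};

// What gets drawn for the break: nothing when codes are hidden.
std::string renderedText(const BreakRun& run, CodeMode mode) {
    assert(!run.shownAs.empty());
    return mode == CodeMode::Show ? run.shownAs : std::string();
}

// A hidden break cannot be selected; a shown one behaves like text.
bool isSelectable(const BreakRun&, CodeMode mode) {
    return mode == CodeMode::Show;
}

// Caret positions the Run itself offers. The position just before the
// break is always valid; the position after it is only reachable
// through this Run when the code is rendered.
std::vector<std::size_t> caretStops(const BreakRun& run, CodeMode mode) {
    std::vector<std::size_t> stops{run.offset};
    if (mode == CodeMode::Show) {
        stops.push_back(run.offset + 1);
    }
    return stops;
}
```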
Avoiding zero-length formatting Runs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<only me to blame for this one>
Currently FMT Runs have a zero length. Changing this will:
o Allow display (and editing) of FMT codes [same hide/show codes
behavior as above]
o Avoid multiple Runs in a block having the same offset. It bothers
me, even though it may be benign.
o Remove the need for a fancy cursor location search algorithm (all
Runs can contain point, so things get _much_ simpler).
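As a rough illustration of why the lookup gets simpler: if every Run has a non-zero length and the Runs of a block are sorted and contiguous, each offset belongs to exactly one Run and a plain binary search finds it. The Run struct and findRunContaining below are hypothetical stand-ins, not the real fmt classes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a Run, for illustration only.
struct Run {
    std::size_t offset;  // first document position covered by the Run
    std::size_t length;  // assumed to be > 0 for every Run
};

// Returns the index of the Run covering `pos`, or runs.size() if no Run
// covers it. Assumes `runs` is sorted by offset and has no zero-length
// entries, so no two Runs share an offset.
std::size_t findRunContaining(const std::vector<Run>& runs, std::size_t pos) {
    // First Run that starts strictly after pos.
    auto it = std::upper_bound(
        runs.begin(), runs.end(), pos,
        [](std::size_t p, const Run& r) { return p < r.offset; });
    if (it == runs.begin()) {
        return runs.size();  // pos lies before the first Run
    }
    --it;  // last Run starting at or before pos
    assert(it->length > 0);
    if (pos < it->offset + it->length) {
        return static_cast<std::size_t>(it - runs.begin());
    }
    return runs.size();  // pos lies past the end of that Run
}
```

With zero-length Runs in the list, two Runs can start at the same offset and this kind of search no longer has a unique answer, which is where the special cases come from.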
Issues:
o How does/should this interact with the ptbl hierarchy? Is it
possible to keep this just in the fmt hierarchy? My understanding
is that it is (and this is the weakest link in my argument).
The FMT Runs need to be inserted at the proper places. This can be
handled by the block insert functions, if the info only lives in fmt
for the sake of (optional) displaying.
o As I sketch things, "<bold><font40>" will be atomic. The user will
not be able to delete "<bold>". Instead, the cursor should be
placed after the code (or the code should be selected) and the
appropriate (GUI) controls should be adjusted. Would that be
frustrating, I wonder?
Making it non-atomic, you have to handle insertion of text between
the formatting elements (i.e., create a new FMT, so you get
"<bold>hello world<font 40>"). Power users would probably expect
this behavior, but I don't think making the FMT atomic is
unreasonable.
o If non-atomic: Should stray FMT codes be left alone? Or always
cleaned up? I believe the latter is the right approach, since
they'll die when saving the document anyway.
[by this, I mean something like this: <bold><bold><font20><font40>X
which results in <bold><font40>X]
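The cleanup in the non-atomic case could look something like the sketch below. It assumes each FMT code can be split into a property and a value (e.g. "font" and "40"); FmtCode and collapseAdjacent are hypothetical names, and the rule "the last value of each property wins" is my reading of the example above.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical representation of one formatting code, for illustration.
struct FmtCode {
    std::string property;  // e.g. "bold" or "font"
    std::string value;     // e.g. "on" or "40"
};

// Collapse a sequence of adjacent FMT codes (no text between them):
// for each property only the last value survives, and properties keep
// the order in which they first appeared.
std::vector<FmtCode> collapseAdjacent(const std::vector<FmtCode>& codes) {
    std::vector<FmtCode> out;
    for (const FmtCode& code : codes) {
        assert(!code.property.empty());
        auto it = std::find_if(out.begin(), out.end(),
                               [&](const FmtCode& kept) {
                                   return kept.property == code.property;
                               });
        if (it != out.end()) {
            it->value = code.value;  // a later code overrides the earlier one
        } else {
            out.push_back(code);
        }
    }
    return out;
}
```

Feeding it the codes for <bold><bold><font20><font40> leaves just <bold><font40>.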
Added Bonus! (I think :)
~~~~~~~~~~~~
Instead of content Runs containing formatting information, they refer
to the relevant FMT Run.
Internal representation becomes a little leaner (from a quick naive
look, I think these get replaced with a single pointer: m_fPosition,
m_pFont, m_pFontLayout, m_iAscent, m_iDescent, m_iHeight,
m_iAscentLayoutUnits, m_iDescentLayoutUnits, m_iHeightLayoutUnits)
And juggling the Runs becomes a bit cheaper CPU-wise, and
lookupProperties only needs doing once for an "ideal" Run, regardless
of whether it has been properly coalesced or not.
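A minimal sketch of the shared-formatting idea, with heavy simplification: the FmtRun and ContentRun structs below are hypothetical, the metric values are dummies, and lookupProperties here is only a stand-in for the real function of that name (whose signature I am not reproducing). The point is just that several content Runs share one pointer and the lookup runs once.

```cpp
#include <cassert>
#include <string>

// Hypothetical FMT Run owning the resolved formatting; illustration only.
struct FmtRun {
    std::string fontName;
    int ascent = 0;
    int descent = 0;
    int height = 0;
    bool resolved = false;
    int lookupCount = 0;  // only here to show how often the lookup runs

    // Stand-in for the expensive property lookup; values are dummies.
    void lookupProperties() {
        if (resolved) {
            return;
        }
        ascent = 12;
        descent = 4;
        height = ascent + descent;
        resolved = true;
        ++lookupCount;
    }
};

// Content Run holding a single pointer instead of its own copies of
// the font and metric members.
struct ContentRun {
    std::string text;
    FmtRun* fmt;

    int height() const {
        assert(fmt != nullptr);
        fmt->lookupProperties();  // no-op after the first call
        return fmt->height;
    }
};
```

Two content Runs that were never coalesced but point at the same FmtRun would then trigger only one lookup between them.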
Cons
~~~~
o I don't know if this holds water. I need people with a better
understanding of abiword (and WP in general) to think it through.
And other people shouldn't get their hopes up - I suspect this baby
leaks water like a waterfall.
o It'll take time to implement. Possibly more than is available
before 1.0.
Pros
~~~~
o Show _codes_ would work. Not just show _paragraphs_.
o There would be no magic zero-length Runs, nor any Runs which cannot
contain the point. Thus the now fragile cursor position computation
code would become simpler and more robust.
o Show paragraphs POW would be completed.
o Makes it possible to have an "atomic" combination of FMT and
content Runs such as used by 'Insert Symbol' without messing up the
text typed in by the user.
Hope I got at least a few brain cells firing maniacally :)
Cheers,
Jesper