Abiword internal representations, paragraph boundaries and pf_frag / strux

From: Ben Martin <monkeyiq_at_users.sourceforge.net>
Date: Tue May 31 2011 - 03:07:49 CEST

Hi,

While digging into change tracking I ended up looking at the internal
document model of abiword again, and thought I'd send this somewhat
terse message to the list in case it is of interest to new abi
hackers. Conversely, if there are any gross mistakes below, please
feel free to point them out to me :)

The below is skewed toward my goal at hand: explicitly tracking the
start and end of a paragraph element, which translates to knowing when
a PTX_Block has its beginning and/or end deleted so that it merges
with the previous or subsequent content.

Much of my understanding has been put together by RTFC. For
pf_frag and strux layout, one might consider
ODe_AbiDocListener::populateStrux(), which switches on the
getStruxType() of the PX_ChangeRecord_Strux and simulates an end of
block using _closeBlock() when a PTX element outside the set Z is
encountered.

Z = {
  PTX_SectionFootnote,
  PTX_SectionEndnote,
  PTX_SectionAnnotation
};

In particular, when a PTX_Block is encountered, the old block, if
open, is closed first. Concretely, it appears that a "paragraph" can
be considered to be the document content from a PFT_Strux/PTX_Block
marker up to the next PFT_Strux/{all PTX - set Z} marker.
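
As a rough illustration of that rule, the sketch below models a flat
fragment list in Python instead of using the real AbiWord classes. The
Frag tuple and split_paragraphs() are hypothetical names invented for
this example; only the PFT_* / PTX_* names and the set Z come from the
text above. Nested blocks (for instance the paragraph inside a
footnote) are not modelled.

```python
from typing import List, NamedTuple, Optional

# Strux types that do NOT close the currently open block (the set Z above).
Z = {"PTX_SectionFootnote", "PTX_SectionEndnote", "PTX_SectionAnnotation"}


class Frag(NamedTuple):
    """Hypothetical stand-in for a pf_frag, not the AbiWord class."""
    frag_type: str                    # "PFT_Strux" or "PFT_Text"
    strux_type: Optional[str] = None  # only meaningful for PFT_Strux
    text: str = ""                    # only meaningful for PFT_Text


def split_paragraphs(frags: List[Frag]) -> List[str]:
    """Group text into paragraphs: a PTX_Block opens one, and any strux
    outside Z closes whichever block is currently open."""
    paragraphs: List[str] = []
    current: Optional[List[str]] = None  # None means no block is open

    for frag in frags:
        if frag.frag_type == "PFT_Strux":
            if frag.strux_type not in Z and current is not None:
                # Simulated end of block, in the spirit of _closeBlock().
                paragraphs.append("".join(current))
                current = None
            if frag.strux_type == "PTX_Block":
                current = []
        elif frag.frag_type == "PFT_Text" and current is not None:
            current.append(frag.text)

    if current is not None:  # close a block still open at end of input
        paragraphs.append("".join(current))
    return paragraphs


example = [
    Frag("PFT_Strux", "PTX_Section"),
    Frag("PFT_Strux", "PTX_Block"),
    Frag("PFT_Text", text="para1"),
    Frag("PFT_Strux", "PTX_Block"),
    Frag("PFT_Text", text="para"),
    Frag("PFT_Text", text="2"),
    Frag("PFT_Strux", "PTX_SectionTable"),
]
print(split_paragraphs(example))
```

Note that nothing in the fragment stream marks the end of "para2"
explicitly; it is the following PTX_SectionTable strux that closes it.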

The following trace might be of interest. It was taken during a
deletion, so the same position is encountered many times: as content
is deleted, later content moves up in the document. The abw file
fragment follows the trace, which was made by deleting from the "r" in
para1 through to the "r" in para3 inclusive.

Pos 5 a pf_frag of length 4, a fragment type PFT_Text.
Pos 7 a pf_frag of length 1, type PFT_Strux
                        strux type PTX_Block
Pos 8 a pf_frag of length 1, a fragment type PFT_Text.
Pos 9 a pf_frag of length 3, a fragment type PFT_Text.
Pos 12 a pf_frag of length 1, a fragment type PFT_Text.
Pos 13 a pf_frag of length 1, type PFT_Strux
                        strux type PTX_Block
Pos 13 a pf_frag of length 1, type PFT_Strux
                        strux type PTX_SectionTable
Pos 13 a pf_frag of length 1, type PFT_Strux
                        strux type PTX_Block
Pos 13 a pf_frag of length 7, type PFT_Text
Pos 13 a pf_frag of length 1, type PFT_Strux
                        strux type PTX_Block
Pos 14 a pf_frag of length 2, type PFT_Text
Pos 16 a pf_frag of length 3, type PFT_Text
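
The repeated "Pos 13" lines above are what one would expect when
fragments are removed one at a time and everything after them slides
up. The toy model below shows only that effect. The function names and
the fragment lengths are made up for the example and are not taken from
the piece table code.

```python
from typing import List, Tuple


def positions(lengths: List[int], start: int = 0) -> List[int]:
    """Return the document position at which each fragment begins."""
    out = []
    pos = start
    for n in lengths:
        out.append(pos)
        pos += n
    return out


def delete_and_trace(lengths: List[int], first: int, count: int) -> List[Tuple[int, int]]:
    """Delete `count` fragments starting at index `first`, one at a time,
    recording (position, length) of each fragment as it is removed."""
    remaining = list(lengths)
    trace = []
    for _ in range(count):
        pos = positions(remaining)[first]
        trace.append((pos, remaining[first]))
        del remaining[first]  # later fragments now move up to `pos`
    return trace


# Five fragments of lengths 2, 1, 7, 1, 3; delete the middle three.
for pos, length in delete_and_trace([2, 1, 7, 1, 3], first=1, count=3):
    print(f"Pos {pos} a frag of length {length}")
```

Each deleted fragment is reported at the same position, because its
predecessor has already gone by the time it is visited.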

The simplified abw fragment:

<section xid="4">
<p style="Normal" xid="5">
  <c revision="1">p</c>
  <c revision="1"/>
  <c revision="1,!2{font-weight:bold}{author:0}">ara1</c>
</p>
<p revision="1" style="Normal" xid="1">
  <c revision="1">p</c>
  <c revision="1"/>
  <c revision="1,!2{font-style:italic}{author:0}">ara</c>
  <c revision="1">2</c>
</p>
<p revision="4" style="Normal" xid="7">
  <c revision="1"/>
</p>
<table revision="4" xid="8">
  <cell revision="4" xid="9">
    <p style="Normal" revision="4" xid="10"><c revision="4">r1c1</c></p>
  </cell>
  <cell revision="4" xid="12">
    <p style="Normal" revision="4" xid="13"><c/></p>
  </cell>
  <cell revision="4" xid="15">
    <p style="Normal" revision="4" xid="16"><c/></p>
  </cell>
  <cell revision="4" xid="18">
    <p style="Normal" revision="4" xid="19"><c revision="4">r2c2</c></p>
  </cell>
</table>
<p revision="4" style="Normal" xid="6">
  <c revision="4">para2.B</c>
</p>
<p revision="1" style="Normal" xid="2">
  <c revision="1,!2{font-weight:bold}{author:0}">pa</c>
  <c revision="1,!2{font-weight:bold}{author:0}"/>
  <c revision="1">ra3</c>
</p>
</section>
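
To compare the fragment with the trace, it can help to list the text
runs of each paragraph. The sketch below does that with Python's
standard xml.etree module on a shortened copy of the fragment above.
paragraph_runs() is a name invented for this example, and a real .abw
file declares an XML namespace that this simplified fragment omits, so
the tag lookups would need adjusting there.

```python
import xml.etree.ElementTree as ET

# First two paragraphs of the simplified fragment, copied from above.
ABW_FRAGMENT = """
<section xid="4">
<p style="Normal" xid="5">
  <c revision="1">p</c>
  <c revision="1"/>
  <c revision="1,!2{font-weight:bold}{author:0}">ara1</c>
</p>
<p revision="1" style="Normal" xid="1">
  <c revision="1">p</c>
  <c revision="1"/>
  <c revision="1,!2{font-style:italic}{author:0}">ara</c>
  <c revision="1">2</c>
</p>
</section>
"""


def paragraph_runs(xml_text):
    """Return [(xid, [run texts])] for every <p>, skipping empty <c/> runs."""
    root = ET.fromstring(xml_text)
    result = []
    for p in root.iter("p"):
        runs = [c.text for c in p.iter("c") if c.text]
        result.append((p.get("xid"), runs))
    return result


for xid, runs in paragraph_runs(ABW_FRAGMENT):
    print(xid, runs, [len(r) for r in runs])
```

For the paragraph with xid="1" the run lengths come out as 1, 3, 1,
which looks like the three PFT_Text fragments at Pos 8, 9 and 12 in the
trace, although the trace itself does not say which paragraph they
belong to.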

Received on Tue May 31 03:08:03 2011
