Re: Speeding up AbiWord for very large documents

From: Martin Sevior <msevior_at_gmail.com>
Date: Wed Mar 21 2012 - 04:58:25 CET

Hi Diana,

To get a quick understanding of how AbiWord works, have a read of this
document from page 8 onwards:

http://www.abisource.com/papers/guadec4/guadec-4.pdf

Then read this part of our wiki about the large document issues

http://www.abisource.com/wiki/AbiWordDevelopment

The FL_DocLayout class holds a description of the document as
rendered on the screen. There can be more than one of these per
document, which is why we can have several windows onto the same
document open at any time.
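
As a rough illustration of "several layouts per document", here is a
minimal C++ sketch. Doc and DocLayout are made-up stand-ins, not the real
AbiWord classes, and the fixed characters-per-line wrapping is only an
assumption to keep the example small.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the document contents (not a real AbiWord class).
struct Doc {
    std::vector<std::string> paragraphs;
};

// Hypothetical stand-in for FL_DocLayout: one rendering of a Doc for one window.
class DocLayout {
public:
    DocLayout(const Doc& doc, std::size_t charsPerLine)
        : m_doc(doc), m_charsPerLine(charsPerLine) {
        assert(charsPerLine > 0);
    }

    // Number of screen lines this particular view needs for the shared document.
    std::size_t lineCount() const {
        std::size_t total = 0;
        for (const std::string& p : m_doc.paragraphs) {
            // An empty paragraph still occupies one line.
            total += p.empty() ? 1 : (p.size() + m_charsPerLine - 1) / m_charsPerLine;
        }
        return total;
    }

private:
    const Doc& m_doc;            // shared, not owned: several layouts may point here
    std::size_t m_charsPerLine;  // width of this view
};
```

Two DocLayout objects built on the same Doc with different widths report
different line counts, and an edit to the Doc is seen by both of them.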

All of the large-document slowness bugs have been fixed except the one
in "breakSection". breakSection's job is to determine where to place
a particular line. Whenever you create or delete a line in your
document, breakSection is called to work out where to place the
lines.

So if you're at the start of a 1000-page document and press return, you
create a new line in the document and the positions of all lines after
this line in the document change. breakSection iterates through the
entire document working out where these lines need to be placed. This
is a process whose running time is linear in the document size.
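
To make the cost concrete, here is a minimal C++ sketch of that kind of
pass. It is not the real breakSection code: Line and breakFrom are
hypothetical names, and the model (fixed page height, lines stacked top
to bottom) is a simplifying assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a laid-out line; not the real fp_Line.
struct Line {
    int height;  // height of the line in layout units
    int page;    // page the line has been placed on
    int y;       // vertical offset of the line within that page
};

// Place every line from index `first` onwards, filling pages top to bottom.
// Returns the number of lines visited, which is the cost being discussed.
std::size_t breakFrom(std::vector<Line>& lines, std::size_t first, int pageHeight) {
    assert(pageHeight > 0);
    assert(first <= lines.size());
    int page = 0;
    int y = 0;
    if (first > 0) {
        // Resume directly below the previous, already placed line.
        const Line& prev = lines[first - 1];
        page = prev.page;
        y = prev.y + prev.height;
    }
    std::size_t visited = 0;
    for (std::size_t i = first; i < lines.size(); ++i) {
        // If the line does not fit on a page that already has content,
        // start a new page.
        if (y > 0 && y + lines[i].height > pageHeight) {
            ++page;
            y = 0;
        }
        lines[i].page = page;
        lines[i].y = y;
        y += lines[i].height;
        ++visited;
    }
    return visited;
}
```

With this model, inserting a line at index 0 forces a visit to every line
in the document, while inserting at the very end visits only one.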

The project is to either invent a better algorithm that speeds up
this process, or to see if we can do "just-in-time" layout so that
AbiWord only lays out a page when we need to view it, or to see if we
can put breakSection into the background so the user doesn't notice it
working to lay everything out. Right now breakSection blocks user input
until it has finished updating the entire document.
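
One possible shape for the last two options, as a minimal C++ sketch under
the same simplified model (fixed page height, lines stacked top to
bottom). LineBox, LazyBreaker, ensurePage and step are all hypothetical
names, not existing AbiWord code. The idea is to remember the first line
that still needs placing, lay out only as far as the visible page on
demand, and do the rest in small bounded steps that a caller could run
from an idle handler between user events.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a laid-out line; not the real fp_Line.
struct LineBox {
    int height;  // height of the line in layout units
    int page;    // page the line has been placed on
    int y;       // vertical offset of the line within that page
};

// Hypothetical resumable line breaker. The caller must call invalidateFrom()
// after every edit to the vector of lines.
class LazyBreaker {
public:
    LazyBreaker(std::vector<LineBox>& lines, int pageHeight)
        : m_lines(lines), m_pageHeight(pageHeight), m_next(0) {
        assert(pageHeight > 0);
    }

    // Mark every line from `index` onwards as needing layout again.
    void invalidateFrom(std::size_t index) {
        if (index < m_next) {
            m_next = index;
        }
    }

    // Just-in-time: lay out only as far as needed to display `page` fully.
    void ensurePage(int page) {
        while (m_next < m_lines.size() &&
               (m_next == 0 || m_lines[m_next - 1].page <= page)) {
            placeNext();
        }
    }

    // Background: place at most `budget` lines, then return so that user
    // input can be handled. Returns true once the whole document is laid out.
    bool step(std::size_t budget) {
        for (std::size_t i = 0; i < budget && m_next < m_lines.size(); ++i) {
            placeNext();
        }
        return m_next == m_lines.size();
    }

    // Number of lines still waiting to be placed.
    std::size_t dirtyCount() const { return m_lines.size() - m_next; }

private:
    // Place the single line at m_next directly below its predecessor.
    void placeNext() {
        int page = 0;
        int y = 0;
        if (m_next > 0) {
            const LineBox& prev = m_lines[m_next - 1];
            page = prev.page;
            y = prev.y + prev.height;
        }
        LineBox& cur = m_lines[m_next];
        if (y > 0 && y + cur.height > m_pageHeight) {
            ++page;  // does not fit on the current page: start a new one
            y = 0;
        }
        cur.page = page;
        cur.y = y;
        ++m_next;
    }

    std::vector<LineBox>& m_lines;  // shared with the caller, not owned
    int m_pageHeight;
    std::size_t m_next;             // index of the first line not yet placed
};
```

The total work is still linear in the document size. What changes is that
a keystroke only pays for the page on screen, and the remainder is spread
over many short steps instead of one long blocking pass.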

Cheers

Martin

On Wed, Mar 21, 2012 at 10:57 AM, Diana Maria Prajescu
<diana.prajescu@gmail.com> wrote:
>
> Hello,
>
> I have started reading about AbiWord and looking through the code in
> order to understand how it's structured and I am a bit confused
> regarding some things. Please correct me if I'm wrong.
>
> A document is represented by the FL_DocLayout class and this class is
> the root of two hierarchies: one corresponds to the document's content
> and one corresponds to the document's layout. The document's content
> is a list of sections (fl_SectionLayout), each of which contains one or
> more blocks (fl_BlockLayout). For the layout hierarchy a document has one
> or more pages (fp_Page), a page has one or more columns (fp_Column), a
> column has one or more lines (fp_Line), a line contains one or more
> runs (fp_Run).
>
> The _breakSection() function iterates through sections and assembles
> the document's content into pages, columns etc., so when a line is
> modified, the section containing that line must be iterated again,
> beginning at the line's position. What determines a section's size?
> Is it possible that when modifying the layout for a section the next
> section's layout must be modified too?
>
> In the project idea's description it says that we can either speed up
> _breakSection() or run it in the background, making only onscreen
> changes. In my opinion, making only the onscreen changes immediately and
> the others in the background will bring a greater increase in speed.
>
> --
> Regards,
> Diana Prajescu
Received on Wed Mar 21 04:58:40 2012
