Improving AbiWord's PT

From: Martin Sevior <msevior_at_gmail.com>
Date: Wed Jun 03 2009 - 03:26:21 CEST

Hi everyone and especially Joaquin!

(PS. congrats on your current gig with Google!)

                  Our GSoC student, Aditya Manthramurthy, has been
making very impressive progress on his project to improve AbiWord's
table layout algorithm. In fact he has identified and fixed the origin
of our n^2 growth of the table layout code. However in his
investigations he found that the layout code is actually a trivial
component of the amount of time it takes to insert a large table.

He then profiled our code discovered that the time is dominated by
"insertStrux(...)

------------------
Aditya Manthramurthy wrote:
> Hi Martin,
>
> I have isolated that most of the time taken seems to be in calls to various insertStrux funcs in for loop starting at src/text/fmt/xp/fv_View_cmd.cpp - line 3305. All the work to layout the table happens after this for loop. This means that the PieceTable needs to sped up. Not knowing much about the piece table makes it harder to profile that part of the code more (using just valgrind), without setting up clock_gettime()s between bits of code. So I am not really sure about how to go on with this.
>

Digging in deeper using valgrind tells that
pt_PieceTable::_realInsertStrux() takes up most of the insert table
time.

The screenshot below, shows the callees of the above function in the
bottom part of the right side. The list is sorted in descending order
of cost. [Here, cost seems to be proportional to time taken, afaik.]

<screenshot supressed>

The above dump is from a run of my branch of abiword, with an insert
of a 400x4 table (it took 20-23 secs with valgrind!). The problem is
very much out of the table layout code I guess. How do we go about
improving the piece table?

------------------------------------

Here is my response to Aditya:
 --------------------------------------------------------------
HI Aditya,
             OK, the AbiWord PieceTable has a well known n dependence
for insertions and deletions. I wrote this up on our wiki. You can
find it here.

http://www.abisource.com/wiki/AbiWordDevelopment#PieceTable_speedups.

I think we're hitting the issues described by Joaquin.

http://e98cuenc.free.fr/wordprocessor/piecetable.html

Every insertStrux, means we recalculate the offset vector of the
entire PiecetTable. Since the size of this vector increases with the
size of the table,
it means we get a n^2 dependence of the total insertion time.

Look at the code in:

src/text/ptbl/xp/pf_Fragments.cpp::cleanFrags()

That is being called on every insertion. Note also that
vecFrags grows with every insertion.

So I can see two ways to go forward. We can work around not actually
calling cleanFrags on every insertion or we can attempt to implement
the
more sophisticated PieceTable described by Joaquin.

I would rather attempt to do the more sophisticated PieceTable. It
would substantially improve AbiWord's performance for large documents.

BTW I'm going to forward this discussion to the abiword ML.

This is very interesting work Aditya!

Cheers

Martin

----------------------------------------------------------------------------------

So there it is folks. Maybe the time has come to attempt Joaquin's PieceTable.

Cheers!

Martin
Received on Wed Jun 3 03:26:40 2009

This archive was generated by hypermail 2.1.8 : Wed Jun 03 2009 - 03:26:40 CEST