More 5291 speedups

From: Johnny Lee (typo_pl@hotmail.com)
Date: Fri Oct 03 2003 - 01:25:47 EDT

  • Next message: Martin Sevior: "Re: More 5291 speedups"

    Summary:

    There are several speedups here and some size reduction with these changes.

    Import and building the RTF doc has been reduced from 23 seconds to ~15
    seconds total.

    Some of the changes may not be acceptable since I'm concentrating on
    speeding up loading the RTF spec and I haven't been able to determine if
    some of my assumptions are valid.
    I'm just proposing a solution. It may not be the best solution since I
    barely know the code and the coding style. I did increase the size of
    fp_Container objects by 4 bytes and the size of UT_Stringbuf objects by 4
    bytes. Feel free to code up a different solution.

    The code for change #3 is in the attached bug5291e.txt.
    The code for change #1, #2, and #4 is in bug5291f.txt.

    Change #1
    ==============
    After profiling Abiword with all the previous 5291 patches, the function
    listed at the top was fp_TableContainer::getCellAtRowColumn.

    Looking at the code for this function, there's a linear search thru the
    cells in a table to find the cell in the appropriate spot.

    When I dumped the position portion of the cell objects, they turned up in
    sorted order, from top to bottom, left to right.

    *** If this assumption is not correct, then the patch will probably fail to
    work correctly. ***

    Using that sorted array, I replaced the linear search with a binary search.

    This saved about 4 seconds on building the document.

    Change #2
    ==============
    Profiling the new version, the top function listed was
    fp_Container::clearBrokenContainers. This is a recursive function which is
    called millions of times while building the document.

    It took me a long time to figure out the function of 'broken containers'.

    Robert Wilhelm had a patch so that clearBrokenContainers would only write to
    the m_pMyBrokenContainer field if it was non-NULL.

    I added code to see how often the code NULLed out this field.

    Less then 20 times for the RTF spec! Millions of calls to NULL a variable
    less than 20 times.

    The reason it has to do this recursively is that the parent container
    doesn't know anything about broken containers in its children.

    And how many broken containers are there?

    fp_Container::setMyBrokenContainer is only called about 120 times.

    So I added a field to fp_Container which represents the number of broken
    containers in the current container and any children.

    When fp_Container::setMyBrokenContainer is called, the code goes from the
    current container up to its parent, grandparent, etc. and increments the
    count of broken containers in each of them.

    When fp_Container::clearBrokenContainers is called, if m_pMyBrokenContainer
    is cleared, the broken--container-count is decremented in the current
    container, and in its parent and grandparent, etc.

    When it loops through the children containers, the code checks their
    broken-container-count. If it is zero, then we don't bother calling
    clearBrokenContainers on them.

    This saves a couple of seconds building the RTF spec.

    Change #3
    ==============
    Profiling again, showed the following functions near the top:

    operator delete
    UT_String::UT_String
    UT_StringPtrMap::find_slot
    UT_StringPtrMap::pick

    Looking at the code, UT_StringPtrMap::pick creates a UT_String object which
    allocates space for the string, copies the string, and gets passed to
    UT_StringPtrMap::find_slot. Afterwards, the UT_String is destroyed which
    includes freeing the allocated string.

    I modified UT_Stringbuf and UT_String so that you can pass in a
    null-terminated string and no memory is allocated for the string, the code
    just references the string passed in - ideal when used for a read-only
    string, such as checking if something is in a hash table.

    My code changes add a few checks to the base string classes so that you
    can't modify the external string. You may want to come up with another
    solution that doesn't involve such extensive changes in the class. The idea
    is to get rid of the memory allocation/deallocation, etc. when the string is
    only examined. There are other places in ut_hash.cpp where this occurs as
    well.

    Change #4
    ==============
    Finally, a patch that deals more with reducing size than speed. I noticed in
    fp_TableContainer.cpp that there's lotsa code similar to this:

             getNthRow(i)->requisition = 0;
             getNthRow(i)->allocation = 0;
             getNthRow(i)->spacing = m_iRowSpacing;
             getNthRow(i)->need_expand = 0;
             getNthRow(i)->need_shrink = 0;
             getNthRow(i)->expand = 0;
             getNthRow(i)->shrink = 0;

    I looked at the disassembly of the code. The compiler didn't realize that
    the result from getNthRow(i) was the same for every line in this block of
    code. So the compiler generated code which would calculate the result for
    each statement.

    I created a variable and assigned the result from getNthRow/getNthCol to the
    variable and used the variable instead.

    This saved about 16 bytes each time with VC6.

    _________________________________________________________________
    Instant message during games with MSN Messenger 6.0. Download it now FREE!
    http://msnmessenger-download.com







    This archive was generated by hypermail 2.1.4 : Fri Oct 03 2003 - 01:41:20 EDT