Re: PATCH: Next 5291 speedup

From: Dom Lachowicz (domlachowicz@yahoo.com)
Date: Thu Oct 02 2003 - 09:14:00 EDT

  • Next message: msevior@physics.unimelb.edu.au: "Re: PATCH: Next 5291 speedup"

    With some minor changes, this patch has been
    committed.

    Nice work :)

    Dom

    --- Johnny Lee <typo_pl@hotmail.com> wrote:
    > Nice find. Your patch knocked import time from ~22
    > seconds to ~15 seconds on
    > my box.
    >
    > I've got 2 patches which drop import time by another
    > ~4 seconds altogether.
    >
    > Import time for the RTF spec is ~11 seconds now,
    > followed by many seconds
    > doing "Building document...".
    >
    > First patch, bug5291a.txt, knocks ~2 seconds from
    > the import time.
    >
    > Changes:
    >
    > - Building upon Robert's changes, I changed the code
    > from doing a linear
    > search to do a binary search in
    > UT_Vector::addItemSorted. The binary search
    > will return the spot where the new item should be
    > inserted.
    >
    > - The original linear search had a bug which would
    > not insert an item after
    > the only item in the vector (the vector always has a
    > default empty AP with
    > checlsum 0).
    >
    > - Made a common private method for doing binary
    > searches in UT_Vector since
    > the binarySearch method uses almost the same code.
    >
    > - Changed the createAP methods in pp_TableAttrProp
    > from sorting the vector
    > after each add to using the addItemSorted method.
    > Ditched the sortTable
    > method since code doesn't use it anymore.
    >
    >
    > Second patch, bug5291b.txt, comes from looking at
    > the profile output. I
    > noticed the following:
    >
    > Module Statistics for abiword.exe
    > ---------------------------------
    > Time in module: 15302.935 millisecond
    > Percent of time in module: 100.0%
    > Functions in module: 11530
    > Hits in module: 39325065
    > Module function coverage: 7.1%
    >
    > Func Func+Child Hit
    > Time % Time % Count
    > Function
    >
    ---------------------------------------------------------
    > 0.018 0.0 15302.866 100.0 1
    > PD_Document::readFromFile
    > 0.200 0.0 15251.752 99.7 1
    > IE_Imp_RTF::importFile
    > 0.011 0.0 15217.363 99.4 4
    > IE_Imp_RTF::_parseFile
    > 141.625 0.9 15216.265 99.4 105
    > IE_Imp_RTF::_parseText
    > 35.198 0.2 13122.018 85.7 90402
    > IE_Imp_RTF::ParseRTFKeyword
    > 105.277 0.7 12748.765 83.3 90403
    > IE_Imp_RTF::TranslateKeyword
    >
    VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV
    > 2.502 0.0 5296.926 34.6 1
    > IE_Imp_RTF::HandleStyleDefinition
    > 1.599 0.0 5276.492 34.5 101
    > PD_Document::appendStyle
    > 4.997 0.0 5250.106 34.3 8282
    > pt_PieceTable::enumStyles
    > 766.147 5.0 4826.000 31.5 11406
    > UT_Vector::qsort
    > 1468.353 9.6 4059.150 26.5 5816405
    > compareStyleNames
    >
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    > 31.705 0.2 3860.252 25.2 53563
    > IE_Imp_RTF::FlushStoredChars
    >
    > compareStyleNames is getting called a lot, along
    > with several other
    > style-related functions, including
    > pt_PieceTable::enumStyles.
    >
    > Tracking down the code revealed the source as
    > abi\src\text\ptbl\xp\pt_PT_Styles.cpp:
    >
    > //
    > // Diagonostic on Append...
    > //
    > const PD_Style * pdStyle = NULL;
    > const char * psdName =NULL;
    > UT_uint32 i = 0;
    > for(i=0; i<getStyleCount(); i++)
    > {
    > enumStyles(i,&psdName,&pdStyle);
    > xxx_UT_DEBUGMSG(("SEVIOR: Found %d style name %s
    > \n",i,psdName));
    > }
    >
    >
    > I wrapped this code in a #ifdef DEBUG and import
    > time dropped by another 2
    > seconds.
    >
    > J
    >
    >
    >
    >
    >
    > >From: Robert Wilhelm <robert.wilhelm@gmx.net>
    > >To: AbiWord <abiword-dev@abisource.com>
    > >Subject: PATCH: Next 5291 speedup
    > >Date: Tue, 30 Sep 2003 20:42:38 +0200
    > >
    > >After Johnny Lees cool checksum and binary search
    > patch, I profiled
    > >abiword again and we spent lot of time in qsort.
    > See following call
    > >tree:
    > >
    > >percent num
    > >cumulative calls
    > >28.15 21952 addIfUniqueAP
    > >27.46 5120 addAP
    > > qsort
    > >12.80 72M compareAP
    > >4.34 149M ppAttrProp:GetCheckSum
    > >
    > >As m_vecTableSorted is already sorted I changed
    > addAP to
    > >just use a linear search and insert the new AP at
    > the right place.
    > >Now addIfUniqueAP does no longer show up on the
    > profile radar,
    > >and the time for importing the RTF spec decreased
    > from 55s to 49s on my
    > >machine.
    > >
    > >Robert
    > ><< AP.patch >>
    >
    >
    _________________________________________________________________
    > Instant message with integrated webcam using MSN
    > Messenger 6.0. Try it now
    > FREE! http://msnmessenger-download.com
    > > diff -u af/util/xp/new/orig1/ut_vector.cpp
    > af/util/xp/ut_vector.cpp
    > --- af/util/xp/new/orig1/ut_vector.cpp Wed Oct 01
    > 06:01:53 2003
    > +++ af/util/xp/ut_vector.cpp Wed Oct 01 05:52:05
    > 2003
    > @@ -126,21 +126,11 @@
    >
    > UT_sint32 UT_Vector::addItemSorted(const void* p,
    > int (*compar)(const void
    > *, const void *))
    > {
    > -
    > if (!m_iCount) return addItem(p);
    >
    > - UT_uint32 i = 0;
    > - int icmp;
    > -
    > - icmp = compar(&p, &m_pEntries[i]);
    > -
    > - while ( icmp > 0 && i < m_iCount - 1)
    > - {
    > - i++;
    > - icmp = compar(&p, &m_pEntries[i]);
    > - }
    > -
    > - return insertItemAt( p, i );
    > + UT_sint32 slot = binarysearchForSlot(&p, compar);
    > +
    > + return insertItemAt( p, slot );
    > }
    >
    > UT_sint32 UT_Vector::addItem(const void* p,
    > UT_uint32 * pIndex)
    > @@ -262,6 +252,16 @@
    >
    > UT_uint32 UT_Vector::binarysearch(void * key, int
    > (*compar)(const void *,
    > const void *))
    > {
    > + UT_sint32 slot = binarysearchForSlot(key, compar);
    > +
    > + if ((slot == m_iCount) || (0 != (*compar)(key,
    > &m_pEntries[slot])))
    > + return -1;
    > + else
    > + return slot;
    > +}
    > +
    > +UT_uint32 UT_Vector::binarysearchForSlot(void *key,
    > int (*compar)(const
    > void *, const void *))
    > +{
    > UT_sint32 high = m_iCount;
    > UT_sint32 low = -1;
    > UT_sint32 probe;
    > @@ -277,10 +277,7 @@
    > high = probe;
    > }
    >
    > - if ((high == m_iCount) || (0 != (*compar)(key,
    > &m_pEntries[high])))
    > - return -1;
    > - else
    > - return high;
    > + return high;
    > }
    >
    > bool UT_Vector::copy(const UT_Vector *pVec)
    > diff -u af/util/xp/new/orig1/ut_vector.h
    > af/util/xp/ut_vector.h
    > --- af/util/xp/new/orig1/ut_vector.h Wed Oct 01
    > 06:02:47 2003
    > +++ af/util/xp/ut_vector.h Wed Oct 01 04:54:19 2003
    > @@ -121,6 +121,7 @@
    >
    > private:
    > UT_sint32 grow(UT_uint32);
    > + UT_uint32 binarysearchForSlot(void *key, int
    > (*compar)(const void *,
    > const void *));
    >
    > void** m_pEntries;
    > UT_uint32 m_iCount;
    > diff -u text/ptbl/xp/new/orig1/pp_TableAttrProp.cpp
    > text/ptbl/xp/pp_TableAttrProp.cpp
    > --- text/ptbl/xp/new/orig1/pp_TableAttrProp.cpp Wed
    > Oct 01 06:04:59 2003
    > +++ text/ptbl/xp/pp_TableAttrProp.cpp Wed Oct 01
    > 06:51:12 2003
    > @@ -88,7 +88,6 @@
    > pAP->setIndex(u); //$HACK
    > result =
    > (m_vecTableSorted.addItemSorted(pAP,compareAP) ==
    > 0);
    > }
    > - // sortTable();
    >
    > return result;
    > }
    > @@ -104,13 +103,19 @@
    > delete pNew;
    > return false;
    > }
    > +
    > + pNew->setIndex(u); //$HACK
    > +
    > if (pSubscript)
    > {
    > *pSubscript = u;
    > }
    > -
    > - pNew->setIndex(u); //$HACK
    > - m_vecTableSorted.addItem(pNew, NULL);
    > + else
    > + {
    > + // create default empty AP
    > + pNew->markReadOnly();
    > + m_vecTableSorted.addItem(pNew, NULL);
    > + }
    >
    > return true;
    > }
    > @@ -130,8 +135,8 @@
    >
    > pAP->markReadOnly();
    >
    > - sortTable();
    > -
    > + m_vecTableSorted.addItemSorted(pAP,compareAP);
    > +
    > *pSubscript = subscript;
    > return true;
    > }
    > @@ -150,12 +155,13 @@
    >
    > pAP->markReadOnly();
    >
    > - sortTable();
    > -
    > + m_vecTableSorted.addItemSorted(pAP,compareAP);
    > +
    > *pSubscript = subscript;
    > return true;
    > }
    >
    > +
    > bool pp_TableAttrProp::findMatch(const PP_AttrProp *
    > pMatch,
    > UT_uint32 * pSubscript) const
    > {
    > @@ -205,9 +211,4 @@
    > return (const PP_AttrProp
    > *)m_vecTable.getNthItem(subscript);
    > else
    > return NULL;
    > -}
    > -
    > -void pp_TableAttrProp::sortTable(void)
    > -{
    > - m_vecTableSorted.qsort(compareAP);
    > }
    > diff -u text/ptbl/xp/new/orig1/pp_TableAttrProp.h
    > text/ptbl/xp/pp_TableAttrProp.h
    > --- text/ptbl/xp/new/orig1/pp_TableAttrProp.h Wed
    > Oct 01 06:50:34 2003
    > +++ text/ptbl/xp/pp_TableAttrProp.h Wed Oct 01
    > 06:50:46 2003
    > @@ -53,7 +53,6 @@
    > UT_uint32 * pSubscript) const;
    >
    > const PP_AttrProp * getAP(UT_uint32 subscript)
    > const;
    > - void sortTable(void);
    >
    > protected:
    > UT_Vector m_vecTable;
    > diff -u text/ptbl/xp/new/orig1/pt_VarSet.cpp
    > text/ptbl/xp/pt_VarSet.cpp
    > --- text/ptbl/xp/new/orig1/pt_VarSet.cpp Wed Oct 01
    > 05:59:41 2003
    > +++ text/ptbl/xp/pt_VarSet.cpp Wed Oct 01 05:59:50
    > 2003
    > @@ -42,20 +42,10 @@
    >
    > // create a default A/P as entry zero in each AP
    > table.
    >
    > - PT_AttrPropIndex foo;
    > -
    > - if ( !m_tableAttrProp[0].createAP(&foo)
    > - || !m_tableAttrProp[1].createAP(&foo))
    > + if ( !m_tableAttrProp[0].createAP(NULL)
    > + || !m_tableAttrProp[1].createAP(NULL))
    > return false;
    > - ((PP_AttrProp
    > *)getAP(_makeAPIndex(0,0)))->markReadOnly();
    > - ((PP_AttrProp
    > *)getAP(_makeAPIndex(1,0)))->markReadOnly();
    >
    > - //$TODO Are the two following calls really needed
    > or
    === message truncated ===> diff -u
    text/ptbl/xp/new/orig1/pt_PT_Styles.cpp
    > text/ptbl/xp/pt_PT_Styles.cpp
    > --- text/ptbl/xp/new/orig1/pt_PT_Styles.cpp Wed Oct
    > 01 06:11:21 2003
    > +++ text/ptbl/xp/pt_PT_Styles.cpp Wed Oct 01
    > 05:42:48 2003
    > @@ -322,6 +322,8 @@
    > //
    > if (pStyle)
    > m_hashStyles.insert(szName,(void *)pStyle);
    > +
    > +#ifdef DEBUG
    > //
    > // Diagonostic on Append...
    > //
    > @@ -333,6 +335,7 @@
    > enumStyles(i,&psdName,&pdStyle);
    > xxx_UT_DEBUGMSG(("SEVIOR: Found %d style name %s
    > \n",i,psdName));
    > }
    > +#endif
    > return true;
    > }
    > }
    >
    >

    __________________________________
    Do you Yahoo!?
    The New Yahoo! Shopping - with improved product search
    http://shopping.yahoo.com



    This archive was generated by hypermail 2.1.4 : Thu Oct 02 2003 - 09:29:31 EDT