Re: commit -- Working STL vector support


Subject: Re: commit -- Working STL vector support
From: Thomas Fletcher (thomasf@qnx.com)
Date: Mon Dec 18 2000 - 20:47:17 CST


On Mon, 18 Dec 2000, Dom Lachowicz wrote:

> Thomas wrote:
> >STL! OK there is definitely something that I'm not understanding
> >here. Why is the vector so slow ... Where is the vector slow?
> >Just blindly replacing it with the STL version doesn't make
> >_any_ sense to me. I'd like to know what in particular is
> >slowing us down and then fix that. I'm quite happy right now that
> >I don't have any dependancy on libstc++ for QNX ... adding STL
> >will break that for QNX at least.
>
> I guess it is that using the STL is an "easy-way out" to replace the
> slowness of our STL-like classes. "Blind replacement" is the easy option,
> and it is technically sound. I see no reason why we couldn't have STL
> implementations for all of our class interfaces. Much thought and
> development has gone into the STL and the C++ template system in general.
> Not quite as much thought has gone into our UT_XXX classes.

I agree that much thought has gone into STL templates ... and they can
be really wonderful if used properly. On the other hand they can also
be abused like anything else =;-) and I _hate_ when people just "fix"
something for the sake of fixing ... especially if we don't know where
the problem is. Isn't it you who was advocating a "fix the problem, not
mask the symptoms" approach earlier =;-).

 
> This doesn't mean that our classes are bad or that we should abandon them.
> Our UT_XXX classes could definitely use much improvement in their
> implementation and we should work to improve them. But I don't see any real
> harm in conditionally compiling STL-based versions of our classes if it
> helps a few people have a better experience with AbiWord. However, we should
> not make STL the default. An ABI_OPT_STL=1 flag should define something like
> -DUSING_STL to our compiler. STL-aware classes should have something like
> #ifdef USING_STL to condidionally compile in STL support. Abi should be able
> to run flawlessly on platforms where there is no STL support. We should not,
> however, deny ourselves the opportunity to run better on those platforms
> where it is possible to.

I took a few moments this evening to put together the following test for
Neutrino to look at the insertion speed of our vector class. Here is
what I wrote to test it (headers etc removed):

#define CYCLES 10
#define ITERATIONS 50000

int main(int argc, char **argv) {
        uint64_t start, stop, diff, cps;
        double msec;
        int i, j;

        cps = SYSPAGE_ENTRY(qtime)->cycles_per_sec / 1000;
        printf( "System runs @ %lld cycles/sec\n",cps );
        printf( "Running %d loops of %d iterations\n", CYCLES, ITERATIONS);

        /* Take a rough benchmark of the time to insert elements */
for(j = 0; j < CYCLES; j++) {
        UT_Vector vec;

        start = ClockCycles();
        for (i=0; i<ITERATIONS; i++) {
                vec.addItem(&argv[0]);
        }
        stop = ClockCycles();

        if (start > stop) {
                printf("Overflow ... test again \n");
        } else {
                diff = stop - start;
                msec = (double)diff / (double)cps;
                printf("%lld cycles %.2lf ms \n", diff, msec);
        }
}
        return 0;
}

Now here are my results:

Initial test with original vector class:
System runs @ 350847 cycles/sec
Running 10 loops of 50000 iterations
1067721053 cycles 3043.27 ms
1065270144 cycles 3036.28 ms
1064068227 cycles 3032.86 ms
1062944582 cycles 3029.65 ms
1062888058 cycles 3029.49 ms
1062475271 cycles 3028.32 ms
1063600104 cycles 3031.52 ms
1064628594 cycles 3034.45 ms
1063410321 cycles 3030.98 ms
1063126250 cycles 3030.17 ms

Then I went into the vector class and re-wrote the array growing code
to use realloc() instead of malloc free and at the same time got
rid of a function call that was being used in one location to
determine how much to grow the array. Re-running the tests I got:

Running with my two simple optimizations:
System runs @ 350847 cycles/sec
Running 10 loops of 50000 iterations
10830486 cycles 30.87 ms
10029181 cycles 28.59 ms
9580695 cycles 27.31 ms
9345701 cycles 26.64 ms
9486645 cycles 27.04 ms
10394098 cycles 29.63 ms
11091530 cycles 31.61 ms
9466747 cycles 26.98 ms
9293571 cycles 26.49 ms
9280968 cycles 26.45 ms

Not bad for a couple lines of code tweaking ... a factor of ten
on this large list. Of course there is a comment that indicates
that the behaviour of the class should probably be tuned to
the number of elements (the way the allocator works is that it
doubles the allocated space up to a maximum and then grows by
a fixed amount from that point on). Using the fact that I
can overload the constructor with a default value I "predicted"
that I would have 50000 elements in my vector. This gave
the following results on the _existing_ class:

Original with a prediction of 50000 elements:
System runs @ 350847 cycles/sec
Running 10 loops of 50000 iterations
6809282 cycles 19.41 ms
5588782 cycles 15.93 ms
5877829 cycles 16.75 ms
6085929 cycles 17.35 ms
6203381 cycles 17.68 ms
6691463 cycles 19.07 ms
5620579 cycles 16.02 ms
5543573 cycles 15.80 ms
5151255 cycles 14.68 ms
5570098 cycles 15.88 ms

Then when coupled with my changes we see that the improvement
for the copying just gets lost in the noise if you can do
a half decent job of predicting how large the data set will
be.

Optimized version with prediction of 50000 elements:
System runs @ 350847 cycles/sec
Running 10 loops of 50000 iterations
5954521 cycles 16.97 ms
5027151 cycles 14.33 ms
4173265 cycles 11.89 ms
5411753 cycles 15.42 ms
5249518 cycles 14.96 ms
4523883 cycles 12.89 ms
4633118 cycles 13.21 ms
4662774 cycles 13.29 ms
4593791 cycles 13.09 ms
4240069 cycles 12.09 ms

I've attached the changes that I've made, and if there are no
objections then I'll commit them. I'd be curious to see how
the STL version does to see if the raw numbers are still
better (which they very well might be).

Thomas

/* AbiSource Program Utilities
 * Copyright (C) 1998 AbiSource, Inc.
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License
 * as published by the Free Software Foundation; either version 2
 * of the License, or (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA
 * 02111-1307, USA.
 */
 
#ifndef UTVECTOR_H
#define UTVECTOR_H

#include "ut_types.h"
#include "ut_assert.h"

#define ABI_OPT_STL 0

// TODO change the 'int' types to 'UT_[su]int32' whichever is appropriate.

// ----------------------------------------------------------------
/*
        The following class is a simple, portable implementation of a vector.
        Following in Mozilla's footsteps, we don't use STL because templates
        are not yet portable enough for our needs. (Same goes for exceptions
        and namespaces, BTW.)
*/

#if (!ABI_OPT_STL)
class UT_Vector
{
public:
        /* TF CHANGE - Third optimization:
         If we have any idea on how many elements we are going to have then we should
     set this appropriately.
        */
        UT_Vector(int sizehint = 2048);
        ~UT_Vector();

        UT_sint32 addItem(void*);
        UT_sint32 addItem(void* p, UT_uint32 * pIndex);
        void* getNthItem(UT_uint32 n) const;
        const void* operator[](UT_uint32 i) const;
        UT_sint32 setNthItem(UT_uint32 ndx, void * pNew, void ** ppOld);
        void* getFirstItem() const;
        void* getLastItem() const;
        UT_uint32 getItemCount() const;
        UT_sint32 findItem(void*) const;

        UT_sint32 insertItemAt(void*, UT_uint32 ndx);
        void deleteNthItem(UT_uint32 n);
        void clear();
        void qsort(int (*compar)(const void *, const void *));

        UT_Bool copy(UT_Vector *pVec);

protected:
        UT_uint32 calcNewSpace();
        UT_sint32 grow(UT_uint32);
        
        void** m_pEntries;
        UT_uint32 m_iCount;
        UT_uint32 m_iSpace;
        UT_uint32 m_iCutoffDouble;
        UT_uint32 m_iPostCutoffIncrement;
};

// NB: this macro is useful only in destructors
#define UT_VECTOR_PURGEALL(d, v) \
        do { int utv_max = v.getItemCount(); \
                        for (int utv=utv_max-1; utv>=0; utv--) \
                        { \
                                d utv_p = (d) v.getNthItem(utv); \
                                UT_ASSERT(utv_p); \
                                if (utv_p) \
                                        delete utv_p; \
                        } \
        } while (0)

// NB: this macro is useful only in destructors
#define UT_VECTOR_SPARSEPURGEALL(d, v) \
        do { int utv_max = v.getItemCount(); \
                        for (int utv=utv_max-1; utv>=0; utv--) \
                        { \
                                d utv_p = (d) v.getNthItem(utv); \
                                if (utv_p) \
                                        delete utv_p; \
                        } \
        } while (0)

// NB: this macro is useful only in destructors
#define UT_VECTOR_FREEALL(d, v) \
        do { int utv_max = v.getItemCount(); \
                        for (int utv=utv_max-1; utv>=0; utv--) \
                        { \
                                d utv_p = (d) v.getNthItem(utv); \
                                UT_ASSERT(utv_p); \
                                if (utv_p) \
                                        free(utv_p); \
                        } \
        } while (0)

#else /* ABI_OPT_STL */

#include <vector>
#include <algorithm>

class UT_Vector
{
 public:
        UT_Vector();
        ~UT_Vector();

        UT_sint32 addItem(void*);
        UT_sint32 addItem(void* p, UT_uint32 * pIndex);
        void* getNthItem(UT_uint32 n) const;
        const void* operator[](UT_uint32 i) const;
        UT_sint32 setNthItem(UT_uint32 ndx, void * pNew, void ** ppOld);
        void* getFirstItem() const;
        void* getLastItem() const;
        UT_uint32 getItemCount() const;
        UT_sint32 findItem(void*) const;

        UT_sint32 insertItemAt(void*, UT_uint32 ndx);
        void deleteNthItem(UT_uint32 n);
        void clear();
        void qsort(int (*compar)(const void *, const void *));

        UT_Bool copy(UT_Vector *pVec);
 private:
        vector<void *> m_STLVec;


};

// NB: this macro is useful only in destructors
#define UT_VECTOR_PURGEALL(d, v) \
        do { int utv_max = v.getItemCount(); \
                        for (int utv=utv_max-1; utv>=0; utv--) \
                        { \
                                d utv_p = (d) v.getNthItem(utv); \
                                UT_ASSERT(utv_p); \
                                if (utv_p) \
                                        delete utv_p; \
                        } \
        } while (0)

// NB: this macro is useful only in destructors
#define UT_VECTOR_SPARSEPURGEALL(d, v) \
        do { int utv_max = v.getItemCount(); \
                        for (int utv=utv_max-1; utv>=0; utv--) \
                        { \
                                d utv_p = (d) v.getNthItem(utv); \
                                if (utv_p) \
                                        delete utv_p; \
                        } \
        } while (0)

// NB: this macro is useful only in destructors
#define UT_VECTOR_FREEALL(d, v) \
        do { int utv_max = v.getItemCount(); \
                        for (int utv=utv_max-1; utv>=0; utv--) \
                        { \
                                d utv_p = (d) v.getNthItem(utv); \
                                UT_ASSERT(utv_p); \
                                if (utv_p) \
                                        free(utv_p); \
                        } \
        } while (0)

#endif /* ABI_OPT_STL */

#endif /* UTVECTOR_H */

/* AbiSource Program Utilities
 * Copyright (C) 1998 AbiSource, Inc.
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License
 * as published by the Free Software Foundation; either version 2
 * of the License, or (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA
 * 02111-1307, USA.
 */
 


#include <stdlib.h>

// TODO change the 'int' types to 'UT_[su]int32' whichever is appropriate.

#include "ut_types.h"
#include "ut_vector.h"
#include "ut_assert.h"

#if (!ABI_OPT_STL)

UT_Vector::UT_Vector(int sizehint)
{
        m_iCutoffDouble = sizehint; /* After this point we stop doubling our allocations */
        m_iPostCutoffIncrement = 32; /* We only increment the array by this much after our allocations */
        m_iCount = 0; /* The number of slots we have filled */
        m_iSpace = 0; /* The number of slots we have allocated */
        m_pEntries = NULL; /* The actual array of pointers itself */
}

void UT_Vector::clear()
{
        m_iCount = 0;
        m_iSpace = 0;
        FREEP(m_pEntries);
        m_pEntries = NULL;
}

UT_Vector::~UT_Vector()
{
        FREEP(m_pEntries);
}

UT_uint32 UT_Vector::calcNewSpace(void)
{
        if (m_iSpace < m_iCutoffDouble)
        {
                if (m_iSpace > 0)
                {
                        return m_iSpace * 2;
                }
                else
                {
                        return m_iPostCutoffIncrement;
                }
        }
        else
        {
                return m_iSpace + m_iPostCutoffIncrement;
        }
}

UT_uint32 UT_Vector::getItemCount() const
{
        return m_iCount;
}

/*
 This function is called everytime we want to insert a new element but don't have
 enough space. In this case we grow the array to be _at least_ ndx size.
*/
UT_sint32 UT_Vector::grow(UT_uint32 ndx)
{
        /* TF CHANGE - Second Optimization:
     We only make this call from here when we grow the array so avoid the
     function call overhead and bring the code in here.
    */
#if defined(SECOND_OPTIMIZATION)
        UT_uint32 new_iSpace;
        if(!m_iSpace) {
                new_iSpace = m_iPostCutoffIncrement;
        }
        else if (m_iSpace < m_iCutoffDouble) {
                new_iSpace = m_iSpace * 2;
        }
        else {
                new_iSpace = m_iSpace + m_iPostCutoffIncrement;
        }
#else
        UT_uint32 new_iSpace = calcNewSpace();
        if (new_iSpace < ndx)
        {
                new_iSpace = ndx;
        }
#endif

        /*TF CHANGE - First Optimization:
     Let re-alloc help us out with the copy & free. This gains us
         performance of the system optimized calls as well as any built-in
         intelligence of growing buffers from realloc().
        */
#if defined(FIRST_OPTIMIZATION)
        void ** new_pEntries = (void **)realloc(m_pEntries, new_iSpace * sizeof(void *));
        if (!new_pEntries) {
                return -1;
        }
        m_iSpace = new_iSpace;
        m_pEntries = new_pEntries;
#else
        void ** new_pEntries = (void**) calloc(new_iSpace, sizeof(void*));
        if (!new_pEntries)
        {
                return -1;
        }

        if (m_pEntries && (m_iCount > 0))
        {
                for (UT_uint32 i=0; i<m_iCount; i++)
                {
                        new_pEntries[i] = m_pEntries[i];
                }

                FREEP(m_pEntries);
        }

        m_iSpace = new_iSpace;
        m_pEntries = new_pEntries;
#endif

        return 0;
}

UT_sint32 UT_Vector::insertItemAt(void* p, UT_uint32 ndx)
{
        if (ndx > m_iCount + 1)
                return -1;
        
        if ((m_iCount+1) > m_iSpace)
        {
                UT_sint32 err = grow(0);
                if (err)
                {
                        return err;
                }
        }

        // bump the elements -> thataway up to the ndxth position
        for (UT_uint32 i = m_iCount; i > ndx; i--)
        {
                m_pEntries[i] = m_pEntries[i - 1];
        }

        m_pEntries[ndx] = p;
        ++m_iCount;

        return 0;
}

UT_sint32 UT_Vector::addItem(void* p, UT_uint32 * pIndex)
{
        int err = addItem(p);
        if (!err && pIndex)
                *pIndex = m_iCount-1;
        return err;
}

UT_sint32 UT_Vector::addItem(void* p)
{
        if ((m_iCount+1) > m_iSpace)
        {
                UT_sint32 err = grow(0);
                if (err)
                {
                        return err;
                }
        }

        m_pEntries[m_iCount++] = p;

        return 0;
}

void* UT_Vector::getNthItem(UT_uint32 n) const
{
        UT_ASSERT(m_pEntries);
        UT_ASSERT(m_iCount > 0);
        UT_ASSERT(n<m_iCount);

        return m_pEntries[n];
}

UT_sint32 UT_Vector::setNthItem(UT_uint32 ndx, void * pNew, void ** ppOld)
{
        if ((ndx+1) > m_iSpace)
        {
                UT_sint32 err = grow(ndx+1);
                if (err)
                {
                        return err;
                }
        }

        if (ppOld)
        {
                *ppOld = m_pEntries[ndx];
        }
        
        m_pEntries[ndx] = pNew;
        if ((ndx+1) > m_iCount)
        {
                m_iCount = ndx + 1;
        }
        
        return 0;
}

void* UT_Vector::getLastItem() const
{
        UT_ASSERT(m_iCount > 0);

        return m_pEntries[m_iCount-1];
}

void* UT_Vector::getFirstItem() const
{
        UT_ASSERT(m_iCount > 0);
        UT_ASSERT(m_pEntries);

        return m_pEntries[0];
}

void UT_Vector::deleteNthItem(UT_uint32 n)
{
        UT_ASSERT(n < m_iCount);
        UT_ASSERT(m_iCount > 0);

        for (UT_uint32 k=n; k<m_iCount-1; k++)
        {
                m_pEntries[k] = m_pEntries[k+1];
        }
        
        m_pEntries[m_iCount-1] = 0;
        m_iCount--;

        return;
}

UT_sint32 UT_Vector::findItem(void* p) const
{
        for (UT_uint32 i=0; i<m_iCount; i++)
        {
                if (m_pEntries[i] == p)
                {
                        return (UT_sint32) i;
                }
        }

        return -1;
}

void UT_Vector::qsort(int (*compar)(const void *, const void *))
{
        ::qsort(m_pEntries, m_iCount, sizeof(void*), compar);
}

UT_Bool UT_Vector::copy(UT_Vector *pVec)
{
        clear();

        for (UT_uint32 i=0; i < pVec->m_iCount; i++)
        {
                UT_sint32 err;

                err = addItem(pVec->m_pEntries[i]);
                if(err == -1)
                        return err;
        }

        return 0;
}

const void* UT_Vector::operator[](UT_uint32 i) const
{
        return this->getNthItem(i);
}

#else /* ABI_OPT_STL */

UT_Vector::UT_Vector()
{
}

void UT_Vector::clear()
{
        m_STLVec.clear();
}

UT_Vector::~UT_Vector()
{
}

UT_uint32 UT_Vector::getItemCount() const
{
        return m_STLVec.size();
}

UT_sint32 UT_Vector::insertItemAt(void* p, UT_uint32 ndx)
{
        m_STLVec.insert(m_STLVec.begin()+ndx, p);
        return 0;
}

UT_sint32 UT_Vector::addItem(void* p, UT_uint32 * pIndex)
{
        int err = addItem(p);
        if (!err && pIndex)
                *pIndex = m_STLVec.size()-1;
        return err;
}

UT_sint32 UT_Vector::addItem(void* p)
{
        m_STLVec.push_back(p);

        return 0;
}

void* UT_Vector::getNthItem(UT_uint32 n) const
{
        return m_STLVec[n];
}

UT_sint32 UT_Vector::setNthItem(UT_uint32 ndx, void * pNew, void ** ppOld)
{
        if (ppOld)
        {
                *ppOld = m_STLVec[ndx];
        }
        
        m_STLVec[ndx] = pNew;
        return 0;
}

void* UT_Vector::getLastItem() const
{
        return m_STLVec[m_STLVec.size()-2];
}

void* UT_Vector::getFirstItem() const
{
        return m_STLVec[0];
}

void UT_Vector::deleteNthItem(UT_uint32 n)
{
        m_STLVec.erase(m_STLVec.begin()+n);
        return;
}

UT_sint32 UT_Vector::findItem(void* p) const
{
        for (UT_uint32 i=0; i<m_STLVec.size(); i++)
        {
                if (m_STLVec[i] == p)
                {
                        return (UT_sint32) i;
                }
        }

        return -1;
}

void UT_Vector::qsort(int (*compar)(const void *, const void *))
{
        sort(m_STLVec.begin(), m_STLVec.end(), compar);
}

UT_Bool UT_Vector::copy(UT_Vector *pVec)
{
        clear();

        for (UT_uint32 i=0; i < pVec->getItemCount(); i++)
        {
                m_STLVec.push_back(pVec->getNthItem(i));
        }

        return 0;
}

const void* UT_Vector::operator[](UT_uint32 i) const
{
        return m_STLVec[i];
}

#endif /* ABI_OPT_STL */



This archive was generated by hypermail 2b25 : Mon Dec 18 2000 - 20:45:19 CST