Re: 5291 RTF import slow - the solution :-)

From: Mike Nordell (tamlin@algonet.se)
Date: Wed Sep 24 2003 - 11:58:25 EDT

  • Next message: Raphael Finkel: "Re: 5291 RTF import slow - the solution :-)"

    Hub wrote:

    > We are not trying to save a couple of bytes, but
    > trying to speedup lookups in the PieceTable, where the lookup are being
    > performed by comparing large keys (strings) were the keyspace is tighter
    > and defined (we have a defined and finite set of keys, aka properties).

    Right. So what about creating the kind of mapping I mentioned. When loading
    a document, every string would obviously have to be str(i)cmp'd once. But
    what if handing it off to some little function that returns a type which
    holds both the string and an integral value which that function derives from
    the string? Perhaps put them in a map (to get log(n) lookup) where that
    integer would be key and the real string would be the value. From that point
    on, everytime a lookup is needed, only a simple comparison between integers
    would be needed.

    Insertions of new strings into such a set would probably be something like:
    - Derive integer from new string.
    - If integer not found in existing map (using mentioned log(n) complexity
    algo), add.
    - If found, verify that the strings indeed also match?

    It seems to me this could be quite a lot of speed-gain for seemingly little
    work - replacing a quite heavy strcmp n^2 complexity with something like
    n*log(n) integer comparisons.

    This of course assumes these mentioned integral values are (also/instead of)
    stored in the document, but I don't think adding an integer to a type would
    take more than ~30 seconds. :-)

    /Mike



    This archive was generated by hypermail 2.1.4 : Wed Sep 24 2003 - 12:13:07 EDT