Re: 5291 RTF import slow - the solution :-)

From: Mike Nordell (tamlin@algonet.se)
Date: Wed Sep 24 2003 - 11:58:25 EDT

Next message: Raphael Finkel: "Re: 5291 RTF import slow - the solution :-)"

Previous message: Hubert Figuiere: "Re: 5291 RTF import slow - the solution :-)"
In reply to: Hubert Figuiere: "Re: 5291 RTF import slow - the solution :-)"
Next in thread: Raphael Finkel: "Re: 5291 RTF import slow - the solution :-)"
Reply: Raphael Finkel: "Re: 5291 RTF import slow - the solution :-)"
Reply: Ted Lemon: "Re: 5291 RTF import slow - the solution :-)"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] [ attachment ]

Hub wrote:

> We are not trying to save a couple of bytes, but
> trying to speedup lookups in the PieceTable, where the lookup are being
> performed by comparing large keys (strings) were the keyspace is tighter
> and defined (we have a defined and finite set of keys, aka properties).

Right. So what about creating the kind of mapping I mentioned. When loading
a document, every string would obviously have to be str(i)cmp'd once. But
what if handing it off to some little function that returns a type which
holds both the string and an integral value which that function derives from
the string? Perhaps put them in a map (to get log(n) lookup) where that
integer would be key and the real string would be the value. From that point
on, everytime a lookup is needed, only a simple comparison between integers
would be needed.

Insertions of new strings into such a set would probably be something like:
- Derive integer from new string.
- If integer not found in existing map (using mentioned log(n) complexity
algo), add.
- If found, verify that the strings indeed also match?

It seems to me this could be quite a lot of speed-gain for seemingly little
work - replacing a quite heavy strcmp n^2 complexity with something like
n*log(n) integer comparisons.

This of course assumes these mentioned integral values are (also/instead of)
stored in the document, but I don't think adding an integer to a type would
take more than ~30 seconds. :-)

/Mike

Next message: Raphael Finkel: "Re: 5291 RTF import slow - the solution :-)"
Previous message: Hubert Figuiere: "Re: 5291 RTF import slow - the solution :-)"
In reply to: Hubert Figuiere: "Re: 5291 RTF import slow - the solution :-)"
Next in thread: Raphael Finkel: "Re: 5291 RTF import slow - the solution :-)"
Reply: Raphael Finkel: "Re: 5291 RTF import slow - the solution :-)"
Reply: Ted Lemon: "Re: 5291 RTF import slow - the solution :-)"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] [ attachment ]

This archive was generated by hypermail 2.1.4 : Wed Sep 24 2003 - 12:13:07 EDT