From: Mike Nordell (tamlin@algonet.se)
Date: Wed Sep 24 2003 - 11:58:25 EDT
Hub wrote:
> We are not trying to save a couple of bytes, but
> trying to speedup lookups in the PieceTable, where the lookup are being
> performed by comparing large keys (strings) were the keyspace is tighter
> and defined (we have a defined and finite set of keys, aka properties).
Right. So what about creating the kind of mapping I mentioned. When loading
a document, every string would obviously have to be str(i)cmp'd once. But
what if handing it off to some little function that returns a type which
holds both the string and an integral value which that function derives from
the string? Perhaps put them in a map (to get log(n) lookup) where that
integer would be key and the real string would be the value. From that point
on, everytime a lookup is needed, only a simple comparison between integers
would be needed.
Insertions of new strings into such a set would probably be something like:
- Derive integer from new string.
- If integer not found in existing map (using mentioned log(n) complexity
algo), add.
- If found, verify that the strings indeed also match?
It seems to me this could be quite a lot of speed-gain for seemingly little
work - replacing a quite heavy strcmp n^2 complexity with something like
n*log(n) integer comparisons.
This of course assumes these mentioned integral values are (also/instead of)
stored in the document, but I don't think adding an integer to a type would
take more than ~30 seconds. :-)
/Mike
This archive was generated by hypermail 2.1.4 : Wed Sep 24 2003 - 12:13:07 EDT