Re: unique id generation

From: Martin Sevior (msevior@physics.unimelb.edu.au)
Date: Tue May 20 2003 - 19:08:25 EDT

Next message: Andrew Dunbar: "Commit: MSVC6 build system update"

Previous message: ericzen: "AbiWord Weekly News #144 (2003, week 20) released, Belated"
In reply to: Tomas Frydrych: "unique id generation"

On Sat, 17 May 2003, Tomas Frydrych wrote:

>
> The way we generate id's for various things in the document, such as
> headers, footnotes and lists is a real mess; in our code there are
> three different methods used to create what are supposed to be
> unique id's:
>
> 1. using UT_rand: randomness does not guarantee uniqueness, as the
> generator can produce duplicate numbers before exhausting all
> possibilities. It creates the impression of uniqueness, which will
> sooner or later come to haunt us.
>

We had this conversation about a year ago. Pat Lam's UT_rand code means
the chances of generating the same random number are roughly 1 in 2^32
which is roughly 1 in 4 billion.

Documents have at most 100 unique id's, so the odds that a newly generated
id matches one already in use are still only about 1 in 40 million.

We have to substantially reduce our other bug counts before this sort of
bug becomes a limiting factor in AbiWord's usefulness.

Having said that, I think a document unique class is a great idea because
new hackers won't try to change it again.

I'll have a look at the code now.

Cheers

Martin

> 2. Using sequential generation: this is used in bits and pieces of
> our importer code; it guarantees uniqueness while importing, but once
> we enter editing mode, we have lost control. Furthermore, as there
> is no relationship between the state of the rand generator used in
> the editing mode and the id's sequentially generated on import, our
> chances of non-uniqueness are increased.
>
> 3. reusing id's stored in the imported document; same problem as (2).
>
> None of the three methods above is good enough, and the present
> state of things has to change before the 2.0 release. The bottom
> line: unique id's need to be generated on document level and have to
> be unique.
>
> So I have implemented UT_UniqueId class (ut_misc.h/cpp) and
> PD_Document methods getUID(type) and setMinUID(type, min), which I
> will commit as soon as I can get into the CVS, and I am going to
> start working on moving all our code to using these.
>
> For the maintainers of importers, this is how it will work: (1) when a
> unique id is to be generated it will be generated using the
> PD_Document::getUID() method. (2) if a unique numerical id stored in
> the document is being reused, the importer needs to call
> PD_Document::setMinUID() to ensure that any future generated id's
> will be greater than this. (3) if a unique alphanumerical id stored
> in the document is being reused, the importer needs to take similar
> steps to ensure that this id cannot be duplicated in the editing
> mode.
>
> Exporters: stick to numerical id's as much as possible, it will make
> our life much simpler.
>
> Tomas
>


This archive was generated by hypermail 2.1.4 : Tue May 20 2003 - 19:22:28 EDT