Re: unique id generation

From: Martin Sevior (msevior@physics.unimelb.edu.au)
Date: Tue May 20 2003 - 19:08:25 EDT

    On Sat, 17 May 2003, Tomas Frydrych wrote:

    >
    > The way we generate id's for various things in the document, such as
    > headers, footnotes and lists is a real mess; in our code there are
    > three different methods used to create what are supposed to be
    > unique id's:
    >
    > 1. using UT_rand: randomness does not guarantee uniqueness, as the
    > generator can produce duplicate numbers before exhausting all
    > possibilities. It creates the impression of uniqueness, which will
    > sooner or later come to haunt us.
    >

    We had this conversation about a year ago. Pat Lam's UT_rand code means
    the chances of generating the same random number are roughly 1 in 2^32
    which is roughly 1 in 4 billion.

    Documents have at most 100 unique id's, so the odds that a newly
    generated random number matches one already in use are still only about
    1 in 40 million (2^32 / 100).
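
    The arithmetic above can be checked with a short sketch. It assumes the
    generator is uniform over 2^32 values, which the message implies but does
    not state; the function names are made up for illustration. It computes
    both the chance that one new id hits any of the ids already in use, and
    the chance that any two ids in a document coincide.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Size of the id space for a 32-bit generator (assumed uniform).
const double kSpace32 = 4294967296.0;  // 2^32

// Probability that one fresh draw matches any of `used` distinct ids
// that are already taken.
double singleDrawCollision(std::uint64_t used, double space) {
    assert(space > 0.0);
    return static_cast<double>(used) / space;
}

// Probability that at least two of `n` independent uniform draws coincide
// (the "birthday" probability), accumulated in log space for accuracy.
double anyCollision(std::uint64_t n, double space) {
    assert(space > 0.0);
    double logNoCollision = 0.0;
    for (std::uint64_t i = 1; i < n; ++i) {
        // The (i+1)-th draw must avoid the i distinct values drawn so far.
        logNoCollision += std::log1p(-static_cast<double>(i) / space);
    }
    return -std::expm1(logNoCollision);
}
```

    With 100 ids, singleDrawCollision(100, kSpace32) is about 1 in 43 million,
    in line with the figure quoted above. The chance that any pair among the
    100 ids collides is larger, roughly 1 in 870,000, though still small.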

    We have to substantially reduce our other bug counts before this sort of
    bug becomes a limiting factor in AbiWord's usefulness.

    Having said that, I think a document unique class is a great idea because
    new hackers won't try to change it again.

    I'll have a look at the code now.

    Cheers

    Martin

    > 2. Using sequential generation: this is used in bits and pieces of
    > our importer code; it guarantees uniqueness while importing, but once
    > we enter editing mode, we have lost control. Furthermore, as there
    > is no relationship between the state of the rand generator used in
    > the editing mode and the id's sequentially generated on import, our
    > chances of non-uniqueness are increased.
    >
    > 3. reusing id's stored in the imported document; same problem as (2).
    >
    > None of the three methods above is good enough, and the present
    > state of things has to change before the 2.0 release. The bottom
    > line: unique id's need to be generated on document level and have to
    > be unique.
    >
    > So I have implemented UT_UniqueId class (ut_misc.h/cpp) and
    > PD_Document methods getUID(type) and setMinUID(type, min), which I
    > will commit as soon as I can get into the CVS, and I am going to
    > start working on moving all our code to using these.
    >
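
    As an illustration only, a document-level allocator of this kind might
    look like the sketch below. It is not the actual UT_UniqueId code: the
    class name, the IdType values, the integer width and the exact meaning of
    the minimum are all assumptions. Only the method names getUID and
    setMinUID come from the message.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical id categories; the real list of types is not given above.
enum class IdType { List, Footnote, Header };

// Sketch of a per-document allocator: one increasing counter per type.
class UniqueIdSketch {
public:
    // Return an id that has not been returned before for this type.
    // Overflow of the counter is not handled in this sketch.
    std::uint32_t getUID(IdType type) {
        return next_[type]++;  // a missing entry starts at 0
    }

    // Make sure every id generated from now on is >= min.
    // Returns false if the counter was already at or past min.
    bool setMinUID(IdType type, std::uint32_t min) {
        std::uint32_t& next = next_[type];
        if (min <= next) {
            return false;  // never move the counter backwards
        }
        next = min;
        return true;
    }

private:
    std::map<IdType, std::uint32_t> next_;
};
```

    Under these assumptions, an importer that reuses a stored list id 42
    would call setMinUID(IdType::List, 43), after which getUID(IdType::List)
    returns 43, 44 and so on, while other types keep their own counters.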
    > For the maintainers of importers, this is how it will work: (1) when a
    > unique id is to be generated it will be generated using the
    > PD_Document::getUID() method. (2) if a unique numerical id stored in
    > the document is being reused, the importer needs to call
    > PD_Document::setMinUID() to ensure that any future generated id's
    > will be greater than this. (3) if a unique alphanumerical id stored
    > in the document is being reused, the importer needs to take similar
    > steps to ensure that this id cannot be duplicated in the editing
    > mode.
    >
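
    Step (3) leaves the "similar steps" for alphanumerical ids unspecified.
    One possible approach, sketched here purely as an assumption, is to keep
    a set of every string id seen on import and to skip those when generating
    new ones. The class and method names are hypothetical and not part of
    PD_Document.

```cpp
#include <cassert>
#include <set>
#include <string>

// Sketch: tracks alphanumerical ids so that generated ones never clash
// with ids reused from an imported document.
class NameRegistrySketch {
public:
    // Record an id found in the imported document.
    // Returns false if the same id was already recorded.
    bool reserve(const std::string& id) {
        return used_.insert(id).second;
    }

    // Produce a fresh id of the form prefix + number, skipping any
    // candidate that is already recorded, and record the result.
    std::string generate(const std::string& prefix) {
        std::string candidate;
        do {
            candidate = prefix + std::to_string(counter_++);
        } while (!used_.insert(candidate).second);
        return candidate;
    }

private:
    std::set<std::string> used_;
    unsigned long counter_ = 0;
};
```

    For example, after reserving "fn0" and "fn1" from an imported file, the
    first call to generate("fn") in this sketch yields "fn2". Numerical ids
    avoid this bookkeeping entirely, which fits the advice to exporters to
    prefer them.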
    > Exporters: stick to numerical id's as much as possible, it will make
    > our life much simpler.
    >
    > Tomas
    >


