Re: unique id generation

From: Tomas Frydrych (tomas@frydrych.uklinux.net)
Date: Fri May 23 2003 - 06:19:34 EDT

    > fjf wrote:
    > The real problem is not the generation of unique IDs, which is simple
    > enough to do, but rather the merging of document fragments each of
    > which has its own set of IDs and ID references. When merging
    > documents, the safest method (though probably very difficult as long
    > as we use linear import) is to replace all IDs with new
    > document-unique IDs and to map all ID references to the appropriate
    > new IDs.

    This is actually not such a great problem AFAICS, since the only way
    we merge document fragments is through cut & paste. I am not sure if
    this is true across the board, but on win32 this is done via the rtf
    importer/exporter, which does not carry over any of AW's unique id's
    in the four cases where in AW the id must match that of something
    else in the document (headers, footnotes, endnotes and lists) -- the
    id's for the imported stuff are generated afresh.

    But even if we were merging an AW xml fragment into an AW document,
    this is not such a great problem, because the fact that an id is not
    unique will become obvious when it is first encountered, i.e., there
    is no back-remapping to be done. The importer needs to call
    PD_Document::isIdUnique() when it first encounters an ID of a given
    type (e.g. the footnote id in a footnote reference) and, if it is
    not unique, remap it. When it encounters the corresponding id later
    on (footnote body, footnote anchor), it just replaces it with the
    remapped value.
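
    The first-encounter remapping described above could be sketched
    roughly as follows. This is only an illustration, not AbiWord code:
    MockDocument, IdRemapper, IdType, claimId() and nextFreeId() are
    made-up names, and the signature given to isIdUnique() is an
    assumption (only the method name comes from this message).

```cpp
#include <cassert>
#include <map>
#include <set>
#include <utility>

// Hypothetical ID categories: the four cases named in the message.
enum class IdType { Header, Footnote, Endnote, List };

// Stand-in for the target document. Only the name isIdUnique() comes from
// the message; its signature and the other members are assumptions.
class MockDocument {
public:
    bool isIdUnique(IdType type, unsigned id) const {
        return m_used.count(std::make_pair(type, id)) == 0;
    }
    // Mark an ID as used. Returns false if it was already taken.
    bool claimId(IdType type, unsigned id) {
        return m_used.emplace(type, id).second;
    }
    // Smallest ID of this type not yet in use (illustrative policy only).
    unsigned nextFreeId(IdType type) const {
        unsigned id = 0;
        while (!isIdUnique(type, id)) ++id;
        return id;
    }
private:
    std::set<std::pair<IdType, unsigned>> m_used;
};

// Hypothetical importer-side helper that maps fragment IDs to document IDs.
class IdRemapper {
public:
    explicit IdRemapper(MockDocument& doc) : m_doc(doc) {}

    // Call for every ID the importer meets, reference or definition alike.
    unsigned resolve(IdType type, unsigned fragmentId) {
        const auto key = std::make_pair(type, fragmentId);
        const auto it = m_map.find(key);
        if (it != m_map.end())
            return it->second;  // seen before: just substitute the mapping

        // First encounter: keep the ID if it is free, otherwise remap it.
        const unsigned docId = m_doc.isIdUnique(type, fragmentId)
                                   ? fragmentId
                                   : m_doc.nextFreeId(type);
        const bool claimed = m_doc.claimId(type, docId);
        assert(claimed);  // docId was just checked to be free
        (void)claimed;
        m_map.emplace(key, docId);
        return docId;
    }
private:
    MockDocument& m_doc;
    std::map<std::pair<IdType, unsigned>, unsigned> m_map;
};
```

    Because the mapping is recorded at the first encounter, a single
    forward pass is enough: a later footnote body or anchor goes through
    the same resolve() call and gets the already-chosen id.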

    In response to Martin, I really do not like using UT_rand for this
    because (1) it is flawed in principle (random != unique), (2) it
    makes verification of uniqueness difficult. However, the reason why I
    felt changes had to be done prior to 2.0 is not the use of
    UT_rand (I accept the probabilities of repetition are low), but the
    lack of a unified policy on generation of these id's. We can debate
    and change how the id's are generated, but the generation has to be
    done centrally from the document class, and there needs to be a
    verification mechanism so we can at least issue a dbg message if a
    non-unique id is used.
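
    As a rough sketch of what "central generation plus verification"
    might look like: the class below hands out id's from one place and
    logs a debug message when a duplicate is registered. DocumentIds,
    generateId(), registerId() and duplicateCount() are hypothetical
    names, and the per-type counter is just one possible policy, not the
    one AbiWord uses.

```cpp
#include <cassert>
#include <cstdio>
#include <map>
#include <set>
#include <utility>

// Hypothetical ID categories, as in the four cases named earlier.
enum class IdType { Header, Footnote, Endnote, List };

// Hypothetical central registry owned by the document class.
class DocumentIds {
public:
    // Central generator: a counter per type that skips IDs already in use.
    unsigned generateId(IdType type) {
        unsigned& next = m_next[type];  // starts at 0 for a new type
        while (m_used.count(std::make_pair(type, next)) != 0)
            ++next;
        const bool fresh = m_used.emplace(type, next).second;
        assert(fresh);  // the loop above guarantees the ID is free
        (void)fresh;
        return next++;
    }

    // Verification hook for IDs that come from elsewhere (e.g. a file
    // being loaded). Logs a debug message and returns false on a duplicate.
    bool registerId(IdType type, unsigned id) {
        const bool fresh = m_used.emplace(type, id).second;
        if (!fresh) {
            std::fprintf(stderr, "DBG: non-unique id %u (type %d)\n",
                         id, static_cast<int>(type));
            ++m_duplicates;
        }
        return fresh;
    }

    unsigned duplicateCount() const { return m_duplicates; }

private:
    std::set<std::pair<IdType, unsigned>> m_used;
    std::map<IdType, unsigned> m_next;
    unsigned m_duplicates = 0;
};
```

    With a counter, uniqueness holds by construction and is cheap to
    check, which is the point made against UT_rand above; the generator
    itself could still be swapped out without touching the callers.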

    Tomas


