Commit: import architecture now uses complex heuristics

From: Dom Lachowicz (doml@appligent.com)
Date: Mon Feb 25 2002 - 20:01:27 GMT


    Well, almost - it's about 95% done. I just need to rewrite one more
    method, which currently behaves as if my heuristic code hadn't been
    committed.

    I'll clean up my code shortly to use a Confidence datatype instead of a
    UT_uint8.

    Basically, every importer returns a normalized number in the range
    [0,255], with 0 being "I'm not at all confident", 127 being "I'm so-so"
    and 255 being "I can totally handle this file type". This applies to
    both the recognizeContents and recognizeSuffix methods.

    What I'm going to do is weight the recognizeContents result heavily
    against the recognizeSuffix result (maybe 85-15) and apply the
    following heuristic for each importer:

    my_match = heuristic(contentsConfidence, suffixConfidence);
    if ( my_match > best_match ) {
      best_match = my_match;
      best_filetype = my_match_filetype;
    }
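
A minimal C++ sketch of how that selection could look, not the committed code. The Confidence typedef, the Candidate struct, pickBestFiletype, and the exact 85/15 integer weighting are all assumptions made for illustration; only the [0,255] scale and the "maybe 85-15" split come from the description above.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the planned Confidence datatype (0..255).
typedef std::uint8_t Confidence;

// Assumed weighting: 85% contents, 15% suffix ("maybe 85-15").
// Integer arithmetic keeps the result within [0,255].
static Confidence heuristic(Confidence contents, Confidence suffix)
{
    return static_cast<Confidence>((contents * 85 + suffix * 15) / 100);
}

// Hypothetical record of what one importer reported for a file.
struct Candidate {
    std::string filetype;
    Confidence contents;  // as returned by recognizeContents
    Confidence suffix;    // as returned by recognizeSuffix
};

// Returns the filetype with the highest combined score,
// or an empty string if no candidate scores above 0.
static std::string pickBestFiletype(const std::vector<Candidate>& candidates)
{
    Confidence best_match = 0;
    std::string best_filetype;
    for (const Candidate& c : candidates) {
        const Confidence my_match = heuristic(c.contents, c.suffix);
        if (my_match > best_match) {
            best_match = my_match;
            best_filetype = c.filetype;
        }
    }
    return best_filetype;
}
```

With this weighting, an importer that is fairly sure about the contents but does not recognize the suffix still beats one that only matches the suffix, which is the intent of favoring recognizeContents.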

    This will fix a few bugs in Bugzilla.

    Dom




