Re: Commit: XHTML fix

From: Dom Lachowicz (doml@appligent.com)
Date: Fri Mar 01 2002 - 18:12:56 GMT

    > This looks to me like this fix was in the non-plugin code, which I
    > believe Dom specified was deprecated. Perhaps you want to add the fix
    > in the plugin code?
    >
    > Does anyone else see the fact that we have two different sections of
    > code in development for the same feature as confusing, misleading, and
    > frustrating?

    The plugin code is being maintained independently from the other HTML
    importer. There are reasons for doing this, and it has both advantages
    and disadvantages:

    1) Plugin handles HTML, non-well-formed HTML, and XHTML, not just XHTML.
    2) Plugin is theoretically more robust.
    3) Plugin adds a libxml2 dependency. XHTML importer does not.
    4) Other stuff that I'm forgetting.

    So theoretically, the HTML plugin is more robust and can handle more
    kinds of inputs. The two importers scratch different itches for
    different people. The HTML importer plugin doesn't work on Win32 (though
    it probably could be made to), which is a definite drawback.
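
To make the difference in points 1) and 2) concrete, here is a minimal sketch of strict versus lenient parsing. It is not the AbiWord code and does not use libxml2; it assumes Python's standard library as a stand-in, with `xml.etree.ElementTree` playing the role of a well-formed-only (XHTML-style) importer and `html.parser` playing the role of a forgiving HTML importer. The names `TagCollector`, `strict_parse_ok`, and `lenient_tags` are made up for illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Hypothetical helper: records every start tag the lenient parser sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def strict_parse_ok(markup):
    """Return True only if the markup is well-formed XML (the XHTML case)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


def lenient_tags(markup):
    """Return the start tags found, without requiring well-formed input."""
    parser = TagCollector()
    parser.feed(markup)
    parser.close()
    return parser.tags


# Well-formed XHTML: every element is closed.
XHTML = "<html><body><p>Hello<br/>world</p></body></html>"
# Typical non-well-formed HTML: unclosed <br> and <p>.
TAG_SOUP = "<html><body><p>Hello<br>world<p>unclosed</body></html>"

if __name__ == "__main__":
    for name, doc in (("XHTML", XHTML), ("tag soup", TAG_SOUP)):
        print(name, "strict ok:", strict_parse_ok(doc), "tags:", lenient_tags(doc))
```

The strict path rejects the second document outright, while the lenient path still reports its tags. That is roughly the trade-off described above: the forgiving parser accepts more kinds of input, at the cost of an extra dependency in the real plugin.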

    I see these as distinct features, not as the same one. It just so
    happens that XHTML is derived from HTML, so there could theoretically be
    some overlap in code. In practice there isn't much.

    Honestly, no one has worked on the XHTML importer in quite some time,
    and it's showing its age. If Hub or whoever would like to get it
    working to fix bug 1406, so be it. It's their time, not mine.

    Hopefully after 1.0 we can put our full concentration into the HTML
    importer plugin and making that work on a large number of documents and
    platforms, but that is by no means a guarantee.

    Ideally, the XHTML importer will be deprecated and scrapped post-1.0
    when we move most of the import/export architecture over to using
    plugins. But for now, it stays.

    We're not Mozilla. A real HTML importer is a *ton* of work if we're to
    do even a passable job. For now I'm willing to accept a
    less-than-perfect implementation, especially considering the load of
    crap that is the HTML standard and the larger load of crap that
    represents all of the malformed documents on the internet.

    Dom
