Subject: Re: Importing HTML
From: Dom Lachowicz (cinamod@hotmail.com)
Date: Tue Apr 24 2001 - 11:24:29 CDT
>Any chance of a partial/lossy import, ignore all unknown tags, dump all
>unmatched tags ...???
We already ignore unknown tags (such as frames or tables). I have *no* idea
how to get expat or libxml2 to not choke on unmatched tags, and I'm not sure
that we would even want to. To do this "correctly" we'd need a much more
complex parsing engine, something like Gecko.
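
To illustrate the difference between the two cases, here is a minimal sketch
using Python's expat binding. It is not our importer code; KNOWN_TAGS and
import_strict are hypothetical names made up for the example. Unknown tags are
simply noted and skipped, while an unmatched tag makes expat raise an error
that the handlers cannot recover from.

```python
import xml.parsers.expat

# Hypothetical set of tags the importer understands (assumption for this sketch).
KNOWN_TAGS = {"html", "body", "p", "b", "i"}

def import_strict(markup):
    """Collect text from well-formed markup, ignoring unknown tags.

    Returns (text, skipped_tag_names). Raises ExpatError on unmatched tags.
    """
    chunks = []
    skipped = []

    def start(name, attrs):
        # An unknown tag is recorded and otherwise ignored; its text is kept.
        if name.lower() not in KNOWN_TAGS:
            skipped.append(name)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.CharacterDataHandler = chunks.append
    # Parse() raises xml.parsers.expat.ExpatError if the tags do not nest.
    parser.Parse(markup, True)
    return "".join(chunks), skipped
```

With this sketch, "<p>Hi <blink>there</blink></p>" imports fine with "blink"
reported as skipped, but "<p>Hi<br></p>" raises ExpatError because the <br>
is never closed.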
>More simply, what I'm trying to suggest is: "Error: this document
>contains invalid HTML. Would you like to import it as plain text with
>line breaks?"
This'd be ok, but you can already import HTML as text (choose "open as
text"). Are you suggesting that we remove the markup tags on import?
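
If stripping the markup is what's wanted, a rough sketch of the fallback could
look like the following. It is only an illustration in Python, not a proposal
for the actual implementation: is_well_formed, strip_markup and
import_or_fallback are hypothetical names, and the choice to turn <p> and <br>
into line breaks is an assumption. The strict check uses expat; the fallback
uses Python's tolerant html.parser, which does not complain about unmatched
tags.

```python
import xml.parsers.expat
from html.parser import HTMLParser

class _TextOnly(HTMLParser):
    """Tolerant pass that keeps character data and drops every tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.lines = [[]]

    def handle_starttag(self, tag, attrs):
        # Assumption: <p> and <br> start a new line; all other tags vanish.
        if tag in ("p", "br"):
            self.lines.append([])

    def handle_data(self, data):
        self.lines[-1].append(data)

def strip_markup(markup):
    """Return the text of `markup` with tags removed, one line per break."""
    parser = _TextOnly()
    parser.feed(markup)
    parser.close()
    lines = ("".join(parts).strip() for parts in parser.lines)
    return "\n".join(line for line in lines if line)

def is_well_formed(markup):
    """True if expat accepts the markup, False if it chokes."""
    try:
        xml.parsers.expat.ParserCreate().Parse(markup, True)
    except xml.parsers.expat.ExpatError:
        return False
    return True

def import_or_fallback(markup):
    """Return ("markup", markup) for valid input, else ("text", stripped)."""
    if is_well_formed(markup):
        return "markup", markup  # hand off to the normal importer
    return "text", strip_markup(markup)
```

A real version would ask the user before falling back, as in the suggested
error dialog, rather than doing it silently.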
>Please :)
>
>or as a temporary measure we could recommend an HTML validator like:
>http://validator.w3.org/
This is a reasonable suggestion.
Dom