> > Headers: (Everything within the <HEAD> tag)
> >
> > Basically, I'll leave the headers alone. I might save them in memory for an
> > HTML export, but other than that, they're not needed for AbiWord.
>
> Wrong. When AbiWord gets a little I18N support, you'll need to look for
>
> <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-2">
>
> if the document was written in Latin 2. That's the only way to declare the
> code page within the document. It would also be nice to look for the LANG
> attribute(s), such as
>
> <HTML LANG=en>
Thanks for pointing this out. Before I step off the edge, I'll read the
HTML specs again and reassess what we need from the headers.
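To make sure I've understood the suggestion, here is a rough sketch of the lookup in Python (not AbiWord code, just an illustration). The names HeaderScanner and scan_headers are made up for this example, and it assumes the document has already been read as text that is ASCII-compatible enough to find the tags; a real importer would have to work from raw bytes.

```python
from html.parser import HTMLParser


class HeaderScanner(HTMLParser):
    """Hypothetical helper: records the declared charset and language."""

    def __init__(self):
        super().__init__()
        self.charset = None  # e.g. "iso-8859-2"
        self.lang = None     # e.g. "en"

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling us.
        attrs = dict(attrs)
        if tag == "html" and attrs.get("lang"):
            self.lang = attrs["lang"]
        elif tag == "meta" and (attrs.get("http-equiv") or "").lower() == "content-type":
            content = attrs.get("content") or ""
            # CONTENT looks like "text/html; charset=iso-8859-2";
            # skip the media type and look at the parameters after it.
            for part in content.split(";")[1:]:
                name, _, value = part.strip().partition("=")
                if name.strip().lower() == "charset" and value:
                    self.charset = value.strip().strip("\"'").lower()


def scan_headers(html_text):
    """Return (charset, lang); either may be None if not declared."""
    scanner = HeaderScanner()
    scanner.feed(html_text)
    return scanner.charset, scanner.lang
```

Feeding it the two tags quoted above should give back the Latin 2 charset name and the "en" language code; anything not declared comes back as None, so the importer can fall back to a default.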
> > Comments?
>
> Good luck. You're gonna need it.
Thanks ;-)
Seriously, though, I'm not going into this trying to guess what the user
intended, unless doing so just fits in. If the HTML is bad, don't expect a
fuzzy logic mechanism that works out what the user meant. That's not my
style ;-)
-- Michael Samuel <michael@surfnetcity.com.au>