Re: XHTML


Subject: Re: XHTML
From: Eric W. Sink (eric@sourcegear.com)
Date: Wed Jan 26 2000 - 16:50:42 CST


Basically, I agree with this. I just feel like restating it for the sake of redundancy:

1. XHTML and HTML are distinct formats. For the purpose of
implementation, we should treat them as completely unrelated.

2. An HTML exporter is an important feature now. An XHTML exporter
is a more forward-looking idea. Having both is okay. Replacing our
HTML exporter with XHTML is not okay, since HTML is widely supported
by other vendors and XHTML is not.

3. An XHTML importer is possible, but this seems even less urgent
than XHTML export, for the same reasons.

4. An HTML importer is a Bad Idea. If you want to waste time, go see
a really bad movie instead. :-)
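
To illustrate points 1 and 2, here is a minimal sketch of two exporters living side by side and sharing no code. It is not the project's actual impexp API: the paragraph-list document model, the function names, and the `EXPORTERS` table are all assumptions made up for this example.

```python
from xml.sax.saxutils import escape

# Hypothetical document model: a list of plain-text paragraphs.

def export_html(paragraphs):
    """Old-style HTML: unclosed <p> and a bare <hr> are fine for browsers."""
    body = "\n".join("<p>" + escape(p) for p in paragraphs)
    return "<html><body>\n" + body + "\n<hr>\n</body></html>\n"

def export_xhtml(paragraphs):
    """XHTML: must be well-formed XML, so every element is closed."""
    body = "\n".join("<p>" + escape(p) + "</p>" for p in paragraphs)
    return (
        '<html xmlns="http://www.w3.org/1999/xhtml">'
        "<head><title></title></head><body>\n"
        + body
        + "\n<hr />\n</body></html>\n"
    )

# Hypothetical registry: each format is an independent entry, so either
# exporter can be dropped later without touching the other.
EXPORTERS = {"html": export_html, "xhtml": export_xhtml}

def export(paragraphs, fmt):
    """Dispatch to the exporter registered under the given format name."""
    return EXPORTERS[fmt](paragraphs)
```

Calling `export(doc, "html")` and `export(doc, "xhtml")` on the same document goes through separate code paths, which is the point: keeping both costs little, and neither has to be morphed into the other.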

--

> It looks like you're asking two different questions here:
>
> 1. XHTML (easy)
> -----------------
> Does anyone object to having an awesomely thorough XHTML importer and
> exporter? I can't see why anyone would. XHTML looks like a clean new
> format that people may eventually start supporting elsewhere.
>
> 2. old-style HTML (???)
> ------------------------
> What do we do about old-style HTML? This is the more contentious question.
> There are massive quantities of this content out there in the world, and
> writing tolerant parsers which "properly" handle it is very, very, very,
> very, very hard. Did I mention that it's hard? :-)
>
> <rant>
> In fact, I'd be willing to go out on a limb and assert that reliably
> importing all the HTML misfeatures out there in the world may be *harder*
> than reliably importing Word documents. At least the Word family of file
> formats is deterministic. There are only a handful of discrete binaries
> producing content in that format, so eventually all of its quirks can be
> reverse-engineered. HTML may look simpler, but it's a mess.
> </rant>
>
> This is why we don't have, and may never have, an HTML importer.
>
> However, having an HTML *exporter*, even a dumb one, is very useful. Since
> that format is so prolific, being able to export *our* content in that
> format is important. More specifically, being able to export our content in
> *some* form that those browsers can read is what's important.
>
> bottom line
> -----------
> I think the reason you've been getting so much static here is that you've
> been proposing to morph the existing HTML exporter into an XHTML exporter.
>
> Since our impexp framework is so modular, why not have both for now? If
> you're right about the idea that old-style HTML is no longer relevant,
> because it's been supplanted by XHTML, then we can just drop that code, and
> (perhaps) drop the X from the name of the new code.
>
> Paul

--
Eric W. Sink, Software Craftsman
SourceGear Corporation
eric@sourcegear.com


