Re: bug 629


Subject: Re: bug 629
From: Paul Rohr (paul@abisource.com)
Date: Tue Jan 25 2000 - 16:01:03 CST


At 12:32 PM 1/25/00 -0600, sam th wrote:
>This bug talks about the fact that we export forced line breaks to html as
><p> rather than as <br> which is the html 4.01 spec for forced line
>breaks. However, IMHO we should not attempt to change this behavior. The
>html spec forbids the existence of </br> tags, and abiword, when exporting
>html, adds a close tag for each open one. I feel that this is behavior
>that we want to preserve (it makes sure that the HTML is valid XML, for
>one thing). Thus, we should not attempt to change from <p> to <br>.

Let me make sure I understand what you're proposing. When our HTML exporter
gets passed a forced line break, it could theoretically emit any of the
following strings:

  #  string      rationale
  -  ------      ---------
  1  </p><p>     what we have now
  2  <br></br>   gibberish
  3  <br>        what HTML expects
  4  <br/>       the XML equivalent of what HTML expects
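
As a rough illustration of the XML side of this comparison (a sketch only, not
AbiWord code), the four strings can be dropped between two lines of a paragraph
and handed to an XML parser. The names CANDIDATES, wrap and is_well_formed are
made up for this example, and Python's standard xml.etree.ElementTree stands in
for "any XML parser". It says nothing about how a given Web browser behaves.

```python
import xml.etree.ElementTree as ET

# The four candidate strings from the table, keyed by option number.
CANDIDATES = {
    1: "</p><p>",
    2: "<br></br>",
    3: "<br>",
    4: "<br/>",
}


def wrap(line_break):
    """Place the candidate between two lines of text inside one paragraph."""
    return "<body><p>first line" + line_break + "second line</p></body>"


def is_well_formed(fragment):
    """Return True if the fragment parses as well-formed XML."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


for number, text in CANDIDATES.items():
    print(number, text, is_well_formed(wrap(text)))
```

Under these assumptions only the bare <br> fails the check, because the parser
reaches </p> while <br> is still open.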

To be honest, I don't like *either* of the first two options. Option #3
seems like the safest choice -- it's *exactly* what all web browsers
understand as HTML, so we're guaranteed that everyone will be able to
interpret it properly. After all, a gazillion existing documents can't be
wrong, can they? :-)

Sure, it'd be nice to have XML-friendly HTML, but at what cost? There are a
*lot* of unclosed tags in HTML, and remapping them to other tags (like option
#1) not only produces misleading HTML, it won't always work. For example,
what other tag would you map IMG to?
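
One way to picture the alternative to remapping is an exporter that closes
every tag except the ones HTML 4.01 declares EMPTY (end tag forbidden). The
sketch below is hypothetical and is not how AbiWord's exporter is written; the
names EMPTY_ELEMENTS, open_tag and close_tag are invented for illustration.

```python
from html import escape

# Elements that HTML 4.01 declares EMPTY, so an end tag is forbidden.
EMPTY_ELEMENTS = frozenset({
    "area", "base", "basefont", "br", "col", "frame", "hr",
    "img", "input", "isindex", "link", "meta", "param",
})


def open_tag(name, attrs=None):
    """Return the start tag, with attribute values escaped and quoted."""
    parts = [name.lower()]
    for key, value in (attrs or {}).items():
        parts.append('%s="%s"' % (key, escape(value, quote=True)))
    return "<" + " ".join(parts) + ">"


def close_tag(name):
    """Return the end tag, or an empty string for EMPTY elements."""
    if name.lower() in EMPTY_ELEMENTS:
        return ""
    return "</" + name.lower() + ">"


print(open_tag("img", {"src": "logo.png"}) + close_tag("img"))
print(open_tag("p") + "text" + close_tag("p"))
```

With a lookup like this, IMG and BR are emitted as plain HTML without inventing
a substitute tag, at the price of output that is not well-formed XML.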

Making our HTML output XML-friendly is only an option if we don't break
compatibility with existing Web browsers to do so. Of the XML-friendly
equivalents, #2 would probably happen to work in many existing browsers
insofar as they ignore bogus markup, but it's so egregiously ugly that I'd
hate to do it.

Does anyone know which browsers would barf on the XML-friendly option #4?
Since a lot of the early browser parsers still in use were hand-crafted in
isolation long before XML existed, I doubt they all behave similarly in this
screw case.

In short, being XML-friendly in our HTML output probably isn't worth doing
unless it's *very* easy. Option #3 still sounds just fine to me.

Paul


