Re: AbiWord DTD


Subject: Re: AbiWord DTD
From: Emile Snyder (emile@reed.edu)
Date: Wed Feb 23 2000 - 15:13:09 CST


I am not an active Abiword hacker, so I'm not sure how much weight my
opinion ought to carry, but for what it's worth, here it is :)

While I fully agree that putting out a doc that says "this is the file
format" while that format is in flux is pretty confusing, I think that the
longer term goal is a Good Thing.

Once the tree stabilizes, and a 1.0.x series shows up, I don't think that
the right answer for someone asking "what's the file format" is "go read
the source." Open standards are very good, and having file formats which
are (correctly) documented is great. People wanting to write scripts
which operate on abiword files, people writing other apps which output
abiword documents, etc. will all love you :) Imagine, say, an OCR program
which wants to give you the option of trying to preserve some formatting
by outputting abiword format.
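
To make the scripting case concrete, here is a minimal sketch in Python of
the kind of small tool I have in mind. The element names (doc, section, p)
are invented for illustration and are NOT taken from the real abiword
format; without a published description, any such script has to guess at
them or dig through the source.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document. The tag names below are assumptions made
# for this example only; they do not describe the actual AbiWord format.
SAMPLE = """<doc>
  <section>
    <p>First paragraph.</p>
    <p>Second paragraph.</p>
  </section>
</doc>"""


def paragraph_texts(xml_text):
    """Return the text of every <p> element, in document order."""
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree, so nesting depth does not matter here.
    return [p.text or "" for p in root.iter("p")]


if __name__ == "__main__":
    for text in paragraph_texts(SAMPLE):
        print(text)
```

A script like this only stays correct if the tag names it relies on are
written down somewhere and kept accurate.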

While the "real" abiword format will clearly always be whatever actually
works in the program, it seems to me that a high priority goal should be
making the program conform to some abstract file format specification, i.e.
a DTD, rather than vice versa. Abiword clearly won't exist in a vacuum,
and it seems to me that embracing this fact rather than fighting it will
make things happier all around.

Please don't take this to imply that I think there is any onus right now
to have this format in hand, just that it would be something I would
dearly love to see, as a user, with the first stable release.

Thanks to all the abiword hackers for all their great work. I anxiously
await new releases :)

-emile

On Tue, 22 Feb 2000, Paul Rohr wrote:

> Yikes, it's a DTD! Guess that's what I get for taking the holiday weekend
> off, huh?
>
> To be honest, I have *very* mixed emotions about seeing this. DTDs just
> look so darned official, you know? It makes people think they know exactly
> what does and doesn't belong in our file format. But they're wrong.
>
> As one of the designated file format enforcers, I've gleefully taken
> advantage of the fact that nobody thinks they understand our file format
> well enough to write a DTD. You *have* to look at the source of our product
> to know what the file format currently is.
>
> IMHO, this lack of documentation is a Good Thing.
>
> A. No documentation is better than wrong documentation.
> --------------------------------------------------------
> This may sound like heresy, but don't get me wrong. Documentation is good.
> I *like* documentation -- provided it's accurate.
>
> However, there have been far too many times in my life that I've gotten
> burned by misleading or outdated documentation, particularly when I didn't
> *know* how flawed it was. I'd much rather do without documentation than get
> confused by something that's inaccurate.
>
> B. The file format is currently in flux.
> -----------------------------------------
> It's been quite a while now since there were any changes to the file format,
> but that's about to change, in a big way.
>
> For example, Luke has submitted a patch with file format changes to make
> lists (mostly) work, but he hasn't yet gotten any serious feedback from
> anyone else. (I don't know whether anyone else plans to comment on it, but
> it's definitely at or near the top of my list.)
>
> Likewise, we know that empty field tags are going to be replaced with a
> barrage of new container-style markup. Here again, there's a proposal on
> the table which hasn't gotten any serious feedback yet.
>
> Even worse, it's pretty clear to me that these two proposals will interact
> in ways which probably affect the file format, too.
>
> C. Which version of the file format are we documenting here?
> -------------------------------------------------------------
> It looks like this DTD probably includes Luke's proposed changes, but omits
> a number of other features of the file format which do exist.
>
> D. Any DTD which doesn't get used is probably wrong.
> -----------------------------------------------------
> The main use for DTDs is to determine whether an XML/SGML document is
> "valid" or not. However, we do not and will not ever be using DTDs in this
> way for the AbiWord file format. A valid AbiWord document is one that we
> read. Period.
>
> If there's ever a conflict between a DTD and our behavior, then the DTD is
> wrong. That's not a characteristic people usually expect of DTDs. They
> want it to be the other way around, but it's not.
>
> bottom line
> -----------
> If we never publish a DTD for our file format, I wouldn't shed any tears.
> It's not needed for XML compliance, and I think having one sets false
> expectations among hard-core XML bigots.
>
> However, I will grudgingly admit that a DTD would make a nice formal piece
> of read-only documentation which helps describe (in a more formal sense) how
> our file format works. In that vein, if people really really want to have
> one, I won't get in the way, provided that:
>
> - we wait until the file format *is* stable,
> - we have someone committed to maintaining its accuracy, AND
> - we clearly state that the DTD is descriptive, *not* normative
>
> We are not at that point now, so my assertion is that publishing the DTD now
> does far more harm than good.
>
> Paul,
> designated curmudgeon
>
>
>

-------------------------------------------------------------------
ESR: I want to live in a world where software doesn't suck.
RMS: Any software that isn't free sucks.
Linus: I'm interested in free beer.
          - As reported by Elizabeth O. Coolbaugh of LWN
            from LinuxWorld Conference and Expo
-------------------------------------------------------------------


