fields -- ASSERT(all fields are actually chunks)


Subject: fields -- ASSERT(all fields are actually chunks)
From: Paul Rohr (paul@abisource.com)
Date: Tue Sep 26 2000 - 21:00:37 CDT


In an earlier message, I made the argument that we couldn't implement some
of the Word/RTF fields using a block-scope implementation. I invented the
word "chunk" to describe this latter class of fields:

  http://www.abisource.com/mailinglists/abiword-dev/00/September/0268.html

At the time, I believed that we could skip those "chunks" for now, and just
focus our attention on implementing "simple" editable fields:

  http://www.abisource.com/mailinglists/abiword-dev/00/September/0271.html

Now, after more thinking -- and quite a bit of reverse-engineering -- I've
changed my mind. I'm now convinced that we should go ahead and implement
*all* fields as chunks.

the proposal
------------
In other words, our notion of fields would have the laxest possible content
model. *Anything* can go inside *any* field, including section breaks,
other fields, images, you name it. Fields simply become a bookkeeping
convenience for remembering that an arbitrary chunk of a document which ...

  ... has attributes,
  ... was originally created as generated content using those attributes,
  ... can be regenerated at will (using those or newer attributes), and
  ... is also editable and undoable in all the usual ways.
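
The four properties above can be sketched as a small data structure. This is
only an illustration: FieldChunk, generate, regenerate and make_date are
hypothetical names, not anything in the AbiWord sources, and a plain list
stands in for "any run of document content". Undo is not modelled here.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# Hypothetical stand-in: a chunk's content is any list of document pieces
# (text, breaks, images, even other FieldChunk objects).
Content = List[Any]


@dataclass
class FieldChunk:
    kind: str                                      # e.g. "date" or "toc"
    attrs: Dict[str, str]                          # the field's attributes
    generate: Callable[[Dict[str, str]], Content]  # builds content from attrs
    content: Content = field(default_factory=list)

    def __post_init__(self) -> None:
        # The chunk starts life as generated content using its attributes.
        if not self.content:
            self.content = self.generate(self.attrs)

    def regenerate(self, **new_attrs: str) -> None:
        # Rebuild at will, using the old attributes or newer ones.
        self.attrs.update(new_attrs)
        self.content = self.generate(self.attrs)


def make_date(attrs: Dict[str, str]) -> Content:
    # Placeholder generator; a real one would format the current date.
    return [f"date in format {attrs['format']}"]


stamp = FieldChunk("date", {"format": "long"}, make_date)
stamp.content.append("<paragraph break>")  # an ordinary edit inside the field
stamp.regenerate(format="short")           # regeneration replaces the content
```

Note that the content model puts no restriction on what an edit may add; the
field only remembers how to rebuild itself.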

There are a number of reasons for implementing fields this way:

  1. that's how RTF works
  2. that's what the Word, etc. UI allows
  3. we *do* need this flexibility for TOC, etc.
  4. we only have a single mechanism to maintain
  5. we don't have to worry about file format compatibility later
  6. it's really not *that* hard to implement

reason #1: that's how RTF works
--------------------------------
To be honest, I should have paid more attention to this earlier. For us to
be compatible with other word processors out there, we need excellent
support for as many RTF features as possible.

If our mechanism has the same expressive power as RTF's, that helps ensure
that we won't get stuck with impedance mismatch problems when attempting to
do lossless conversions to/from other file formats.

Sure, it may be possible to write a lot of code to map between some other
way of doing fields and RTF's way, but it's a *lot* easier to do lossless
round trips if we employ a similar mechanism internally.
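
For reference, an RTF field is a group holding an instruction (\fldinst) and
the most recent result (\fldrslt), and the result may carry ordinary document
content such as a \par. A minimal sketch of writing a chunk out in that shape,
assuming the instruction and result are already valid RTF text (the helper
name to_rtf_field is hypothetical, and no escaping is done here):

```python
def to_rtf_field(instruction: str, result_rtf: str) -> str:
    """Wrap an instruction and its generated result in an RTF field group.

    Both arguments are assumed to be RTF-escaped already; this sketch only
    shows the nesting of the groups.
    """
    return r"{\field{\*\fldinst " + instruction + r"}{\fldrslt " + result_rtf + "}}"


# A date field whose result happens to contain a paragraph break.
rtf = to_rtf_field("DATE", r"Tuesday\par September 26, 2000")
print(rtf)
```

If the internal representation keeps the same two parts, attributes plus
arbitrary content, export is little more than this wrapping step.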

reason #2: that's what the Word UI allows
------------------------------------------
Offhand, it seems like there's no good reason to allow people to insert
section breaks inside *any* field. The idea seems so silly that I went
"ick" the first time I realized that "other WPs" allow this.

Still, if there's content out there that takes advantage of this "feature"
(for example, something like a paragraph break inside a date and time stamp
might make a bit more sense), then do we want to run into import/export
hassles when wrestling with that document?

I'd rather take the same attitude as the Samba folks here -- why should a
Word user on the network have any idea that the person they're swapping
documents with is actually using AbiWord to edit those files?

reason #3: we need it for TOC, etc. anyhow
-------------------------------------------
This is probably the biggie. It's a lot easier to format a large chunk of
generated content if you allow yourself to use paragraph-level styles.

While it's possible to fake up a TOC using lots of line breaks and manual
formatting, that's a ton of work, and the end-result is fairly user-hostile.
Even though we still have work to do to add templates and make styles
editable, I'd hate to see us lock ourselves in that box.
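
To make the difference concrete, here is a rough sketch of a TOC generated as
one styled paragraph per entry, rather than one paragraph full of line breaks.
The function name build_toc, the (level, title, page) input and the "TOC n"
style names are all assumptions for illustration only.

```python
from typing import List, Tuple

Heading = Tuple[int, str, int]   # (outline level, title, page number)
Paragraph = Tuple[str, str]      # (paragraph style name, text)


def build_toc(headings: List[Heading]) -> List[Paragraph]:
    """Return one paragraph per heading, styled by its outline level."""
    return [
        (f"TOC {level}", f"{title}\t{page}")
        for level, title, page in headings
    ]


toc = build_toc([(1, "Introduction", 1), (2, "Fields", 3)])
for style, text in toc:
    print(style, "|", text)
```

Indentation, tab leaders and fonts then live in the paragraph styles, so a
user can restyle the whole TOC without touching the generated entries.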

reason #4: only one mechanism to maintain
------------------------------------------
If we're going to go ahead and do chunks eventually, then why compromise on
reasons 1 and 2 for other fields?

reason #5: no future file format worries
-----------------------------------------
By doing the groundwork now to make sure that we've implemented a mechanism
that will work for *all* fields we ever want to implement, we never have to
worry about file format compatibility.

reason #6: implementation isn't that bad
-----------------------------------------
Whoops. This is another assertion I need to prove, but this post is long
enough already and the twins need to be fed. More later.

Paul
motto -- I love being proven wrong


