Re: Doc Auto Save / Fast Save / Threads [Was: Instability]

Paul Rohr (paul@abisource.com)
Thu, 18 Nov 1999 12:44:56 -0800


Continuation of a thread originally begun on the abiword-user list:

http://www.abisource.com/mailinglists/abiword-user/99/November/

At 08:18 AM 11/17/99 -0600, Jeff Hostetler wrote:
>> I purpose a option to automatically save to work to say,
>> ~/.Abisource/word.autosave/...
>
>you probably want to consider just putting it in file in the same directory
>with the original document. this might help it keep from getting lost...

If anyone's seriously considering implementing a feature like this, it's
important to be clear on which design model for autosave we're talking
about. I know of at least two, plus two related features that are
orthogonal to them:

1. Just crash recovery.
------------------------
One approach, which I vaguely remember from old versions of Word on the Mac,
is session-oriented and fairly transitory. Its sole purpose is to help you
if the machine crashes.

In this case, we'd always automatically save a spare copy of the documents
you're currently working on in a single known location. These are blown
away just before the app shuts down normally, and probably as each document
gets successfully saved.

That way, on launch if there's anything in that spot, we'd attempt to
"recover" those files (perhaps warning the user what happened, and letting
them look at the results and decide whether they're worth saving). As long
as you never crash while running the app, this approach never kicks in.
Files only get saved when the user chooses.
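
A minimal sketch of this model, in Python for brevity (the application
itself is not written in Python). The RecoveryStore class, the ".recover"
suffix, and the method names are all hypothetical, and document names are
assumed to be plain file names with no directory part:

```python
import os
import tempfile

SUFFIX = ".recover"  # hypothetical naming convention for spare copies


class RecoveryStore:
    """Hypothetical crash-recovery area: one spare copy per open document."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, doc_name):
        return os.path.join(self.directory, doc_name + SUFFIX)

    def checkpoint(self, doc_name, text):
        # Write to a temporary file and rename it into place, so a crash
        # in the middle of a checkpoint cannot leave a half-written copy.
        tmp = self._path(doc_name) + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            f.write(text)
        os.replace(tmp, self._path(doc_name))

    def discard(self, doc_name):
        # Called after a successful explicit save, or at normal shutdown.
        try:
            os.remove(self._path(doc_name))
        except FileNotFoundError:
            pass

    def leftovers(self):
        # Called at launch: anything still here survived a crash.
        return sorted(
            name[: -len(SUFFIX)]
            for name in os.listdir(self.directory)
            if name.endswith(SUFFIX)
        )


# Usage: a session that "crashes" with one document never explicitly saved.
store = RecoveryStore(tempfile.mkdtemp())
store.checkpoint("letter.abw", "Dear Bob, ...")
store.checkpoint("memo.abw", "Agenda ...")
store.discard("memo.abw")   # memo was saved explicitly, so drop its copy
print(store.leftovers())    # ['letter.abw'] is what the next launch finds
```

Note that the user's real files are never touched here. Only the spare
copies change, which is what keeps this model separate from autosave in
place.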

2. Autosave in place.
----------------------
I believe that this is what Jeff is thinking about. It allows any file to
be autosaved in place, but this feature *must* be enabled on a per-file
basis (not globally and not by default).
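
As a rough illustration of "per-file, off by default", here is a small
Python sketch. The Document class, the autosave_in_place flag, and
autosave_tick() are hypothetical names, and how the flag would be
remembered between sessions is left open:

```python
import os
import tempfile


class Document:
    """Hypothetical in-memory document with a per-file autosave switch."""

    def __init__(self, path, text=""):
        self.path = path
        self.text = text
        self.dirty = False
        self.autosave_in_place = False  # off unless enabled for this file

    def edit(self, new_text):
        self.text = new_text
        self.dirty = True

    def save(self):
        with open(self.path, "w", encoding="utf-8") as f:
            f.write(self.text)
        self.dirty = False


def autosave_tick(open_docs):
    """Run from a timer; overwrite only documents that opted in."""
    saved = []
    for doc in open_docs:
        if doc.autosave_in_place and doc.dirty:
            doc.save()
            saved.append(doc.path)
    return saved


# Usage: two documents are edited, but only one has autosave enabled.
workdir = tempfile.mkdtemp()
notes = Document(os.path.join(workdir, "notes.abw"), "v1")
report = Document(os.path.join(workdir, "report.abw"), "v1")
notes.save()
report.save()
notes.autosave_in_place = True
notes.edit("v2")
report.edit("v2")
saved = autosave_tick([notes, report])  # only notes.abw is overwritten
```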

Allowing certain files to be saved automatically can be a convenience. If
you know that the version you're currently typing is always more useful to
you than the last version saved to disk, then autosave in place does exactly
what you want.

However, other people in other situations work differently. In this other
mode, the changes you're currently making may or may not be more useful to
you than the last version you explicitly saved. You always want to make an
explicit choice when to save a new version, thus deliberately replacing the
last saved version with something that's definitely better.

3. Fast-save.
--------------
This isn't really either of the first two (although something like it can be
used as a technique for implementing either of them).

The file format we currently save to disk is a straightforward linear
description of one state of the document. By design, it's very easy to read
and work with.

In memory, we use a very different representation of the document (piece
tables) that's much more complex, to allow for efficient editing, and
especially for infinite undo/redo. When we save out the document we convert
from this representation to the (much simpler and cleaner) file format. A
very similar conversion is continuously happening in order to show you the
current state of the document, so this process is already tuned to be quite
efficient.
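
For readers who haven't met piece tables, here is a deliberately tiny
Python sketch of the idea. It is not our actual implementation: the
PieceTable class and its method names are hypothetical, it handles plain
text only, and it supports insertion but not deletion or undo.

```python
class PieceTable:
    """Toy piece table: the text is a list of spans into two buffers."""

    def __init__(self, original):
        self.original = original  # never modified after loading
        self.added = ""           # append-only buffer of typed text
        # Each piece is (buffer name, start offset, length).
        self.pieces = [("original", 0, len(original))] if original else []

    def insert(self, pos, text):
        new_piece = ("added", len(self.added), len(text))
        self.added += text
        offset = 0
        for i, (buf, start, length) in enumerate(self.pieces):
            if offset <= pos <= offset + length:
                # Split the piece containing pos and put the new one between.
                k = pos - offset
                parts = [(buf, start, k), new_piece,
                         (buf, start + k, length - k)]
                self.pieces[i:i + 1] = [p for p in parts if p[2] > 0]
                return
            offset += length
        if pos != offset:
            raise IndexError("insert position out of range")
        self.pieces.append(new_piece)  # only reached for an empty table

    def to_text(self):
        # The "save" conversion: walk the pieces and emit one linear string.
        bufs = {"original": self.original, "added": self.added}
        return "".join(bufs[b][s:s + n] for b, s, n in self.pieces)


# Usage: two edits fragment the table, but the saved form stays linear.
table = PieceTable("Hello world")
table.insert(5, ",")
table.insert(0, ">> ")
print(table.to_text())      # >> Hello, world
print(len(table.pieces))    # 4
```

The point of the sketch is the asymmetry: edits only append to a buffer
and shuffle small span records, while to_text() is the one place that
produces the simple linear form.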

In practice, keeping the two separate is a very very Good Thing. (Your
head-in-the-clouds CS prof was right on this one.)

Fast-save violates that rule. Why? Basically, it's an archaic feature from
when Macs and PCs were painfully slow. Users complained that some documents
took "too long" to save.

So, instead of translating between two representations like this, some word
processors hacked out a speedup by just blindly dumping a copy of the entire
messy in-memory representation to disk pretty much as is. (Indiscriminately
copying big chunks of memory is always faster than doing anything even
slightly intelligent with smaller chunks.) It's terribly wasteful of disk
space and can introduce *tons* of compatibility problems. However, at the
time, it gave them a speedup users could notice, so they did it.
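
To make the difference concrete, here is a toy Python comparison. Both the
EditBuffer class and the two save functions are hypothetical, and
pickle.dumps() merely stands in for "dump the in-memory structures as
they are"; no real word processor's fast-save format is being reproduced.

```python
import pickle


class EditBuffer:
    """Toy editing state: an append-only buffer plus the live spans."""

    def __init__(self):
        self.added = ""    # everything ever typed, never shrinks
        self.pieces = []   # (start, length) spans still in the document

    def append(self, text):
        self.pieces.append((len(self.added), len(text)))
        self.added += text

    def drop_last(self):
        # "Deleting" only forgets the span; the typed text stays buffered.
        self.pieces.pop()

    def to_text(self):
        return "".join(self.added[s:s + n] for s, n in self.pieces)


def clean_save(buf):
    # Convert to the linear representation, then write that.
    return buf.to_text().encode("utf-8")


def fast_save(buf):
    # Stand-in for a raw dump: serialize the internal state unchanged.
    return pickle.dumps(vars(buf))


# Usage: type a phrase, delete it, and compare what each save contains.
buf = EditBuffer()
buf.append("Dear Bob, ")
buf.append("this is confidential ")
buf.drop_last()
buf.append("thanks.")

clean = clean_save(buf)
fast = fast_save(buf)
print(clean)               # b'Dear Bob, thanks.'
print(len(fast) > len(clean))
```

In this toy the dump is larger than the clean file, still contains the
deleted phrase, and can only be read back by code that knows the exact
internal layout. Those are small-scale versions of the disk-space and
compatibility problems described above.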

4. Version save.
-----------------
Again, this is yet another class of features. How you implement it depends
on what granularity of changes is interesting to a wide-enough variety of
users.

On one end of the spectrum, it's technically possible to preserve every
change ever made in the entire lifetime of the document at the keystroke
level. That's a *lot* of information to hang on to, and can take a lot of
resources (disk space, memory, and time) just to keep track of it all. This
is exactly what's done during a single session to implement infinite undo,
and Jeff's quite capable of extending that mechanism to go back in time
forever.

However, you'd be surprised how much of that information is useless. When
you step through that kind of data, you'll find that very few people type as
cleanly as they think they do.

Don't believe me? The next time you spend an hour typing without closing
your document, keep hitting undo until it stops. :-) Then remember that
any decent undo implementation has already done a lot to compress that
history and smooth it out for you. Now imagine a document that you've been
working on for a year.
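
A toy Python illustration of how noisy keystroke-level history is. The
BACKSPACE marker and net_text() are hypothetical, and this is far cruder
than a real undo implementation, which groups edits into steps rather
than discarding them:

```python
BACKSPACE = "\b"  # hypothetical marker for a backspace key event


def net_text(keys):
    """Collapse raw key events into the text they actually produced."""
    result = []
    for key in keys:
        if key == BACKSPACE:
            if result:
                result.pop()   # the event undoes one earlier event
        else:
            result.append(key)
    return "".join(result)


# Usage: typing "teh", backing up twice, then finishing the word.
keys = list("teh") + [BACKSPACE, BACKSPACE] + list("he")
print(len(keys))        # 7 raw events
print(net_text(keys))   # the
```

Seven events were recorded to produce three characters, and this is for
a single small typo.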

A more compact alternative would be to just keep track of discrete changes
from version to version. However, this is likely to require a different
mechanism to do it properly. Until someone's ready to try to implement
this, I'll hold off on exploring the design alternatives.
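
Purely to pin down what "discrete changes from version to version" could
mean, here is one possible shape in Python. It uses the standard
library's difflib; the make_delta() and apply_delta() helpers and the
("copy", ...) / ("insert", ...) operation format are hypothetical, and
this is not a proposal for how the feature should actually work.

```python
import difflib


def make_delta(old, new):
    """Describe `new` as copies from `old` plus newly inserted text."""
    ops = []
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))        # reuse old[i1:i2]
        elif j2 > j1:
            ops.append(("insert", new[j1:j2]))  # replaced or inserted text
        # A pure deletion needs no record: old[i1:i2] is just not copied.
    return ops


def apply_delta(old, ops):
    """Rebuild the newer version from the older one and a delta."""
    out = []
    for op in ops:
        if op[0] == "copy":
            out.append(old[op[1]:op[2]])
        else:
            out.append(op[1])
    return "".join(out)


# Usage: keep version 1 in full and version 2 only as a delta.
v1 = "The quick brown fox jumps over the lazy dog."
v2 = "The quick red fox leaps over the lazy dog."
delta = make_delta(v1, v2)
print(apply_delta(v1, delta) == v2)   # True
```

Only the inserted text is stored in full; everything unchanged is a pair
of offsets, which is why this kind of history stays far smaller than a
keystroke log.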

Bottom line
-----------
I haven't thought enough about this to know which design I'd prefer, but I
wanted to make sure we all have a common vocabulary for discussing the
implications of various design choices.

>As for the file format, i wouldn't want to add this to the .abw format.

I agree. Word did this to its file format a long time ago, and the result
is an unholy mess. Let's not repeat that mistake.

Paul


