Re: mswordview, a word 8(97) converter and abiword

Eric W. Sink (eric@postman.abisource.com)
Thu, 3 Dec 1998 14:54:14 -0600


Excellent! One of our highest priorities is developing top-notch
import of the Word 97 file format. In fact, we had been hoping
to leverage your code to replace our current attempts at an
importer. We just hadn't got around to it yet.

More generally: if you're volunteering to help us get your
Word 97 reader code integrated with AbiWord, you will definitely
have our enthusiastic support. :-)

Our file format is XML-based, but it is unique to AbiWord.
We will provide you with documentation about it.

There are very few tags -- most of the actual meat
of an AbiWord file exists in properties which sit in XML
attributes. Those properties are very much like CSS, but
they are not strictly CSS-compliant.
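
As a rough illustration of what "properties in attributes" could look
like from a converter's side, here is a minimal Python sketch. The
element name `p`, the attribute name `props`, and the property names
are placeholders chosen for this example, not the confirmed AbiWord
vocabulary; the real names would come from the file format
documentation.

```python
from xml.sax.saxutils import escape, quoteattr


def format_props(props):
    """Join a dict of properties into a CSS-like 'name:value; name:value' string."""
    return "; ".join(f"{name}:{value}" for name, value in props.items())


def make_element(tag, props, text):
    """Emit one element whose formatting lives in a single attribute.

    The attribute name 'props' is an assumption made for this sketch.
    """
    attr = quoteattr(format_props(props))  # adds the quotes and escapes the value
    return f"<{tag} props={attr}>{escape(text)}</{tag}>"


# Hypothetical paragraph with two CSS-like properties.
line = make_element("p", {"text-align": "left", "font-weight": "bold"}, "Fish & chips")
print(line)
```

The point of the design is that a converter mostly has to build one
property string per paragraph or character run, rather than emit a
large set of nested formatting tags.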

Although the format is XML-based, it is not intended to be written
by humans.
AbiWord's parser is ultra-strict and unforgiving. Developing
a standalone program which converts Word97 to AbiWordXML
is a good way to keep your development effort sane. However,
doing so will require you to generate files which are
excruciatingly perfect, or AbiWord's XML parser will throw
up in a big way. You might save yourself some effort by just
coding directly to the same API our XML importer uses. We
can help make that approach as painless as possible.
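
If you do go the standalone route, one cheap safeguard is to run the
converter's output through a strict XML parser before handing it to
AbiWord. The sketch below uses Python's standard library for that; the
sample documents and the `doc` and `p` element names are made up for
illustration. It only checks XML well-formedness, so it would not
catch violations of AbiWord's own rules.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return None if xml_text parses as XML, else a short error description."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return str(err)
    return None


# Hypothetical converter output: one good file, one with an unclosed <p>.
good = '<doc><p props="font-weight:bold">hello</p></doc>'
bad = "<doc><p>hello</doc>"

for name, sample in (("good", good), ("bad", bad)):
    problem = check_well_formed(sample)
    print(name, "ok" if problem is None else "rejected: " + problem)
```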

There are a number of word processor features which are not
yet supported by AbiWord. In many of those cases, we have
not yet figured out the relevant XML-based additions to our
file format. Having a good Word importer will be an excellent
impetus for us to get a complete specification for the full
file format, even if the relevant features are not quite done
yet.

We look forward to working with you further. We'll get back
to you ASAP with as much info as we can provide regarding our
file format.

--

> hi, ive got a fairly well working gpled word 97->html converter
> at the moment, and id be greatly interested in converting it
> to xml for abiword. The only catch is that im not too sure on
> the xml front.
>
> Is it a simple matter of getting the list of xml tags that abiword
> uses and spitting them out appropiately or is there some more formal
> mechanism that i should use. Id like my first attempt to be a standalone
> program that outputs xml, and once that works then integrate it more
> closely with the importer mechanism of abiword. If all i need is a list
> of tags, what is the current list of paragraph and character etc formatting
> tags ?
>
> When looking around at the latest tarfile i see that theres a file or
> two called word97 importer that looks like somone began to work from the
> word spec, fwiw you have to extract the ole streams from ole docs first
> and then the microsoft biff specs relate to the file offsets in the
> extracted streams. the laolareplace.old.c and ole dirs of the latest
> mswordview source demo this in operation.
>
> mswordview lives at
> http://www.csn.ul.ie/~caolan/docs/MSWordView.html
> mirror at www.gnu.org is temporarily out of comission
>
> C.
>
> Real Life: Caolan McNamara * Doing: MSc in HCI
> Work: Caolan.McNamara@ul.ie * Phone: +353-61-202699
> URL: http://www.csn.ul.ie/~caolan * Sig: an oblique strategy
> What would your closest friend do?

-- 
Eric W. Sink, Software Craftsman
eric@abisource.com

