SHAZAM - POW - Beginning the Binary Word Exporter (Part 1a) (again)


Subject: SHAZAM - POW - Beginning the Binary Word Exporter (Part 1a) (again)
From: James Montgomerie (jamie@montgomerie.net)
Date: Tue Mar 28 2000 - 08:14:07 CST


[I sent this a couple of days ago, with the patches simply attached, not in
a zip. I guess the file size was too large or something, as it didn't get
through]

--

Attached are two patches, one for the abi tree, one for the wv tree, implementing part 1a of the 'Beginning the Binary Word Exporter' POW (see below). It compiles under Linux, and I have no reason to expect that it won't work on the other platforms too.

On the subject of the apparent high-processor-usage hangs when loading some Word documents (I say 'apparent' because, if you leave it for around ten minutes, the import completes and AbiWord returns control to the user with the document loaded): my trusty copy of DDD led me to line 174 of picf.c in wv. I've commented out the offending statement, and all of my Word documents that used to 'apparently' hang AbiWord now load. It was pure [educated] guesswork, though, so maybe someone with more knowledge of the Word file format could take a look at it? It seems to have something to do with drastically over-estimating the size of images we have to skip, but the files causing the hang don't always have images in them, so I suspect it's something deeper. Mail me and I'll send you a sample 'hang-causing' Word file.

Jamie. (who's not looking forward to delving into OLE further for Part 1b :-)

> scope
> -----
> Part 1a.
> Abstract wv's current OLE stream reads
>
> This requires no knowledge of the Word format or, for the most part,
> wv functionality. We just want to improve wv's existing file support
> functions to make them a little more versatile.
>
> wv currently uses a set of functions (in wv/support.c) along the
> lines of:
>     U16 read_16ubit(FILE*);
>     U32 read_32ubit(FILE*);
>     U16 dread_16ubit(FILE*, U8**);
>     U32 dread_32ubit(FILE*, U8**);
>     U8 dgetc(FILE*, U8**);
>
> The normal getc(FILE*) function is also used throughout the code.
>
> We should modify the above functions, and all of the existing
> wv code (only that which is reading from OLE streams, of course),
> to make use of a wvStream* in place of a FILE*. wvStream
> will initially be a typedef to FILE (i.e. 'typedef FILE wvStream').
> Don't forget the wvOLE* functions, too, which is where the
> abstraction begins. You get FILE* from the old OLE code, and you'll
> cast them to wvStream* here.
>
> Also, all getc's will be replaced with a new support function.
> "U8 read_8ubit(wvStream*);" seems like a logical choice. While
> we're at it, I'd like to rename dgetc to dread_8ubit, too.
>
> This abstraction lets us replace the OLE back-end transparently
> to the rest of the code, which is the next step.
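
For anyone who hasn't opened the patches, here is a rough sketch of the shape of the abstraction the POW describes. It is not the patched code from the wv tree. The U8/U16/U32 typedefs, the little-endian byte order (Word files are little-endian), and the function bodies are my assumptions for illustration; only the function names and the 'typedef FILE wvStream' idea come from the POW text.

```c
#include <stdio.h>

/* Stand-in integer types for this sketch; assumes an 8-bit char,
 * a 16-bit short and a 32-bit int. */
typedef unsigned char  U8;
typedef unsigned short U16;
typedef unsigned int   U32;

/* Step one of the POW: wvStream starts life as a plain alias for FILE,
 * so existing FILE* values can be passed wherever a wvStream* is wanted. */
typedef FILE wvStream;

/* Proposed replacement for bare getc() calls on OLE streams.
 * Note: on end-of-file getc() returns EOF, which this cast turns into
 * 0xFF; the sketch makes no attempt to report that condition. */
U8 read_8ubit(wvStream *in)
{
    return (U8)getc(in);
}

/* Read a 16-bit unsigned value, low byte first (assumed little-endian). */
U16 read_16ubit(wvStream *in)
{
    U16 lo = read_8ubit(in);
    U16 hi = read_8ubit(in);
    return (U16)((hi << 8) | lo);
}

/* Read a 32-bit unsigned value as two little-endian 16-bit halves. */
U32 read_32ubit(wvStream *in)
{
    U32 lo = read_16ubit(in);
    U32 hi = read_16ubit(in);
    return (hi << 16) | lo;
}
```

Because every multi-byte read funnels through read_8ubit, swapping the OLE back-end later should only mean changing the typedef and that one function, leaving callers untouched. The dread_* variants would presumably be converted in the same way.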



