RE: .abw import or export bug


Subject: RE: .abw import or export bug
From: Henrik Berg (henrik@lansen.se)
Date: Tue Feb 15 2000 - 09:38:43 CST


> Basically, what happens is that someone imports a document from Word, then
> saves it in .abw format. One of the characters in the document is in the
> high-byte range. AbiWord can now correctly import this from MSWord, but
> when the document is saved as .abw, it cannot be reopened. This is very
> disconcerting. On a personal note, it is causing me headaches, because
> people are mailing me wondering what is going on, but that's another story.
> I think the problem was probably around for a while, but not visible. The
> recent work by Justin to import multi-byte chars from Word has outpaced
> AbiWord's own import/export capability.
>
> I logged it as #762. I have included a small sample document. Is anyone
> up for looking into this?

It's very bad if people have files they can't open. This patch deals with that and nothing else: it only makes the importer tolerate the broken files. The real problem, on the export side, is still there.

I'm not committing this, since I'm leaving tomorrow and can't save the day if I get something wrong.

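Most chars in the range ASCII 0 to 31 (everything except tab, LF and CR) are illegal in XML, and the only way to do this right is to be 100% sure that none of them are ever exported. Not even as numeric character references (the &#...; form), since those are expanded before the XML validity check.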
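
For whoever picks up the export side, here is a rough sketch of the kind of filter I mean. It is not taken from the AbiWord tree: the function names are made up for illustration, and it assumes the exporter has the text available as a byte string before writing it out. It simply drops the forbidden control chars; whether to drop or replace them is a separate decision.

```cpp
#include <string>

// Hypothetical helper, not part of the AbiWord source.
// True for control chars that XML 1.0 forbids: everything below 0x20
// except tab (0x09), LF (0x0a) and CR (0x0d).
static bool isIllegalXmlControl(unsigned char c)
{
    return c < 0x20 && c != 0x09 && c != 0x0a && c != 0x0d;
}

// Hypothetical export-side filter: removes forbidden control chars
// so they never reach the .abw file. High-byte chars (0x80 and up)
// are passed through untouched.
static std::string stripIllegalXmlControls(const std::string& in)
{
    std::string out;
    out.reserve(in.size());
    for (std::string::size_type i = 0; i < in.size(); ++i)
    {
        unsigned char c = static_cast<unsigned char>(in[i]);
        if (!isIllegalXmlControl(c))
            out.push_back(in[i]);
    }
    return out;
}
```

Working on unsigned char avoids any surprises with signed char comparisons for bytes in the high range.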

diff -u -r -N -x CVS -x WIN32_20.1_i386_DBG -x WIN32_20.1_i386_OBJ --minimal abi.org/src/wp/impexp/xp/ie_imp_AbiWord_1.cpp abi/src/wp/impexp/xp/ie_imp_AbiWord_1.cpp
--- abi.org/src/wp/impexp/xp/ie_imp_AbiWord_1.cpp Thu Jan 27 01:22:46 2000
+++ abi/src/wp/impexp/xp/ie_imp_AbiWord_1.cpp Mon Feb 14 16:06:18 2000
@@ -97,6 +97,15 @@
   size_t len = _readBytes(buf, sizeof(buf));
   done = (len < sizeof(buf));
 
+#if 1
+  // TODO - remove this when it is no longer needed. In ver 0.7.7 and earlier, AbiWord's exporter
+  // wrote chars below 0x20. Most of these are invalid XML and can't be imported.
+  // See bug #762.
+  for( size_t n1 = 0; n1 < len; n1++ )
+    if( buf[n1] >= 0x00 && buf[n1] < 0x20 && buf[n1] != 0x09 && buf[n1] != 0x0a && buf[n1] != 0x0d )
+      buf[n1] = 0x0d;
+#endif
+
   if (!XML_Parse(parser, buf, len, done))
   {
    UT_DEBUGMSG(("%s at line %d\n",



