Subject: Re: bogus document
From: WJCarpenter (bill-abisource@carpenter.ORG)
Date: Mon May 14 2001 - 14:14:19 CDT
[[Copying to the dev list, where such details are probably of more
interest. Respondents should drop the user list off replies to
this.]]
sam> The most likely funny character is the ampersand, which we don't
sam> properly escape in some places.
Nothing as simple as that. The problem is in the 3-byte sequences for the
opening and closing smart quotes. The bad file probably contains something
that isn't legit in UTF-8.
Here is an isolated dump of the bad and good cases:
:; od -xac bad.xml
0000000 3fe2 6d3f 6e65 7375 3fe2 0a3f
b ? ? m e n u s b ? ? nl
342 ? ? m e n u s 342 ? ? \n
:; od -xac good.xml
0000000 80e2 6d9c 6e65 7375 80e2 0a9d
b nul fs m e n u s b nul gs nl
342 200 234 m e n u s 342 200 235 \n
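For anyone who wants to check the two cases without squinting at od output, here is a rough Python sketch. The byte strings are transcribed by hand from the dumps above (od -x prints little-endian 16-bit words, so each pair is swapped back), and the helper name try_utf8 is made up for illustration. It only tests whether each sequence decodes as UTF-8; it says nothing about where in the import or export path the bytes get mangled.

```python
# Bytes transcribed from the od dumps above (each hex word un-swapped).
bad = bytes.fromhex("e23f3f6d656e7573e23f3f0a")
good = bytes.fromhex("e2809c6d656e7573e2809d0a")


def try_utf8(data):
    """Return the decoded text, or None if data is not valid UTF-8."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None


good_text = try_utf8(good)  # decodes: U+201C, "menus", U+201D, newline
bad_text = try_utf8(bad)    # None: 0xE2 must be followed by continuation bytes

# Observation only, not a diagnosis: swapping every continuation byte
# (0x80..0xBF) in the good case for '?' (0x3F) reproduces the bad case.
mangled = bytes(0x3F if 0x80 <= b <= 0xBF else b for b in good)

print(repr(good_text), bad_text, mangled == bad)
```

If that holds up, the good file is ordinary UTF-8 for the curly quotes U+201C and U+201D, and the bad file looks like the same text after something replaced each non-ASCII continuation byte with a question mark.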
-- bill@carpenter.ORG (WJCarpenter) PGP 0x91865119 38 95 1B 69 C9 C6 3D 25 73 46 32 04 69 D6 ED F3