Re: libxml2 guru(s) please respond (again)

From: Tomas Frydrych <tf_at_o-hand.com>
Date: Thu Jul 13 2006 - 09:56:18 CEST

Just looking at the XML 1.1 spec (at
http://www.w3.org/TR/2004/REC-xml11-20040204/), I think paragraph 2.10,
White Space Handling, is the problem:

'An XML processor MUST always pass all characters in a document that are
not markup through to the application.'

Since attribute values are part of the markup, this statement leaves the
decision what to do about the whitespace to the processor. The spec also
puts restrictions on what can be contained in an attribute, most
notably, xml references must not -- I think the only reliable way to
have whitespace preserved is to have the text as content (between
<></>), not as an attribute.
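
As a rough sketch of what I mean (the <data> element name and the
asAttribute/asContent helpers are made up for illustration, and
escapeXML below is only a stand-in for UT_UTF8String::escapeXML(),
assumed to escape just <, >, " and &):

```cpp
#include <string>

// Stand-in for UT_UTF8String::escapeXML() (an assumption, not the real
// AbiWord code): escapes only <, >, " and &, and leaves TABs untouched.
std::string escapeXML(const std::string &in)
{
    std::string out;
    out.reserve(in.size());
    for (char c : in) {
        switch (c) {
        case '<': out += "&lt;";   break;
        case '>': out += "&gt;";   break;
        case '"': out += "&quot;"; break;
        case '&': out += "&amp;";  break;
        default:  out += c;        break;
        }
    }
    return out;
}

// Current approach: the text travels in the "value" attribute, where the
// receiving parser may hand the TABs back as spaces.
// The <data> element name is hypothetical.
std::string asAttribute(const std::string &text)
{
    return "<data value=\"" + escapeXML(text) + "\"/>";
}

// Suggested approach: the same text travels as element content instead.
std::string asContent(const std::string &text)
{
    return "<data>" + escapeXML(text) + "</data>";
}
```

With this, asContent("a\tb") gives <data>a[TAB]b</data>. The receiving
side would then have to read the element's text (for example with
xmlTextReaderReadString(), if the text reader is kept) rather than call
xmlTextReaderGetAttribute().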

Tomas

Martin Sevior wrote:
> Hi Folks,
> I've found a bug in the AbiCollab code to do with TABs.
>
> Basically libxml2 is converting TABs to spaces in this function.
>
> szValue = (char *)xmlTextReaderGetAttribute(reader, (const xmlChar
> *)"value");
>
> This retrieves escapeXML'd text from the XML parser. The text from the
> originating abiword is created with this code:
>
> UT_UTF8String sText;
> sText.appendUCS4(pChars,iLen);
> sText = sText.escapeXML();
>
> The escapeXML() method just converts <, >, ", and & and leaves TABs
> intact.
>
> However, upon being received the TABs are converted to spaces via:
> szValue = (char *)xmlTextReaderGetAttribute(reader, (const xmlChar
> *)"value");
>
>
> So what should we do to fix this?
>
> Thanks!
>
> Martin
>
>
>
Received on Thu Jul 13 10:00:26 2006
