Re: XML_Char


Subject: Re: XML_Char
From: Mike Nordell (tamlin@algonet.se)
Date: Mon Dec 18 2000 - 04:51:42 CST


Vlad Harchev wrote:
> > Nope. Perhaps I didn't express myself clear enough, but wrong? Nope. :-)
>
> As usual :)

Is there any other way? :-O ;-)

> > Speaking of compilers, now you're wrong. You can *never* express an UTF-8
> > string in a char[], you need an unsigned char[], that is inherently
> > incompatible with a C/C++ string literal.
>
> I doubt that C++ spec guarantees such problems - it seems it declares this
> aspect as implementation-dependent.

That would be the day, when the C++ spec guaranteed *problems* not promises.
;-)

AFAIK, it doesn't explicitly mention using negative char values in string
literals since that is by itself an "implementation specific" thing. Sure, I
can create a string literal containing "char"-negative values, e.g.
"foo\xa2" "bar" (split in two so that the \x escape doesn't swallow the
"ba" as further hex digits), which would contain an embedded 162 value that
would be usable on at least 93% of current platforms *if "cast" to an
unsigned char*, but would it be (conceptually) reasonable to assign this
literal to a char*? No, since the usual implementation would evaluate its
fourth element to the value -94 (that is, 162 - 256).
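
A minimal sketch of that fourth-element claim, assuming 8-bit bytes. The
helper names are made up for illustration only. The literal is written as
two adjacent pieces because a \x escape consumes every hex digit that
follows it, 'b' and 'a' included.

```cpp
#include <limits>

// Hypothetical helpers, named for this sketch only.

// The literal under discussion, split so that \xa2 stays a single byte.
inline const char* sample() { return "foo\xa2" "bar"; }

// Fourth element read through plain char: negative where char is signed.
inline int fourth_as_char() { return sample()[3]; }

// The same byte read as unsigned char: 162 when bytes are 8 bits wide.
inline int fourth_as_unsigned() {
    return static_cast<unsigned char>(sample()[3]);
}

// Whether plain char is signed is implementation-defined.
inline bool plain_char_is_signed() {
    return std::numeric_limits<char>::is_signed;
}
```

On a platform where plain char is signed, fourth_as_char() is expected to
yield -94, while fourth_as_unsigned() yields 162 either way.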

> At least I didn't have problems using chars
> with value > 128 in arrays declared as char[] and never had problems with them
> with any compilers I had. So it's safe for UTF8 too. May be just compiler's
> authors are smart.

It's not the act of pointing a char* at memory that contains char-negative
values that is the problem, it's reading from that memory using the char
data type that is.
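
To make the "reading" part concrete, here is a hedged sketch (hypothetical
helper names, well-formed UTF-8 assumed, no validation) that keeps the data
in plain char storage but converts each byte to unsigned char before
looking at its value:

```cpp
#include <cstddef>

// Hypothetical helper: true for any byte outside 7-bit ASCII.
// Without the cast, `c >= 0x80` is never true where plain char is signed,
// because such bytes read as negative values.
inline bool is_non_ascii(char c) {
    return static_cast<unsigned char>(c) >= 0x80;
}

// Hypothetical helper: true for a UTF-8 continuation byte (bit pattern 10xxxxxx).
inline bool is_continuation(char c) {
    return (static_cast<unsigned char>(c) & 0xC0u) == 0x80u;
}

// Counts code points in a NUL-terminated UTF-8 string by counting every
// byte that is not a continuation byte. Assumes well-formed input.
inline std::size_t utf8_length(const char* s) {
    std::size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (!is_continuation(*s)) ++n;
    }
    return n;
}
```

The pointer type stays char* throughout; only the point where a byte value
is inspected goes through unsigned char.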

[...]
> > I was speaking from a
> > conceptually POV. I care, and I'd imagine that all maintainers-to-be also
> > cares. It just happened to coincide with the fact that the compiler *can't*
> > implicitly (without emitting diagnostic(s)) convert a C++ string literal
> > into an "unsigned char*".
>
> Yes, I agree with this. But most compilers are rather smart to allow
> disabling particular warnings, so it doesn't hurt that much.

"Most compilers" and "disable ... warnings" don't really cut it when we're
talking about XP code. Either we comply with C++ or we don't. In this case
you suggest we don't, and I strongly disagree.

> > But back to the issue that started this thread. Why do we even use functions
> > that only allow XML_Char* as input when we mostly give them C++ string
> > literals (which are const char[])? Not only is all this casting bad, ugly,
> > wrong and code-bloat. It's conceptually wrong.
>
> May be we should provide an overloaded wrappers that will just cast their
> args to XML_Char* and call original functions?

Yes. I think this is the cleanest (i.e. most non-intrusive) suggestion to
date. Inline wrappers that do the casting.
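
A rough sketch of what such a wrapper could look like. Everything here is
an assumption for illustration: the typedef of XML_Char as unsigned char,
and the function name xml_strlen, which merely stands in for any existing
function that only accepts XML_Char*.

```cpp
#include <cstddef>

// ASSUMPTION for this sketch: XML_Char is an unsigned 8-bit type.
// Check the real typedef in the tree before relying on this.
typedef unsigned char XML_Char;

// Hypothetical stand-in for an existing function taking only XML_Char*.
inline std::size_t xml_strlen(const XML_Char* s) {
    std::size_t n = 0;
    while (s[n] != 0) ++n;
    return n;
}

// Overloaded inline wrapper: accepts what a string literal decays to,
// does the cast in exactly one place, and forwards to the original.
inline std::size_t xml_strlen(const char* s) {
    return xml_strlen(reinterpret_cast<const XML_Char*>(s));
}
```

With both overloads visible, a call such as xml_strlen("foo") picks the
const char* version, so the call sites need no casts at all.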

Btw, I have to admit once again that I made a small error. OK, I didn't look
at the standard while writing that particular part.

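Strictly speaking, only C gives a string literal the type "char[]" rather
than "const char[]", which is why "foo"[2] = '\0'; compiles there. C++ makes
the literal "const char[]", but as an (unfortunate) C inheritance it still
converts silently to a plain "char*". :-(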

And while at bad code, did you know the following was legal?

int index = 2;
char* pFoo = "foo";
index[pFoo] = '\0';

Try it, you might be surprised it compiles. :-)
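
The reason it compiles is that a[b] is defined as *(a + b), and the
addition does not care which operand is the pointer. A small sketch with a
hypothetical helper name; it writes into a local array copy rather than
through a pointer to the literal, since modifying a string literal is
undefined behavior:

```cpp
#include <string>

// Hypothetical helper for this sketch: truncates a copy of "foo" at
// `index` using the reversed subscript form.
inline std::string truncate_foo(int index) {
    char foo[] = "foo";   // writable array copy, not the literal itself
    char* pFoo = foo;
    index[pFoo] = '\0';   // same as pFoo[index]: both mean *(pFoo + index)
    return std::string(pFoo);
}
```

Calling truncate_foo(2) leaves "fo", exactly as pFoo[2] = '\0' would.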

/Mike - please don't cc


