Topic: Casting to char pointers [was A Universal Swap Function]
Author: Michiel Salters<Michiel.Salters@cmg.nl>
Date: Fri, 26 Oct 2001 17:14:47 GMT
In article <yu84ronb6sc.fsf@mcowan-linux.transmeta.com>, Micah Cowan says...
>
>Michiel Salters<Michiel.Salters@cmg.nl> writes:
>
>> In article <yu8r8rsyjy7.fsf_-_@mcowan-linux.transmeta.com>, Micah Cowan says...
>>
>> >"Bart Kowalski" <me@nospam.com> writes:
>> >(The text Bart quotes is mine)
>> >
>> >> > C makes the additional guarantee that you may cast a pointer to T to
>> >> > pointer to character-type, and examine the bytes that wise (without
>> >> > necessitating any copying). In C++ this guarantee is *not* made, and
>> >> > the equivalent cast is a reinterpret_cast<>, whose mapping is always
>> >> > implementation-defined. IOW, a conforming C++ implementation may bomb
>> >> > if you try to examine the bytes of an object via a char * resulting
>> >> > from a cast; a conforming C implementation may not.
[ SNIP ]
>However, so far I have yet to see a paragraph that describes what a
>cast from arbitrary pointer to char* does, after searching rather
>thoroughly.
>
>Micah
Have you looked at 3.9/4? Since there are sizeof(T) bytes pointed to by
the unsigned char* ptr, and there are sizeof(T) bytes which contribute
to the representation, I think it follows that these bytes must be the bytes
pointed to (since, as noted before, PODs don't have holes, just padding).
There's also 3.9.1/1, which shows that all bits in an (unsigned) char contribute.
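As a rough illustration of the sizeof(T)-byte object representation mentioned above, here is a minimal sketch that copies a POD into an array of unsigned char and back with std::memcpy(). The type Sample and the function round_trip_through_bytes are hypothetical names invented for this example; they do not come from the standard or from the thread.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical POD type used only for illustration.
struct Sample {
    int   id;
    short flags;
};

// Copy the object representation of a Sample into sizeof(Sample)
// unsigned chars, then copy those bytes into a second object.
inline Sample round_trip_through_bytes(const Sample& in) {
    unsigned char bytes[sizeof(Sample)];
    std::memcpy(bytes, &in, sizeof in);

    Sample out;
    std::memcpy(&out, bytes, sizeof out);

    // The second object now has exactly the bytes held in the buffer.
    assert(std::memcmp(&out, bytes, sizeof out) == 0);
    return out;
}
```

This uses only the explicit-copy route that nobody in the thread disputes; it does not settle what a pointer cast to unsigned char* yields.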
Regards,
--
Michiel Salters
Consultant Technical Software Engineering
CMG Trade, Transport & Industry
Michiel.Salters@cmg.nl
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html ]
Author: Micah Cowan <micah@cowanbox.com>
Date: Fri, 26 Oct 2001 21:07:53 GMT
Michiel Salters<Michiel.Salters@cmg.nl> writes:
> >However, so far I have yet to see a paragraph that describes what a
> >cast from arbitrary pointer to char* does, after searching rather
> >thoroughly.
> >
> >Micah
>
> Have you looked at 3.9/4? Since there are sizeof(T) bytes pointed to by
> the unsigned char* ptr, and there are sizeof(T) bytes which contribute
> to the representation, I think it follows that these bytes must be the bytes
> pointed to (since, as noted before, PODs don't have holes, just padding).
>
> There's also 3.9.1/1, which shows that all bits in an (unsigned) char contribute.
Naturally. However, that's not the problem. The problem is that the
effect of reinterpret_cast<>ing to unsigned char* is unspecified, so
how do I get a useable unsigned char* pointing to that initial byte in
the first place?
Micah
---
Author: "Bill Wade" <wrwade@swbell.net>
Date: Sat, 27 Oct 2001 09:31:19 GMT
"Micah Cowan" <micah@cowanbox.com> wrote
> Naturally. However, that's not the problem. The problem is that the
> effect of reinterpret_cast<>ing to unsigned char* is unspecified, so
> how do I get a useable unsigned char* pointing to that initial byte in
> the first place?
Who needs reinterpret_cast?
T t;
void* v = &t; // points to start of t, 4.10/2
unsigned char* c;
memcpy(&c, &v, sizeof(v)); // c has same representation as v (3.9.2/4)
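For readers who want to try the fragment above, here is one way it might be wrapped into something compilable. This is only a sketch: Widget stands in for T and first_byte is a made-up helper name. It relies on the same assumption as the fragment, namely that void* and unsigned char* share a representation (the 3.9.2/4 citation above).

```cpp
#include <cassert>
#include <cstring>

// Hypothetical POD type standing in for the T of the fragment.
struct Widget {
    unsigned char tag;
    int           value;
};

// Obtain an unsigned char* to the start of obj without any cast:
// convert to void* (a standard conversion), then copy the pointer's
// own bytes into an unsigned char*.
inline unsigned char* first_byte(Widget& obj) {
    void* v = &obj;                 // points to the start of obj
    unsigned char* c;
    assert(sizeof c == sizeof v);   // same size is needed for the copy
    std::memcpy(&c, &v, sizeof v);  // c receives v's representation
    return c;
}
```

A caller could then walk sizeof(Widget) bytes starting at the returned pointer.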
---
Author: Micah Cowan <micah@cowanbox.com>
Date: Mon, 29 Oct 2001 21:47:39 GMT
"Bill Wade" <wrwade@swbell.net> writes:
> "Micah Cowan" <micah@cowanbox.com> wrote
>
> > Naturally. However, that's not the problem. The problem is that the
> > effect of reinterpret_cast<>ing to unsigned char* is unspecified, so
> > how do I get a useable unsigned char* pointing to that initial byte in
> > the first place?
>
> Who needs reinterpret_cast?
>
> T t;
> void* v = &t; // points to start of t, 4.10/2
> unsigned char* c;
> memcpy(&c, &v, sizeof(v)); // c has same representation as v (3.9.2/4)
Excellent! Still, I wish that C++ had left C's
cast-to-character-pointers in, since this still seems like a hack -
but at least it's well-defined.
Thanks,
Micah
---
Author: Micah Cowan <micah@cowanbox.com>
Date: Wed, 24 Oct 2001 19:05:28 GMT
"Bart Kowalski" <me@nospam.com> writes:
(The text Bart quotes is mine)
> > C++ does *not* make the same guarantees. The paragraph you referred
> > me to states that POD types can be *copied* to char or unsigned char
> > arrays, which is *exactly* what I said (the only portable way to
> > achieve such a copy is std::memcpy()).
>
> There is no need to start a war over this. I am trying to have an intelligent
> discussion based on certain things mentioned in the standard. See below.
Certainly not my intention to start a *war*. However, what good is
USENET if thoughtful debate doesn't produce a clearer understanding of
the C++ language? When this conversation/subthread is finished, and
the issue has been resolved, one of us will come away with a clearer
and better understanding.
To hasten this end, I have crossposted this to comp.std.c++, to see if
they can shed some further light on the subject.
> > C makes the additional guarantee that you may cast a pointer to T to
> > pointer to character-type, and examine the bytes that wise (without
> > necessitating any copying). In C++ this guarantee is *not* made, and
> > the equivalent cast is a reinterpret_cast<>, whose mapping is always
> > implementation-defined. IOW, a conforming C++ implementation may bomb
> > if you try to examine the bytes of an object via a char * resulting
> > from a cast; a conforming C implementation may not.
> >
> > C++ makes the following guarantees: (1) void * and char/unsigned char *
> > have the same representation; (2) Any pointer-to-object may be cast to
> > any other pointer-to-object and back again without change in value.
> > However, it does not guarantee that the new pointer from (2) is a
> > valid, useable pointer in any other respect.
> >
> > There is no well-defined way to examine an object's bytes without an
> > explicit copy (via std::memcpy()),
>
> Perhaps the wording is badly chosen, or perhaps I didn't look at the right
> place, but I strongly doubt that this is really the intent of the standard for
> several reasons:
>
> - The standard guarantees (1.8.5) that POD types occupy contiguous bytes of
> storage.
>
> - In the paragraph that I mentioned previously there is a note that indicates
> that the intent is that the memory model of C++ is compatible with that of C.
>
> - The offsetof macro can still be used on POD types.
>
> Now, in the above there is no absolute proof of what I'm saying, and I don't
> really want to go through the standard at the moment, but it strongly suggests
> that what you can do with POD types in C++ is the same as what you can do in C.
> Another reason why I would not expect otherwise is that the C++ standards
> committee had a mandate to minimize the incompatibilities with C. Given that
> there are many C programs that do such tricks with casts it would be surprising
> if those tricks were not allowed in C++. Moreover, if such a change was really
> intended one would expect it to be mentioned in Annex C.
>
> If you, or anyone else, could give a precise quote from the standard that
> specifically forbids the previously mentioned use of POD types I would be glad
> to hear it, and would stand corrected. However, so far I am not
> convinced.
Specific disallowance is not necessary - instead, specific *allowance*
is. The Standard, in 5.2.10#7, says that reinterpret_cast<> may be
used to cast a pointer to one type to a pointer to another type, but
that except for the guarantee that it may be cast back without losing
the original value, the result is unspecified. Furthermore, a Note in
paragraph 3 of the same points out that it is entirely up to the
implementation as to whether or not the bit-representation changes.
So, for instance, if a pointer to a class-type and a pointer to char
have different representations/interpretations, it would be legal for
an implementation *not* to change the bit-pattern upon casting the
class-pointer to a char-pointer, since that would make it very easy to
cast back; but obviously any attempt to dereference the resulting
char * would produce undefined behavior.
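To make the distinction concrete, here is a small sketch of the one thing the paragraph above says is guaranteed: casting to char* and back yields the original pointer. Record and round_trip are hypothetical names for this example only. The intermediate char* is deliberately never dereferenced, since that is the use in dispute.

```cpp
#include <cassert>

// Hypothetical class type used only for illustration.
struct Record {
    int a;
    int b;
};

// Cast to char* and back again. Only the round trip is relied on;
// the intermediate pointer is not used for anything else.
inline Record* round_trip(Record* original) {
    char*   as_char = reinterpret_cast<char*>(original);
    Record* back    = reinterpret_cast<Record*>(as_char);
    assert(back == original);  // the round-trip guarantee described above
    return back;
}
```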
There are no exceptions made for pointers to character type, and so
this applies to them as much as to any other pointer, despite the fact
that the actual representation is still a series of bytes (which is
what guarantees the std::memcpy() behavior we were talking about, but
says nothing about pointer casts).
And because there is no Standard Conversion from pointer-types to
pointers-to-character-types, any C-like or function-style cast will
have the same semantics as a reinterpret_cast<>.
Now, I agree that very likely every C++ compiler currently in
existence allows a cast to char* in the way that you or I
might expect in C. However, there is no *guarantee* that C++
compilers will behave this way; therefore, such an action is not
portable.
BTW, the fact that the memory model is intended to be compatible with
C doesn't really have any direct bearing on the interpretation of
this cast. It relates only to our std::memcpy() example.
Also, I doubt that the omission was unintentional, since much of the
wording in 5.2.10 and 4.10 is patterned after similar verbiage in the C
standard. Now, since they were probably *looking* at the C standard
as a basis for their rules, it seems to me they must have made a
*conscious* decision to remove the guarantees for the result of
casting a pointer to pointer-to-char.
Any comments from comp.std.c++?
Micah
---
Author: Valentin.Bonnard@free.fr (Valentin Bonnard)
Date: Wed, 24 Oct 2001 23:07:04 GMT
> Now, since they were probably *looking* at the C standard
> as a basis for their rules, it seems to me they must have made a
> *conscious* decision to remove the guarantees for the result of
   ^^^^^^^^^
> casting a pointer to pointer-to-char.
Of course not; your faith in the committee is too great.
reinterpret_cast is simply broken, and in a much more
practically important way than the theoretical problem
you describe.
-- VB
---
Author: Michiel Salters<Michiel.Salters@cmg.nl>
Date: Thu, 25 Oct 2001 19:10:04 GMT
In article <yu8r8rsyjy7.fsf_-_@mcowan-linux.transmeta.com>, Micah Cowan says...
>"Bart Kowalski" <me@nospam.com> writes:
>(The text Bart quotes is mine)
>
>> > C makes the additional guarantee that you may cast a pointer to T to
>> > pointer to character-type, and examine the bytes that wise (without
>> > necessitating any copying). In C++ this guarantee is *not* made, and
>> > the equivalent cast is a reinterpret_cast<>, whose mapping is always
>> > implementation-defined. IOW, a conforming C++ implementation may bomb
>> > if you try to examine the bytes of an object via a char * resulting
>> > from a cast; a conforming C implementation may not.
>> >
>> > C++ makes the following guarantees: (1) void * and char/unsigned char *
>> > have the same representation; (2) Any pointer-to-object may be cast to
>> > any other pointer-to-object and back again without change in value.
>> > However, it does not guarantee that the new pointer from (2) is a
>> > valid, useable pointer in any other respect.
Not in 5.2, but 9.2 says
"A pointer to a POD-struct object, suitably converted using a reinterpret_cast,
points to its initial member (or if that member is a bit-field, then to the
unit in which it resides) and vice versa. "
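A minimal sketch of what the quoted 9.2 sentence appears to promise, assuming "suitably converted" means a reinterpret_cast to a pointer to the initial member's own type. Header and initial_member are hypothetical names introduced for the example.

```cpp
#include <cassert>

// Hypothetical POD-struct; its initial member is an int.
struct Header {
    int    kind;
    double payload;
};

// Convert a pointer to the struct into a pointer to its initial member.
inline int* initial_member(Header* h) {
    int* first = reinterpret_cast<int*>(h);
    assert(first == &h->kind);  // the result points at the first member
    return first;
}
```

Note that the target type here is int*, matching the first member. The sketch says nothing about converting to char* when the first member is not a character type.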
>> > There is no well-defined way to examine an object's bytes without an
>> > explicit copy (via std::memcpy()),
9.2 seems to differ.
>> Perhaps the wording is badly chosen, or perhaps I didn't look at the right
>> place, but I strongly doubt that this is really the intent of the standard for
>> several reasons:
>>
>> - The standard guarantees (1.8.5) that POD types occupy contiguous bytes of
>> storage.
Sure, because it wouldn't be really useful if you got only a pointer to the
first byte & couldn't find the other bytes.
>> - In the paragraph that I mentioned previously there is a note that indicates
>> that the intent is that the memory model of C++ is compatible with that of C.
>>
>> - The offsetof macro can still be used on POD types.
The C++ committee just spent 3 hours discussing this. We currently think you
should be able to do so, even if you define a global operator&(POD&). But
not all implementers can handle that now. Jens Maurer has been kind enough
to provide an offsetof() implementation that works even if ::operator&
returns an int.
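For reference, ordinary use of offsetof on a POD type looks roughly like the sketch below. Packet and the helper functions are hypothetical names; the example does not overload operator& and is not the Jens Maurer implementation mentioned above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical POD type used only for illustration.
struct Packet {
    char   tag;
    int    length;
    double value;
};

inline std::size_t tag_offset()    { return offsetof(Packet, tag); }
inline std::size_t length_offset() { return offsetof(Packet, length); }
inline std::size_t value_offset()  { return offsetof(Packet, value); }

// The first member sits at offset 0, and later members follow in
// declaration order, possibly separated by padding.
inline void check_packet_layout() {
    assert(tag_offset() == 0);
    assert(tag_offset() < length_offset());
    assert(length_offset() < value_offset());
    assert(value_offset() < sizeof(Packet));
}
```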
>> Now, in the above there is no absolute proof of what I'm saying, and I don't
>> really want to go through the standard at the moment, but it strongly suggests
>> that what you can do with POD types in C++ is the same as what you can do in C.
>> Another reason why I would not expect otherwise is that the C++ standards
>> committee had a mandate to minimize the incompatibilities with C. Given that
>> there are many C programs that do such tricks with casts it would be surprising
>> if those tricks were not allowed in C++. Moreover, if such a change was really
>> intended one would expect it to be mentioned in Annex C.
Yes, we've considered adding to that, but we didn't have any reason to. We
might get some other wording there about volatiles, but not about PODs.
>> If you, or anyone else, could give a precise quote from the standard that
>> specifically forbids the previously mentioned use of POD types I would be glad
>> to hear it, and would stand corrected. However, so far I am not
>> convinced.
>
>Specific disallowance is not necessary - instead, specific *allowance*
>is. The Standard, in 5.2.10#7, says that reinterpret_cast<> may be
>used to cast a pointer to one type to a pointer to another type, but
>that except for the guarantee that it may be cast back without losing
>the original value, the result is unspecified. Furthermore, a Note in
>paragraph 3 of the same points out that it is entirely up to the
>implementation as to whether or not the bit-representation changes.
>So, for instance, if a pointer to a class-type and a pointer to char
>have different representations/interpretations, it would be legal for
>an implementation *not* to change the bit-pattern upon casting the
>class-pointer to a char-pointer, since that would make it very easy to
>cast back; but obviously any attempt to dereference the resulting
>char * would produce undefined behavior.
That breaks 9.2.
>Also, I doubt that the omission was unintentional, since much of the
>wording in 5.2.10 and 4.10 is patterned after similar verbiage in the C
>standard. Now, since they were probably *looking* at the C standard
>as a basis for their rules, it seems to me they must have made a
>*conscious* decision to remove the guarantees for the result of
>casting a pointer to pointer-to-char.
I don't know the history of that paragraph, so I can't judge that.
What I do know is that 5.2 describes which casts are possible, while
other paragraphs describe what they do.
Regards,
Michiel Salters
--
Michiel Salters
Consultant Technical Software Engineering
CMG Trade, Transport & Industry
Michiel.Salters@cmg.nl
---
Author: "James Kuyper Jr." <kuyper@wizard.net>
Date: Fri, 26 Oct 2001 06:12:48 GMT
Michiel Salters wrote:
>
> In article <yu8r8rsyjy7.fsf_-_@mcowan-linux.transmeta.com>, Micah Cowan says...
>
> >"Bart Kowalski" <me@nospam.com> writes:
> >(The text Bart quotes is mine)
> >
> >> > C makes the additional guarantee that you may cast a pointer to T to
> >> > pointer to character-type, and examine the bytes that wise (without
> >> > necessitating any copying). In C++ this guarantee is *not* made, and
> >> > the equivalent cast is a reinterpret_cast<>, whose mapping is always
> >> > implementation-defined. IOW, a conforming C++ implementation may bomb
> >> > if you try to examine the bytes of an object via a char * resulting
> >> > from a cast; a conforming C implementation may not.
> >> >
> >> > C++ makes the following guarantees: (1) void * and char/unsigned char *
> >> > have the same representation; (2) Any pointer-to-object may be cast to
> >> > any other pointer-to-object and back again without change in value.
> >> > However, it does not guarantee that the new pointer from (2) is a
> >> > valid, useable pointer in any other respect.
>
> Not in 5.2, but 9.2 says
> "A pointer to a POD-struct object, suitably converted using a reinterpret_cast,
> points to its initial member (or if that member is a bit-field, then to the
> unit in which it resides) and vice versa. "
>
> >> > There is no well-defined way to examine an object's bytes without an
> >> > explicit copy (via std::memcpy()),
>
> 9.2 seems to differ.
Only if the first member has a char type, or is an array of char.
Otherwise, the same issue applies at a different level: how to convert
from a pointer to that member, into a pointer to a character type.
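A sketch of the narrow case described above, where the first member is itself an array of unsigned char, so that the 9.2 rule lands directly on character storage. Tagged and leading_bytes are hypothetical names, and "suitably converted" is assumed to mean a cast to a pointer to the array type.

```cpp
#include <cassert>

// Hypothetical POD-struct whose initial member is an array of unsigned char.
struct Tagged {
    unsigned char raw[4];
    int           count;
};

// Convert a pointer to the struct into a pointer to its initial member
// (the array), then let the array decay to a pointer to its first byte.
inline unsigned char* leading_bytes(Tagged* t) {
    unsigned char (*arr)[4] = reinterpret_cast<unsigned char (*)[4]>(t);
    assert(*arr == t->raw);  // same address as the first member
    return *arr;
}
```

For a struct whose first member is, say, an int, this approach only yields an int*, and the original question remains open.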
---
Author: Micah Cowan <micah@cowanbox.com>
Date: Fri, 26 Oct 2001 06:13:58 GMT
Michiel Salters<Michiel.Salters@cmg.nl> writes:
> In article <yu8r8rsyjy7.fsf_-_@mcowan-linux.transmeta.com>, Micah Cowan says...
>
> >"Bart Kowalski" <me@nospam.com> writes:
> >(The text Bart quotes is mine)
> >
> >> > C makes the additional guarantee that you may cast a pointer to T to
> >> > pointer to character-type, and examine the bytes that wise (without
> >> > necessitating any copying). In C++ this guarantee is *not* made, and
> >> > the equivalent cast is a reinterpret_cast<>, whose mapping is always
> >> > implementation-defined. IOW, a conforming C++ implementation may bomb
> >> > if you try to examine the bytes of an object via a char * resulting
> >> > from a cast; a conforming C implementation may not.
> >> >
> >> > C++ makes the following guarantees: (1) void * and char/unsigned char *
> >> > have the same representation; (2) Any pointer-to-object may be cast to
> >> > any other pointer-to-object and back again without change in value.
> >> > However, it does not guarantee that the new pointer from (2) is a
> >> > valid, useable pointer in any other respect.
>
> Not in 5.2, but 9.2 says
> "A pointer to a POD-struct object, suitably converted using a
> reinterpret_cast, points to its initial member (or if that member is
> a bit-field, then to the unit in which it resides) and vice versa. "
Okay, so *this* guarantees that the pointer at least points to its
initial member. However, 5.2 still specifically states that the value
is unspecified for any use other than casting back to its original type.
> >> > There is no well-defined way to examine an object's bytes without an
> >> > explicit copy (via std::memcpy()),
>
> 9.2 seems to differ.
I don't see how. 9.2 guarantees that the reinterpreted pointer still
points to the same place; however, it still doesn't say anything to
add or subtract from 5.2's unspecified result assertion.
Additionally, the text in 9.2 is talking about POD-structs only,
whereas we are talking about any data-type (I would like to at least
see non-aggregate PODs addressed). It also doesn't specify what
"suitably converted" means, which I would usually take to mean a
pointer to the correct type for its initial member: however, even if
it is not meant that way, it still doesn't mean that you can cast to
char* and use that in the way you expect.
> >So, for instance, if a pointer to a class-type and a pointer to char
> >have different representations/interpretations, it would be legal for
> >an implementation *not* to change the bit-pattern upon casting the
> >class-pointer to a char-pointer, since that would make it very easy to
> >cast back; but obviously any attempt to dereference the resulting
> >char * would produce undefined behavior.
>
> That breaks 9.2.
Right. Thanks for pointing that out.
> >Also, I doubt that the omission was unintentional, since much of the
> >wording in 5.2.10 and 4.10 is patterned after similar verbiage in the C
> >standard. Now, since they were probably *looking* at the C standard
> >as a basis for their rules, it seems to me they must have made a
> >*conscious* decision to remove the guarantees for the result of
> >casting a pointer to pointer-to-char.
>
> I don't know the history of that paragraph, so I can't judge that.
> What I do know is that 5.2 describes which casts are possible, while
> other paragraphs describe what they do.
However, so far I have yet to see a paragraph that describes what a
cast from arbitrary pointer to char* does, after searching rather
thoroughly.
Micah
---