Thread

Topic: pointer arithmetic and raw memory

Author: "William M. Miller" <william.m.miller@gmail.com>
Date: Wed, 30 Mar 2005 19:17:54 CST

kuyper@wizard.net wrote:
> Taken literally, this implies that there must be at least one byte
> after every object that the implementation can't use to allocate any
> other object of compatible alignment, because if it were used, a
> past-the-end pointer on the first object would compare equal to a
> pointer to the beginning of the second object.
>
> In practice, implementations routinely allocate objects back-to-back.
> The C committee recognised this fact and added the following words to
> C99: " ... or one is a pointer to one past the end of one array object
> and the other is a pointer to the start of a different array object
> that happens to immediately follow the first array object in the
> address space."
>
> I haven't got a copy of the recent updates to the C++ standard; if this
> issue hasn't already been addressed, I'd recommend making a similar
> change to the C++ standard.

This was dealt with in the resolution to
http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#73,
which was part of TC1.  However, this is somewhat in flux; see
the final note in the discussion of
http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#232.

--
William M. (Mike) Miller | Edison Design Group, Inc.
william.m.miller@gmail.com

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]

Author: johnchx2@yahoo.com
Date: Wed, 23 Mar 2005 13:46:16 CST

Thomas Mang wrote:

> 5.7/5 says pointer arithmetic (other than 0/1) requires an array.
> Nothing in std::vector specifies it is an array

Well, yes and no.  It turns out that 20.4.1.1/5 tells us that the
default allocator provides "a pointer to the initial element of an
array of storage..." so unless you specify a custom allocator, you can
assume that the vector's underlying storage is indeed an array.

More generally, 20.1.5 Table 32 specifies that allocate() is required
to return a random-access iterator, which would imply that it must
return something on which "iterator arithmetic" is well-defined.
Further, the standard containers are allowed to assume that an
allocator's "pointer" type is a T*.  Since the only T*'s for which
"iterator arithmetic" is well defined are pointers into arrays, it
seems fair to conclude that any custom allocator that is suitable for
use with the standard containers must supply pointers into arrays.
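
As a minimal sketch of the point about the default allocator, the following uses std::allocator<int> directly and walks the returned storage with ordinary pointer arithmetic. The function name sum_via_default_allocator is hypothetical, and std::allocator_traits is a later library facility than the standard text cited above; it is used here only for construct/destroy.

```cpp
#include <cstddef>
#include <memory>

// Allocate room for n ints with the default allocator, fill the slots via
// ordinary pointer arithmetic, and return their sum.
int sum_via_default_allocator(std::size_t n)
{
    std::allocator<int> alloc;
    using traits = std::allocator_traits<std::allocator<int>>;

    int* first = alloc.allocate(n);              // storage for n ints
    for (std::size_t i = 0; i < n; ++i)          // first + i is pointer arithmetic
        traits::construct(alloc, first + i, static_cast<int>(i));

    int sum = 0;
    for (int* p = first; p != first + n; ++p)    // step from element to element
        sum += *p;

    for (std::size_t i = 0; i < n; ++i)
        traits::destroy(alloc, first + i);
    alloc.deallocate(first, n);
    return sum;
}
```

Calling sum_via_default_allocator(5) adds up 0 through 4. The sketch relies on exactly the reading given above: that the pointer returned by allocate() points at the initial element of an array.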

---

Author: a9804814delthis@unet.univie.ac.at ("Thomas Mang")
Date: Wed, 23 Mar 2005 22:51:36 GMT

<johnchx2@yahoo.com> schrieb im Newsbeitrag
news:1111553376.870479.187650@z14g2000cwz.googlegroups.com...


Please consider, when evaluating my other reply, that I have since taken
note of the consequences of 3.8/2, which is very important. At the moment,
though, I am not sure how it affects the issue, especially in light of a
self-written allocator. I will think about that again later.


Thomas


---

Author: wade@stoner.com
Date: Thu, 24 Mar 2005 15:36:23 CST

"Thomas Mang" wrote:
>
> Unless I am missing something that makes the current situation defined
> as it was intended, my impression is that pointer arithmetic should not
> be restricted to arrays, but allowed for any contiguous series of
> objects of the same type.
> Your thoughts?
>

I'd prefer to say a contiguous block of currently allocated and
properly aligned memory.  That raises the question: what is a contiguous
block of memory?

I usually answer that question with the following tests.  This isn't
exactly what the standard says, but a compiler writer has to work
really hard to make this stuff wrong.

Without a runtime test, we know that:
 1) Any single-non-reference(snr) auto variable in scope has a
contiguous block of memory.
 2) Any snr global variable always has a contiguous block of memory.
 3) Any snr local static variable has a contiguous block of memory
during its "natural" lifetime (from whenever execution first passes
through the definition, until the "implicit" call to the destructor
completes at program termination).
 4) Any hunk of memory returned by malloc() or new() is contiguous.

Given T1* p1, T2* p2, that point at allocated/aligned memory of at
least the correct size, I think you can make a runtime test:

  void* v1 = (void*)(p1+1);
  void* v2 = (void*)(p2);
  bool contiguous = v1 == v2;

Even though these tests tell you when you can do pointer arithmetic, I
don't think it adequately tells you when you can safely use the
resulting pointers.

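#include <cassert>

void OptimizeMe()
{
  int a = 1;
  int b = 2;
  int* pa = &a;
  if(pa+1 == &b)      // true only if b happens to sit right after a
  {
    pa[1] = 3;        // store through a pointer derived from a, not from b
    assert(b == 3);   // may fail if the compiler assumes b is still 2
  }
}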

I would argue that the standard does not require the assert() to
succeed if reached.  An optimizing compiler can see that there is never
an assignment to b, or through a pointer derived from b, so the value
of b must still be 2.
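
The runtime adjacency test from earlier in this post can be wrapped in a small helper. This is only a sketch: the names immediately_follows and array_neighbours_are_adjacent are hypothetical, and the demonstration sticks to two elements of one array, where the comparison is uncontroversial.

```cpp
// Returns true when the address just past *p1 is the address of *p2.
template <typename T1, typename T2>
bool immediately_follows(const T1* p1, const T2* p2)
{
    const void* v1 = static_cast<const void*>(p1 + 1);
    const void* v2 = static_cast<const void*>(p2);
    return v1 == v2;
}

// Well-defined use: neighbouring elements of one array are always adjacent.
bool array_neighbours_are_adjacent()
{
    int pair[2] = {1, 2};
    return immediately_follows(&pair[0], &pair[1]);
}
```

A true result says only that the two addresses coincide. As argued above, it does not by itself make a store through p1 + 1 a legitimate way to modify *p2.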

---

Author: a9804814delthis@unet.univie.ac.at ("Thomas Mang")
Date: Fri, 25 Mar 2005 05:14:47 GMT

<johnchx2@yahoo.com> schrieb im Newsbeitrag
news:1111553376.870479.187650@z14g2000cwz.googlegroups.com...
> Thomas Mang wrote:
>
> > 5.7/5 says pointer arithmetic (other than 0/1) requires an array.
> > Nothing in std::vector specifies it is an array
>
> Well, yes and no.  It turns out that 20.4.1.1/5 tells us that the
> default allocator provides "a pointer to the initial element of an
> array of storage..." so unless you specify a custom allocator, you can
> assume that the vector's underlying storage is indeed an array.

I like the wording "array of storage" because I don't know what that is.
What are the element types of that array? How does this array survive the
overwriting of the storage returned by the allocator inside vector?

Although I really do not want to throw this in additionally, note that I
don't think a vector is always required to obtain memory using the
allocator:
http://groups-beta.google.com/group/comp.lang.c++.moderated/msg/0f81073aead621de

But I admit this is becoming pedantry squared.

> More generally, 20.1.5 Table 32 specifies that allocate() is required
> to return a random-access iterator, which would imply that it must
> return something on which "iterator arithmetic" is well-defined.
> Further, the standard containers are allowed to assume that an
> allocator's "pointer" type is a T*.  Since the only T*'s for which
> "iterator arithmetic" is well defined are pointers into arrays, it
> seems fair to conclude that any custom allocator that is suitable for
> use with the standard containers must supply pointers into arrays.

Here you have raised some interesting aspects. Assume the default allocator
returns a pointer to an "array", and everything is fine (not sure about
that, but honestly I don't really know). Then I ask myself how one is
supposed to write one's own allocator. An allocator can clearly not create
an array of objects of type T, it can only allocate storage. But how is,
under these circumstances, the allocate() member function supposed to
return a pointer to T that can be used for random access? The only way I see
would be by creating an array......
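
To make the question concrete, here is a minimal sketch of the kind of user-written allocator under discussion. It uses the reduced allocator interface of later C++ revisions (an assumption; the 2003 requirements are more verbose), and the names RawAllocator and third_element are hypothetical. allocate() obtains raw storage only and returns it as T*; it never creates an array of T.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical minimal allocator: obtains raw storage from ::operator new
// and hands it out as T*, without constructing any T.
template <typename T>
struct RawAllocator
{
    using value_type = T;

    RawAllocator() = default;
    template <typename U>
    RawAllocator(const RawAllocator<U>&) noexcept {}

    T* allocate(std::size_t n)
    {
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept
    {
        ::operator delete(p);
    }
};

template <typename T, typename U>
bool operator==(const RawAllocator<T>&, const RawAllocator<U>&) noexcept { return true; }
template <typename T, typename U>
bool operator!=(const RawAllocator<T>&, const RawAllocator<U>&) noexcept { return false; }

// The container constructs the elements and performs the pointer arithmetic.
int third_element()
{
    std::vector<int, RawAllocator<int>> v;
    v.push_back(10);
    v.push_back(20);
    v.push_back(30);
    return *(v.data() + 2);
}
```

Whether the arithmetic the container does on such a pointer is covered by 5.7/5 is the open question here; the sketch shows the pattern, it does not settle the wording.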

Please also note that even if the guarantee for std::vector can be
found somewhere between the lines, the problem still remains for
user-defined allocators that replace, for example, new/delete. It can hardly
have been the intent to forbid these implementations.
So the fix need not be applied to the Standard Library; in my eyes it is
5.7/5 that needs fixing.

Thomas

---

Author: "Thomas Mang" <a9804814delthis@unet.univie.ac.at>
Date: Sat, 26 Mar 2005 00:43:32 CST

"Marc Schoolderman" <squell@alumina.nl> schrieb im Newsbeitrag
news:4241BCA8.3020806@alumina.nl...
> Thomas Mang wrote:
>

> >>And since 3.8/1 says that the lifetime of [an array] begins as soon as
> >>storage with the proper size and alignment for it is obtained - there's
> >>your array.
> > Here you have found an interesting para. Indeed, I agree, allocating
> > memory seems to create a char[] (or an int[] or POD[]).
>
> Or "non-POD[]"! I think this is very significant, because this implies
> that the lifetime of a complete object is different from its subobjects.
>
> After all, the array lifetime starts whenever its storage is obtained,
> but the lifetimes of its non-POD subobjects only start as soon as their
> individual constructor calls have completed.
>

I have snipped everything else for the moment, because I want to put the
whole focus on 3.8/2 you have found:

"the lifetime of an array .... starts as soon as storage with proper size
and alignment is obtained, ...".

So indeed, allocating raw memory creates an array, concluding the complete
object  - the array - comes to life before its subobjects come to life.
Neat. Apparently, I can also have a (non-empty) array, but not a single
array element (subobject). And pointer arithmetic will be defined if there
are elements of an array.
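
A sketch of the situation being described: storage is obtained first, and the non-POD elements come to life one at a time afterwards. The function name total_length is hypothetical. Whether first + i is formally defined before every element exists is precisely what this thread is debating, so the code illustrates the pattern rather than resolving it.

```cpp
#include <cstddef>
#include <new>
#include <string>

// Obtain raw storage for n strings, construct them one at a time with
// placement new, read them back by stepping a pointer, then clean up.
std::size_t total_length(std::size_t n)
{
    void* raw = ::operator new(n * sizeof(std::string));  // storage only, no strings yet
    std::string* first = static_cast<std::string*>(raw);

    for (std::size_t i = 0; i < n; ++i)                   // each element's lifetime starts here
        ::new (static_cast<void*>(first + i)) std::string(i + 1, 'x');

    std::size_t total = 0;
    for (std::string* p = first; p != first + n; ++p)
        total += p->size();

    for (std::size_t i = n; i > 0; --i)                   // each element's lifetime ends here
        (first + i - 1)->~basic_string();
    ::operator delete(raw);
    return total;
}
```

With n == 3 the strings have lengths 1, 2 and 3, so the function returns 6.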

So taking that into account, you are right, we have to focus on lifetime
issues.
The question that arises now is the same we have hit already earlier in this
thread: What is "reusing storage" that would end the lifetime of that array,
or of one of its subobjects?

Your opinion? [I don't really have one at the moment].

Thomas

---

Author: squell@alumina.nl (Marc Schoolderman)
Date: Sat, 26 Mar 2005 06:44:57 GMT

wade@stoner.com wrote:

> void OptimizeMe()
> {
>   int a = 1;
>   int b = 2;
>   int* pa = &a;
>   if(pa+1 == &b)
>   {
>     pa[1] = 3;
>     assert(b == 3);
>   }
> }

> I would argue that the standard does not require the assert() to
> succeed if reached.  An optimizing compiler can see that there is never
> an assignment to b, or through a pointer derived from b, so the value
> of b must still be 2.

I see two issues in the code snippet here: one is the pointer
comparison, the other is the aliasing rules.

The equality comparison relies on 5.10/1, which says that the comparison
will be valid if (and only if) both pointers point to the same object,
or to the end of the same array (This is easily reduced to "if they
point into the same array", by the way!)

So IF the block is entered, pa[1] and b are both lvalues that refer to
the same object, not restricted by aliasing (3.10/15), so the compiler
is not allowed to optimize this. Afaict, the assert will hold.

As a sidenote, on the architectures I'm familiar with, the stack grows
downward, so it seems more intuitive to me that &a == &b + 1! :)

~Marc

---

Author: kuyper@wizard.net
Date: Sun, 27 Mar 2005 18:33:08 CST

Marc Schoolderman wrote:
> wade@stoner.com wrote:
>
> > void OptimizeMe()
> > {
> >   int a = 1;
> >   int b = 2;
> >   int* pa = &a;
> >   if(pa+1 == &b)
> >   {
> >     pa[1] = 3;
> >     assert(b == 3);
> >   }
> > }
>
> > I would argue that the standard does not require the assert() to
> > succeed if reached.  An optimizing compiler can see that there is
> > never an assignment to b, or through a pointer derived from b, so the
> > value of b must still be 2.
>
> I see two issues in the code snippet here: one is the pointer
> comparison, the other is the aliasing rules.
>
> The equality comparison relies on 5.10/1, which says that the
> comparison will be valid if (and only if) both pointers point to the
> same object, or to the end of the same array (This is easily reduced
> to "if they point into the same array", by the way!)

It says "one past the end of the same array", in other words, a pointer
like pa+1 (for the purposes of these rules, 'a' is treated as a
1-element array of int). That is not a pointer "into the same array".

Taken literally, this implies that there must be at least one byte
after every object that the implementation can't use to allocate any
other object of compatible alignment, because if it were used, a
past-the-end pointer on the first object would compare equal to a
pointer to the beginning of the second object.

In practice, implementations routinely allocate objects back-to-back.
The C committee recognised this fact and added the following words to
C99: " ... or one is a pointer to one past the end of one array object
and the other is a pointer to the start of a different array object
that happens to immediately follow the first array object in the
address space."

I haven't got a copy of the recent updates to the C++ standard; if this
issue hasn't already been addressed, I'd recommend making a similar
change to the C++ standard.

---

Author: squell@alumina.nl (Marc Schoolderman)
Date: Tue, 29 Mar 2005 00:30:56 GMT

kuyper@wizard.net wrote:

> It says "one past the end of the same array", in other words, a pointer
> like pa+1 (for the purposes of these rules, 'a' is treated as a
> 1-element array of int). That is not a pointer "into the same array".

I meant 'into the same array' very loosely. Compare with 5.9.

And also to emphasize that in this example, assuming the memory
addresses are identical, the compiler is IMO allowed to consider "a+1" a
pointer to the end of an array, and "b" a pointer to a different object,
and let the comparison *fail* for this reason. I can imagine a compiler
using 'bounded pointers' could do just that.
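
A rough sketch of how such a 'bounded pointer' could tell the two pointers apart even when the raw addresses coincide. Everything here (BoundedIntPtr, same_address, bounded_equal) is hypothetical and not taken from any real implementation; two halves of one buffer stand in for the two objects so that the adjacency is guaranteed.

```cpp
#include <cstddef>

// Hypothetical "bounded pointer": an address plus the array it was derived from.
struct BoundedIntPtr
{
    int*        base;    // first element of the originating array
    std::size_t length;  // number of elements in that array
    int*        addr;    // current address, in [base, base + length]
};

// Plain address comparison, as a thin pointer would do it.
bool same_address(const BoundedIntPtr& x, const BoundedIntPtr& y)
{
    return x.addr == y.addr;
}

// Bounds-aware comparison: equal only if both come from the same array.
bool bounded_equal(const BoundedIntPtr& x, const BoundedIntPtr& y)
{
    return x.base == y.base && x.length == y.length && x.addr == y.addr;
}

// One buffer split into two logical 4-element arrays that sit back-to-back.
bool end_of_first_differs_from_start_of_second()
{
    static int buffer[8] = {};
    BoundedIntPtr end_of_first    = {buffer,     4, buffer + 4};
    BoundedIntPtr start_of_second = {buffer + 4, 4, buffer + 4};
    return same_address(end_of_first, start_of_second)
        && !bounded_equal(end_of_first, start_of_second);
}
```

The extra base and length fields are what make the comparison fail, and they are also why such pointers are larger than a bare address.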

> Taken literally, this implies that there must be at least one byte
> after every object that the implementation can't use to allocate any
> other object of compatible alignment, because if it were used, a
> past-the-end pointer on the first object would compare equal to a
> pointer to the beginning of the second object.

But couldn't it be that an implementation is allowed to formally obtain
the storage for, say, auto objects, by creating an array of them? As far
as I know the standard doesn't really say much about this.

> C99: " ... or one is a pointer to one past the end of one array object
> and the other is a pointer to the start of a different array object
> that happens to immediately follow the first array object in the
> address space."

This is new to me - this means C99 has a notion of contiguous elements
that are not part of the same array?

Frankly, I don't think this is a good idea.

~Marc

---

Author: kuyper@wizard.net
Date: Mon, 28 Mar 2005 22:24:35 CST

Marc Schoolderman wrote:
> kuyper@wizard.net wrote:
>
> > It says "one past the end of the same array", in other words, a
> > pointer like pa+1 (for the purposes of these rules, 'a' is treated as
> > a 1-element array of int). That is not a pointer "into the same
> > array".
>
> I meant 'into the same array' very loosely. Compare with 5.9.

I never noticed that before; we all "know" that a pointer one past the
end of an array compares greater than a pointer into that array, but I
can't find any case in 5.9 that covers such a comparison. I am very
familiar with both the C and C++ standards, and the C standard covers
that as a separate special case (6.5.8p5), so I didn't even notice that
the C++ standard fails to do so.

However, I doubt that this is a deliberate difference between the
standards; I think it constitutes a defect in the C++ standard. I doubt
that it was intended to be the case that a pointer one past the end of
an array would count as pointing into the array.

> And also to emphasize that in this example, assuming the memory
> addresses are identical, the compiler is IMO allowed to consider "a+1"
> a pointer to the end of an array, and "b" a pointer to a different
> object, and let the comparison *fail* for this reason. I can imagine a
> compiler using 'bounded pointers' could do just that.

Yes, that's one way to do it, but 'bounded pointers' are an example of
fat pointers, and as such are an expensive approach to use. Without
their use, an implementation would have to suffer the penalty described
in the next paragraph:

> But couldn't it be that an implementation is allowed to formally
> obtain the storage for, say, auto objects, by creating an array of
> them? As far as I know the standard doesn't really say much about this.

Yes, but that doesn't help. Those auto objects are themselves arrays,
or contain arrays. Even if they don't, for purposes of pointer
arithmetic, scalar objects count as one-element arrays, and would
therefore have to have empty spaces after them, even if they were also
allocated out of a single larger array.

> > C99: " ... or one is a pointer to one past the end of one array
> > object and the other is a pointer to the start of a different array
> > object that happens to immediately follow the first array object in
> > the address space."
>
> This is new to me - this means C99 has a notion of contiguous
> elements that are not part of the same array?

Yes, they did keep the standard in touch with the reality of actual
implementations.

---

Author: squell@alumina.nl (Marc Schoolderman)
Date: Tue, 29 Mar 2005 21:46:03 GMT

kuyper@wizard.net wrote:

> I never noticed that before; we all "know" that a pointer one past the
> end of an array compares greater than a pointer into that array, but I
> can't find any case in 5.9 that covers such a comparison. I am very
> familiar with both the C and C++ standards, and the C standard covers
> that as a separate special case (6.5.8p5), so I didn't even notice that
> the C++ standard fails to do so.

Isn't this covered by 5.9 paragraph 2, second-to-last bullet?

> However, I doubt that this is a deliberate difference between the
> standards; I think it constitutes a defect in the C++ standard. I doubt
> that it was intended to be the case that a pointer one past the end of
> an array would count as pointing into the array.

I think we have a little misunderstanding. Formally, every array of size
N has a set of N+1 addresses that are defined for it. When I say
'into the array' loosely, I mean a pointer that is drawn from that
set. My bad, I hope this rectifies it.

>>and let the comparison *fail* for this reason. I can imagine a compiler
>>using 'bounded pointers' could do just that.
> Yes, that's one way to do it, but 'bounded pointers' are an example of
> fat pointers, and as such are an expensive approach to use. Without
> their use, an implementation would have to suffer the penalty described
> in the next paragraph:

Of course no efficient implementation would use 'bounded pointers', but
considering the relative unsafety of C and C++, such implementations do
have a certain merit. For example during debugging and development.

What I'm also implying is that in my view, assuming the memory addresses
are equal, an implementation is allowed to let the comparison fail, but
it is not *required* to. I see the comparison as the point that
arbitrates whether or not two pointers refer to the same object, not the
declarations of the objects the pointer values were derived from.

I haven't seen a clause that says a 'past-the-end' pointer value has to
be distinct from a pointer to another object, I *have* seen a footnote
that suggests otherwise (near 5.9).

The result is that (portably) this comparison is a pretty meaningless
thing to do and is best avoided. This sounds very logical to me.

> Yes, but that doesn't help. Those auto objects are themselves arrays,
> or contain arrays. Even if they don't, for purposes of pointer
> arithmetic, scalar objects count as one-element arrays, and would
> therefore have to have empty spaces after them, even if they were also
> allocated out of a single larger array.

I don't fully agree. I've just had the discussion with Thomas Mang which
converged on 3.8. If the scalar objects are fully contiguous, it
wouldn't be too far-fetched to consider them 'allocated storage' and
constituting an array.

>>This is new to me - this means C99 has a notion of contiguous
>>elements that are not part of the same array?
> Yes, they did keep the standard in touch with the reality of actual
> implementations.

But how for example does this impact pointer arithmetic? I only have the
C99 public draft, and I don't see any word about it in there.

For example;

int a, b, c;
if( (&a+1 == &b) && (&b+1 == &c)) {
     assert(&a+2 == &c);        // defined behaviour?
}

Note that it's NOT my intention to debate the C99 standard in this
newsgroup, since I don't know much about it.

~Marc

---

Author: wade@stoner.com
Date: Tue, 29 Mar 2005 20:37:46 CST

kuyper@wizard.net wrote:

> In practice, implementations routinely allocate objects back-to-back.
> The C committee recognised this fact and added the following words to
> C99: " ... or one is a pointer to one past the end of one array object
> and the other is a pointer to the start of a different array object
> that happens to immediately follow the first array object in the
> address space."

A = both null
B = both same object
C = both 1 past end of same array

C++
  pointers_equal = A || B || C;
C
  pointers_equal = (A || B || C) || (B && C);

I'm having trouble finding the point at which the truth tables for the
two expressions are different.  Some people read some kind of
exclusive-or into the standard, but I don't see it.  Somewhere else
(4.10/1) the C++ standard says
  !(A&&B) && !(A&&C)
but nowhere, AFAICT does it say !(B&&C).
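
The claim about the truth tables can be checked mechanically. The sketch below enumerates all eight combinations of A, B and C, using the two formulas exactly as written above; the function name count_disagreements is hypothetical.

```cpp
// Count the (A, B, C) assignments on which the two formulas disagree.
int count_disagreements()
{
    int differing = 0;
    for (int bits = 0; bits < 8; ++bits)
    {
        const bool A = (bits & 1) != 0;   // both null
        const bool B = (bits & 2) != 0;   // both point to the same object
        const bool C = (bits & 4) != 0;   // both one past the end of the same array

        const bool cpp_rule = A || B || C;
        const bool c_rule   = (A || B || C) || (B && C);
        if (cpp_rule != c_rule)
            ++differing;
    }
    return differing;
}
```

The count is zero: whenever B && C holds, B alone already makes the first formula true, so the extra term never changes the result.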

---

Author: "C. M. Heard" <heard@pobox.com>
Date: Tue, 29 Mar 2005 20:36:26 CST

James Kuyper wrote:
> Marc Schoolderman wrote:
> > The equality comparison relies on 5.10/1, which says that the comparison
> > will be valid if (and only if) both pointers point to the same object,
> > or to the end of the same array (This is easily reduced to "if they
> > point into the same array", by the way!)
>
> It says "one past the end of the same array", in other words, a pointer
> like pa+1 (for the purposes of these rules, 'a' is treated as a
> 1-element array of int). That is not a pointer "into the same array".
>
> Taken literally, this implies that there must be at least one byte
> after every object that the implementation can't use to allocate any
> other object of compatible alignment, because if it were used, a
> past-the-end pointer on the first object would compare equal to a
> pointer to the beginning of the second object.
>
> In practice, implementations routinely allocate objects back-to-back.
> The C committee recognised this fact and added the following words to
> C99: " ... or one is a pointer to one past the end of one array object
> and the other is a pointer to the start of a different array object
> that happens to immediately follow the first array object in the
> address space."

Looking at ISO/IEC 14882:1998, 5.7/7, at the end of footnote 75, I see:

 When viewed in this way, an implementation need only provide one extra
 byte (which might overlap another object in the program) just after the
 end of the object in order to satisfy the "one past the last element"
 requirements.

So, it was clearly intended that the actual behavior be as described in
C99.

> I haven't got a copy of the recent updates to the C++ standard; if this
> issue hasn't already been addressed, I'd recommend making a similar
> change to the C++ standard.

Yep.

//cmh

---

Author: kuyper@wizard.net
Date: Tue, 29 Mar 2005 23:34:54 CST

Marc Schoolderman wrote:
> kuyper@wizard.net wrote:
>
> > I never noticed that before; we all "know" that a pointer one past
> > the end of an array compares greater than a pointer into that array,
> > but I can't find any case in 5.9 that covers such a comparison. I am
> > very familiar with both the C and C++ standards, and the C standard
> > covers that as a separate special case (6.5.8p5), so I didn't even
> > notice that the C++ standard fails to do so.
>
> Isn't this covered by 5.9 paragraph 2, second-to-last bullet?

You're right. That's embarrassing. I think it was because my wife was
competing with the C++ standard for my attention at the time I was
writing that message (she ended up winning :-).

> > However, I doubt that this is a deliberate difference between the
> > standards; I think it constitutes a defect in the C++ standard. I
> > doubt that it was intended to be the case that a pointer one past the
> > end of an array would count as pointing into the array.
>
> I think we have a little misunderstanding. Formally, every array of
> size N has a set of N+1 addresses that are defined for it. When I say
> 'into the array' loosely, I mean a pointer that is drawn from that
> set. My bad, I hope this rectifies it.

Nope, no misunderstanding. I knew that this is precisely what you
meant. I just think that this "loose" definition of "into the array" is
contrary to normal usage, and also a bad idea.

> > Yes, but that doesn't help. Those auto objects are themselves
> > arrays, or contain arrays. Even if they don't, for purposes of pointer
> > arithmetic, scalar objects count as one-element arrays, and would
> > therefore have to have empty spaces after them, even if they were
> > also allocated out of a single larger array.
>
> I don't fully agree. I've just had the discussion with Thomas Mang
> which converged on 3.8. If the scalar objects are fully contiguous, it
> wouldn't be too far-fetched to consider them 'allocated storage' and
> constituting an array.

The fact that they're contiguous doesn't change the fact that they're
also single scalar objects. It's routinely the case that larger objects
contain sub-objects. For purposes of pointer arithmetic, a pointer to a
single scalar object is treated as a pointer to the first element of a
1-element array. Therefore, if a pointer one past the end of an array
is prohibited from comparing equal to a pointer to any other object,
you have to either have fat pointers, or padding after the end of every
object.

Consider

int twod[3][4];

The elements of twod are twod[i] 0<=i<3. In this case, the standard
requires that those elements be allocated contiguously, with no padding
between them. However, each of those elements is also a sub-array.
Therefore, if it were prohibited for a pointer one past the end of an
array to compare equal to a pointer to any other object, twod[0]+4
would be prohibited from comparing equal to twod[1]. I don't know of
any easy way to do that with a reasonably conventional implementation
of pointers. The only way I can see is by the use of fat pointers:
pointers that contain not only the address, but also enough additional
information to allow pointers that refer to the same address to be
distinguished from each other.
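
The twod example can be written out as a small sketch. The function name rows_are_contiguous is hypothetical. The static_assert checks the no-padding layout; the equality at the end is what a conventional thin-pointer implementation is expected to report.

```cpp
// Rows of a two-dimensional array are laid out without padding, so the
// past-the-end address of row 0 is the address of the first element of row 1.
bool rows_are_contiguous()
{
    static_assert(sizeof(int[3][4]) == 3 * 4 * sizeof(int),
                  "array rows must not be padded");
    int twod[3][4] = {};
    const int* end_of_row0   = twod[0] + 4;   // one past the end of twod[0]
    const int* start_of_row1 = twod[1];       // &twod[1][0]
    return end_of_row0 == start_of_row1;
}
```

An implementation that had to make this comparison yield false would need pointers carrying more than the address, which is the fat-pointer cost described above.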

> >>This is new to me - this means C99 has a notion of contiguous
> >>elements that are not part of the same array?
> > Yes, they did keep the standard in touch with the reality of actual
> > implementations.
>
> But how for example does this impact pointer arithmetic? I only have
> the C99 public draft, and I don't see any word about it in there.

The final standard contained a number of significant changes from the
final public draft version. C99 section 6.5.9p6: "Two pointers compare
equal if and only if [several other alternatives], or one is a pointer
to one past the end of one array object and the other is a pointer to
the start of a different array object that happens to immediately
follow the first array object in the address space."

> For example;
>
> int a, b, c;
> if( (&a+1 == &b) && (&b+1 == &c)) {
>      assert(&a+2 == &c);        // defined behaviour?
> }

It's not defined behavior. 6.5.9p6 only covers equality operators. The
description of pointer+integer only allows the result to point into or
one past the end of the same array that 'pointer' itself points at.
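
For contrast, a minimal sketch of the well-defined variant: when the three ints are elements of one array object, adding 2 stays inside that array. The names abc and two_steps_reach_third_element are hypothetical.

```cpp
// With a single array object, pointer + 2 stays inside the array and is defined.
bool two_steps_reach_third_element()
{
    int abc[3] = {1, 2, 3};
    int* pa = &abc[0];
    return pa + 2 == &abc[2] && *(pa + 2) == 3;
}
```

With three separately declared ints, &a + 2 has no such array to stay within, whatever the equality comparisons report.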

---

Author: nagle@animats.com (John Nagle)
Date: Thu, 31 Mar 2005 01:18:03 GMT

C. M. Heard wrote:

> James Kuyper wrote:
>
>>Marc Schoolderman wrote:

>>In practice, implementations routinely allocate objects back-to-back.
>>The C committee recognised this fact and added the following words to
>>C99: " ... or one is a pointer to one past the end of one array object
>>and the other is a pointer to the start of a different array object
>>that happens to immediately follow the first array object in the
>>address space."

    That can break some forms of garbage collection.

    What does "Microsoft Managed C++" do?

    John Nagle
    Animats

---

Author: nospam@pop.ucsd.edu ("Thomas Mang")
Date: Tue, 8 Mar 2005 20:34:30 GMT Raw View

"msalters" <Michiel.Salters@logicacmg.com> schrieb im Newsbeitrag
news:1109859186.935261.201880@f14g2000cwb.googlegroups.com...
> "Thomas Mang" wrote:
> > Just another issue (how the whole idea derived from):
> >
> >
> > Suppose you write a memory pool - one that preallocates some chunk of
> > memory and assigns it to objects if they are created (overloading new).
> >
> > Then my impression is:
> >
> > situation 1) I use ::operator new to obtain the memory.
> >
> > Then indexing into the chunk is not possible, except indices 0 and 1
> > (1 if there is an element at index 0) because there are no arrays, and
> > adding more than 1 is undefined.
> > However, I can access the first free spot by always incrementing a
> > pointer by one - e.g. jumping from object to the next object (all objects
> > represent a "one-element-array", and therefore increment can be used).
> > This requires all objects to be of the same type.
>
> No. You have a valid char* pointer, enough memory, so the only concern
> is whether the char* is properly aligned for the type you're creating.
> This will always be the case if you're creating only objects of a
> single type, but that's because they will be objects with the same
> sizeof() and thus the same alignment. Any other type with the same
> sizeof() can be substituted.
>
> > Casting to char* etc. and using that for indexing is not possible,
> > there are formally no char-elements / no char-array.
>
> Huh? I've got no idea what you're talking about. You can use char* for
> indexing; a char* can hold any address.

Sorry for answering late.

I am talking about the fact there was never ever a char[] created. A char*
has the same representation as a void*, but I fail to see how that matters
here.
The para about pointer arithmetic (5.7/5) clearly requires pointers to
elements of an array. Where is the char-array? I know I can treat the bits
in the memory as chars, but memory is not a char-array.

>
> > situation 2) I allocate the memory using char[]
> >
> > Is indexing into that array [sizeof(T) * index] possible? I don't think
> > so, because when the memory is overwritten, the lifetime of the char ends
> > (although the bits still are a valid char) - and therefore I don't have
> > elements of an array any more.
>
> Wrong, for a number of reasons. To name a few: lifetime doesn't matter for
> chars, any memory can be accessed by chars, you're not using the chars but
> instead the storage they used to occupy.

I have destroyed the char-array by reusing the storage of the chars; 3.8/4
says that this ends its lifetime.
Once the char[] is lost, I fail to see how pointer arithmetic is still
valid. In my opinion, the fact that the bits still make up a valid char is
irrelevant.

Suppose an implementation keeps internally a table of created arrays, when
they are still valid and so on and terminates the program whenever pointer
arithmetic (other than 0/1) is applied to a pointer pointing not to an
address belonging to that table, even if it is a char*. Is there anything
that forbids such an implementation? Is there anything that says memory can
be treated as a char[]?
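
To make the pool scenario concrete, here is a minimal sketch of the kind of code in question. `SlotPool` and its members are hypothetical names invented for this illustration, and the sketch uses C++11 features (`alignas`, `static_assert`) that postdate this discussion. All offset arithmetic is done on the `unsigned char` buffer, which really is declared as an array; whether that arithmetic remains formally covered once objects have been constructed in the buffer is exactly the point under debate here.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>

// Hypothetical fixed-capacity pool, for illustration only.
template <typename T, std::size_t Capacity>
class SlotPool {
    // Keeps the sketch short: the pool never has to run destructors.
    static_assert(std::is_trivially_destructible<T>::value,
                  "this sketch only supports trivially destructible types");

public:
    SlotPool() : used_(0) {}
    SlotPool(const SlotPool&) = delete;
    SlotPool& operator=(const SlotPool&) = delete;

    // Constructs a copy of `value` in the next free slot.
    // Returns a null pointer when the pool is full.
    T* create(const T& value) {
        if (used_ == Capacity) {
            return nullptr;
        }
        // Offset computed on the byte array, not on a T*.
        void* where = bytes_ + used_ * sizeof(T);
        T* object = new (where) T(value);  // placement new into the slot
        ++used_;
        return object;
    }

    std::size_t size() const { return used_; }

private:
    alignas(T) unsigned char bytes_[Capacity * sizeof(T)];
    std::size_t used_;
};
```

A `SlotPool<int, 2>` hands out two slots and then returns a null pointer. Note that the pool never forms `T* + n` itself; it only converts one slot address at a time.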

To add something to the vector-example:
The more I think about it, the more I am convinced a DR is needed. There is
no array of objects, so formally pointer arithmetic is undefined behavior.
One can discuss whether the expression used in 23.2.4/1 makes the behavior
valid for these special cases, but I think it doesn't matter.
Suppose the expression makes it valid (personally, I don't think so). Then
the only thing you can do is add an offset < vec.size(). Since n is required
to be < vec.size(), it means you cannot compute the one-past-the-end
address. That means something like:
&vec[0] + vec.size()
is undefined behavior. I see nothing wrong in an implementation that chokes
on this and quits (remember, we could increment the pointer pointing to the
last vector element by one, but we cannot compute it by pointers to other
elements, since again there is no array). But probably exactly this
expression will be used in conjunction with older libraries. Finally, the
expression does not cover subtracting a value from a pointer or subtracting
a pointer from a pointer to get the number of elements.
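
For reference, a minimal sketch of the usage pattern at stake, i.e. handing a vector's elements to an older pointer-and-length interface. `legacy_sum` is a hypothetical stand-in for such a library routine, and `sum_vector` is an invented wrapper. The sketch shows what typical code does; it does not settle whether the standard's wording covers it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an older C-style routine that expects an array.
long legacy_sum(const int* first, std::size_t count) {
    long total = 0;
    // Walks from `first` to `first + count`, the past-the-end address.
    for (const int* p = first; p != first + count; ++p) {
        total += *p;
    }
    return total;
}

// Passes the vector's elements to the legacy routine.
long sum_vector(const std::vector<int>& vec) {
    if (vec.empty()) {
        return 0;  // &vec[0] is not valid for an empty vector
    }
    return legacy_sum(&vec[0], vec.size());
}
```

The guard for the empty case matters because `vec[0]` requires at least one element.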

The problem seems not to arise in 23.2.4/1 (it says "the elements of a
vector are stored contiguously"), it seems to come from the pointer
arithmetic para which explicitly requires arrays.

Maybe I am way too pedantic, but maybe not.

Thomas

---

Author: squell@alumina.nl (Marc Schoolderman)
Date: Wed, 9 Mar 2005 14:49:50 GMT Raw View

Thomas Mang wrote:

> The para about pointer arithmetic (5.7/5) clearly requires pointers to
> elements of an array. Where is the char-array? I know I can treat the bits
> in the memory as chars, but memory is not a char-array.

Memory consists of one or more contiguous series of bytes (1.7).

At other places you can read that an uchar is a byte, and that objects
have object representations that are sequences of uchars, and so on.

> Suppose an implementation keeps internally a table of created arrays, when
> they are still valid and so on and terminates the program whenever pointer
> arithmetic (other than 0/1) is applied to a pointer pointing not to an
> address belonging to that table, even if it is a char*. Is there anything
> that forbids such an implementation?

This gets very close to a protected memory model, which is in fact what
a lot of systems implement.

> The problem seems not to arise in 23.2.4/1 (it says "the elements of a
> vector are stored contiguously"), it seems to come from the pointer
> arithmetic para which explicitly requires arrays.

You're right. Hence to satisfy the identity mapping, references returned
by vec[n] MUST refer to elements of an array. Simple as that.

Why would it be a 'special case'? There are lots of places where the
standard explains parts of the language/library in terms of other parts.

~Marc

---

Author: nospam@nospam.ucar.edu ("Thomas Mang")
Date: Fri, 11 Mar 2005 17:35:16 GMT Raw View

"Marc Schoolderman" <squell@alumina.nl> schrieb im Newsbeitrag
news:422E62CD.9060105@alumina.nl...
> Thomas Mang wrote:
>
> > The para about pointer arithmetic (5.7/5) clearly requires pointers to
> > elements of an array. Where is the char-array? I know I can treat the
> > bits in the memory as chars, but memory is not a char-array.
>
> Memory consists of one or more contiguous series of bytes (1.7).

True, but how does it matter?

>
> At other places you can read that an uchar is a byte, and that objects
> have object representations that are sequences of uchars, and so on.

Still, they are not a C++ array. A non-empty contiguous series of objects
is not automatically an array.

>
> > Suppose an implementation keeps internally a table of created arrays,
> > when they are still valid and so on and terminates the program whenever
> > pointer arithmetic (other than 0/1) is applied to a pointer pointing not
> > to an address belonging to that table, even if it is a char*. Is there
> > anything that forbids such an implementation?
>
> This gets very close to a protected memory model, which is in fact what
> a lot of systems implement.

Yes, but note the important extension to distinguish whether an address
belongs to an array or not.
Suppose this simple snippet:

// 1
char* c = new char[10 * sizeof(T)];

// 2
for (int i = 0; i < 10; ++i)
    new (c + i*sizeof(T)) T;

// 3
T* p = (T*) c;

// 4
p + 8;

and the compiler translates into this code:

// 1
char* c = new char[10 * sizeof(T)];
arrayAddresses.addArrayRange(c, c + 10 * sizeof(T));

// 2
for (int i = 0; i < 10; ++i)
{
    new (c + i*sizeof(T)) T;
    arrayAddresses.deleteArrayRangeBecauseMemoryWasOverwritten(c + i*sizeof(T));
}

// 3
T* p = (T*) c;

// 4
if (! arrayAddresses.addressBelongsToArrayAndTargetBelongsToArray(p, p + 8))
    abort();
p + 8;

Is this implementation forbidden?
The same of course applies to creating a contiguous series of objects using
placement new etc. (which is probably what std::vector does internally),
where an address range was never added to arrayAddresses, because
constructing objects with placement new is very different from creating an
array.
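
The bookkeeping such an implementation would need can be sketched as follows. This is only an illustration under stated assumptions: `ArrayRegistry` and its member names are invented here (shortened from the names in the snippet above), no real compiler is claimed to work this way, and it relies on the optional `std::uintptr_t` type and on C++11 `std::map::erase` taking a `const_iterator`.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical table of live arrays, keyed by start address.
class ArrayRegistry {
    typedef std::map<std::uintptr_t, std::uintptr_t> Map;  // begin -> end

public:
    // Records [begin, end) as the storage of a live array.
    void add_range(const void* begin, const void* end) {
        ranges_[addr(begin)] = addr(end);
    }

    // Forgets the array whose storage contains `p`, if there is one.
    void drop_range_containing(const void* p) {
        Map::const_iterator it = find(addr(p));
        if (it != ranges_.end()) {
            ranges_.erase(it);
        }
    }

    // True if `from` points into a live array and `to` points into the
    // same array or one past its end.
    bool same_array(const void* from, const void* to) const {
        Map::const_iterator it = find(addr(from));
        return it != ranges_.end() &&
               addr(to) >= it->first && addr(to) <= it->second;
    }

private:
    static std::uintptr_t addr(const void* p) {
        return reinterpret_cast<std::uintptr_t>(p);
    }

    // Finds the recorded range that contains address `a`, or end().
    Map::const_iterator find(std::uintptr_t a) const {
        Map::const_iterator it = ranges_.upper_bound(a);  // first range starting after a
        if (it == ranges_.begin()) {
            return ranges_.end();
        }
        --it;  // last range starting at or before a
        return a < it->second ? it : ranges_.end();
    }

    Map ranges_;
};
```

With this table, the check in step 4 corresponds to `same_array(p, p + 8)` failing once step 2 has called `drop_range_containing`.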

>
> > The problem seems not to arise in 23.2.4/1 (it says "the elements of a
> > vector are stored contiguously"), it seems to come from the pointer
> > arithmetic para which explicitly requires arrays.
>
> You're right. Hence to satisfy the identity mapping, references returned
> by vec[n] MUST refer to elements of an array. Simple as that.

I disagree. First, I think the para is invalid because the expression used
therein is invalid and there is no wording saying that 5.7/5 is overruled.
Second, even if no explicit overruling is needed, it does not follow that
you have an array. Fulfilling only the limited operations guaranteed by that
para seems to be enough to me. An implementation that terminates on
calculating the past-the-end pointer value seems to be perfectly legal.

And of course, for user defined types only _simulating_ arrays (meaning
identical memory layout) and allocators and so on the problem remains.

Again, let me emphasize I am not questioning the intent of either
std::vector in combination with plain pointers, or doing some pointer
arithmetic within a block of allocated memory. I want to express that IMHO
all the current usage of this is undefined behavior because 5.7/5 is
violated, because that explicitly requires arrays. I could not find any para
in the Standard permitting this, not even for char*.

Thomas

---

Author: squell@alumina.nl (Marc Schoolderman)
Date: Sat, 12 Mar 2005 16:18:39 GMT Raw View

Thomas Mang wrote:

> Still they are not a C++ - array. Non-empty contiguous series of objects
> are not automatically an array.

I think I've found something. 1.8/1 says "An object is a region of
storage". So, an array object is (using this definition) simply a region
of storage containing contiguously allocated sub-objects.

But anyway, I think this is a moot issue. The only standard way to get a
contiguous series of objects in C++ is by creating an array! Anything
else will at least involve a type cast, which is a story in itself.

> Suppose this simple snippet:
  ..
> new (c + i*sizeof(T)) T;
> arrayAddresses.deleteArrayRangeBecauseMemoryWasOverwritten(c + i*sizeof(T));

I assume you agree this violates all common sense but are looking for
the standard to back you up.

Given 3.8/2; "the lifetime of an array object or of an object of POD
type starts as soon as storage [...] is obtained and ends when the
storage which [it] occupies is reused or released."

The most convincing answer is that your statement is re-using storage of
sub-objects of the array, not the storage of the array object proper, so
the lifetime of the array itself is not affected.

Failing that; 3.8/5 and 3.8/6 only limit operations on objects of
'non-POD class type'. So the list of undefined operations doesn't apply
to arrays and POD objects regardless.

However, you've convinced me that 3.8 is confusing at best. I'm unsure
what "reusing storage" actually means.

>>>The problem seems not to arise in 23.2.4/1 (it says "the elements of a
>>>vector are stored contiguously"), it seems to come from the pointer
>>>arithmetic para which explicitly requires arrays.
>>You're right. Hence to satisfy the identity mapping, references returned
>>by vec[n] MUST refer to elements of an array. Simple as that.
> I disagree. First, I think the para is invalid because the expression used
> therein is invalid and there is no wording 5.7/5 is overruled.

I don't see how a library specification would overrule the language.

vector<T>::operator[] is a function returning an lvalue of T. The &
operator will make it a pointer to T. So the identity mapping is merely
making a statement about how two pointers obtained through this function
are related. And the only way they can be related that way is if those
pointers point to elements of an array.

vector<> is already constrained by iterator invalidation rules and
complexity guarantees so that it can only (realistically) be implemented
by having it maintain an array privately, with its [] operator returning
references to objects inside that private array.

> Fulfilling only the limited operations guaranteed by that para seems
> to be enough for me. An implementation that terminates on calculating the
> past-the-end pointer value seems to be perfectly legal.

It can't terminate and obey the identity. Assume vec is a vector<int>
with size() > 0.

     int* end = &vec[vec.size()-1] + 1;

is valid because we're incrementing a pointer past an object. By
applying the identity of 23.2.4 to &vec[], and simplifying:

     int* end = &vec[0] + vec.size();

Which must therefore be valid as well.

Of course, &vec[vec.size()] is still undefined because the identity
doesn't hold for n >= size().
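
The argument above can be written out as a small check. This is only a sketch: `identity_holds` is a name invented for illustration, and the function performs the very pointer arithmetic whose status is being debated, so it demonstrates what the identity implies rather than proving that the wording permits it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if &vec[n] == &vec[0] + n for every valid n and the
// past-the-end pointer computed either way is the same.
bool identity_holds(std::vector<int>& vec) {
    if (vec.empty()) {
        return true;  // nothing to check; &vec[0] would be invalid
    }
    int* first = &vec[0];
    for (std::size_t n = 0; n < vec.size(); ++n) {
        if (&vec[n] != first + n) {
            return false;
        }
    }
    int* end_from_last = &vec[vec.size() - 1] + 1;  // last element, plus one
    int* end_from_first = first + vec.size();       // first element, plus size
    return end_from_last == end_from_first;
}
```

Neither expression dereferences the past-the-end pointer; `&vec[vec.size()]` is deliberately avoided.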

~Marc

---

Author: "Thomas Mang" <nospam@nospam.ucar.edu>
Date: Sat, 12 Mar 2005 14:19:40 CST Raw View

"Marc Schoolderman" <squell@alumina.nl> schrieb im Newsbeitrag
news:423305A9.5010401@alumina.nl...
> Thomas Mang wrote:
>
> > Still they are not a C++ - array. Non-empty contiguous series of
> > objects are not automatically an array.
>
> I think I've found something. 1.8/1 says "An object is a region of
> storage". So, an array object is (using this definition) simply a region
> of storage containing contiguously allocated sub-objects.
>
> But anyway, I think this is a moot issue. The only standard way to get a
> contiguous series of objects in C++ is by creating an array! Anything
> else will at least involve a type cast, which is a story in itself.

I am not following you here. std::vector does, internally, most likely never
create an array, still the objects are laid out in a contiguous series.

>
> > Suppose this simple snippet:
>   ..
> > new (c + i*sizeof(T)) T;
> > arrayAddresses.deleteArrayRangeBecauseMemoryWasOverwritten(c + i*sizeof(T));
>
> I assume you agree this violates all common sense but are looking for
> the standard to back you up.

Honestly, I think this is very^very unlikely to happen in practice, but is
it forbidden?
Please note I have never said it is a problem in practical code; I am saying
it is a problem in the current wording of the standard.

And BTW, an implementation that - in the _most strict_ mode - tells me about
everything that yields undefined behavior, well, I don't think I would mind
such an implementation at all.

>
> Given 3.8/2; "the lifetime of an array object or of an object of POD
> type starts as soon as storage [...] is obtained and ends when the
> storage which [it] occupies is reused or released."
>
> The most convincing answer is that your statement is re-using storage of
> sub-objects of the array, not the storage of the array object proper, so
> the lifetime of the array itself is not affected.

No, the array itself counts as one object. If I reuse only one byte of that
object, or all bytes, does not seem to matter. I am reusing the storage to
assign it to a newly created object of type T. That ends the lifetime of the
array, thus all pointer arithmetic into that dead array seems like undefined
behavior to me.
I also think that calling delete [] on the "array" is undefined behavior,
since the array does not exist any more. But I am not sure about that. What
do others think about it?

>
> Failing that; 3.8/5 and 3.8/6 only limit operations on objects of
> 'non-POD class type'. So the list of undefined operations doesn't apply
> to arrays and POD objects regardless.
>
> However, you've convinced me that 3.8 is confusing at best. I'm unsure
> what "reusing storage" actually means.

Probably reusing the storage in a write-manner other than with the object
type it was created with. But good point, that could use clarification too.
Note however that my point does not concern what can be done with the
char-subobjects (that's pretty clear); I am saying that because the array is
destroyed, the pointer arithmetic yields undefined behavior.

>
> >>>The problem seems not to arise in 23.2.4/1 (it says "the elements of a
> >>>vector are stored contiguously"), it seems to come from the pointer
> >>>arithmetic para which explicitly requires arrays.
> >>You're right. Hence to satisfy the identity mapping, references returned
> >>by vec[n] MUST refer to elements of an array. Simple as that.
> > I disagree. First, I think the para is invalid because the expression
> > used therein is invalid and there is no wording 5.7/5 is overruled.
>
> I don't see how a library specification would overrule the language.

Well, by clearly contradicting another para? Wouldn't be the first diverging
wording.

General question to the language lawyers: Is the wording of 23.2.4/1
extending the para about pointer arithmetic (by meaning "everything in the
Standard is correct, so it can't contradict, it must extend"), or is it
simply using something that yields undefined behavior, thus the para is
meaningless?

> vector<T>::operator[] is a function returning an lvalue of T. The &
> operator will make it a pointer to T. So the identity mapping is merely
> making a statement about how two pointers obtained through this function
> are related. And the only way they can be related that way is if those
> pointers point to elements of an array.

Not necessarily. Remember the array-check implementation I presented? One
that relaxes the rules to the operations of std::vector in that para still
seems perfectly legal. Including aborting the program on calculating the
past-the-end pointer, or using operator- .....

>
> vector<> is already constrained by iterator invalidation rules and
> complexity guarantees so that it can only (realistically) be implemented
> by having it maintain an array privately, with its [] operator returning
> references to objects inside that private array.

The funny thing is, I would bet quite a lot that most vector implementations
do not hold an array internally; they create object by object using the
allocator's construct function. No array.

>
> > Fulfilling only the limited operations guaranteed by that para seems
> > to be enough for me. An implementation that terminates on calculating
> > the past-the-end pointer value seems to be perfectly legal.
>
> It can't terminate and obey the identity. Assume vec is a vector<int>
> with size() > 0.
>
>      int* end = &vec[vec.size()-1] + 1;
>
> is valid because we're incrementing a pointer past an object.

Correct.

> By applying the identity of 23.2.4 to &vec[], and simplifying:
>
>      int* end = &vec[0] + vec.size();
>
> Which must therefore be valid as well.

I don't think this is automatically guaranteed - even if the para is 100%
legal and not already undefined behavior. Simply because vec.size() need not
be one, and the identity guarantees only apply to real objects. And they
apply only to operator+; nothing says other pointer arithmetic expressions
can be used safely. I am pretty convinced an explicit relaxation of 5.7/5
would be needed, which is not there.

>
> Of course, &vec[vec.size()] is still undefined because the identity
> doesn't hold for n >= size().

Yes, and having read the thread about null references, my opinion, based on
my current knowledge (that is, I am not 100% familiar with the details of
the proposals), is that I hope this will always remain undefined behavior :-)

Anyway, please note that std::vector is only a special case. It is very
special because it is part of the Standard, but non-Standard libraries such
as allocators etc. yield IMHO clearly undefined behavior. That is, all the
manual memory management schemes I have ever seen that do some pointer
arithmetic are undefined behavior, unless someone can point me to a para
that says otherwise.

I even hope someone points me to such a para; otherwise some of my Easter
will be spent writing a DR. Searching for Easter bunnies and eggs would
be more fun :-)

Thomas

---

Author: Marc Schoolderman <squell@alumina.nl>
Date: Wed, 16 Mar 2005 15:25:17 CST Raw View

Thomas Mang wrote:

[Is a contiguous series of objects an array?]
>>I think I've found something. 1.8/1 says "An object is a region of
>>storage". So, an array object is (using this definition) simply a region
>>of storage containing contiguously allocated sub-objects.
>>But anyway, I think this is a moot issue. The only standard way to get a
>>contiguous series of objects in C++ is by creating an array! Anything
>>else will at least involve a type cast, which is a story in itself.
> I am not following you here. std::vector does, internally, most likely never
> create an array, still the objects are laid out in a contiguous series.

But std::vector is just a class. We are exposed only to its interface,
and the contract on that interface. A std::vector "isn't" a contiguous
series of objects, it provides an abstraction for them.

>>>new (c + i*sizeof(T)) T;
>>>arrayAddresses.deleteArrayRangeBecauseMemoryWasOverwritten(c + i*sizeof(T));
>>I assume you agree this violates all common sense but are looking for
>>the standard to back you up.
> Honestly, I think this very^very unlikely to hit in practise, but is it
> forbidden?
> Please note I have never said it is a problem in practical code, I am saying
> it is a problem in the current wording of the standard.

Yes, and at the moment, I agree somewhat.

>>The most convincing answer is that your statement is re-using storage of
>>sub-objects of the array, not the storage of the array object proper, so
>>the lifetime of the array itself is not affected.
> No, the array itself counts as one object. If I reuse only one byte of that
> object, or all bytes, does not seem to matter. I am reusing the storage to
> assign it to a newly created object of type T. That ends the lifetime of the
> array, thus all pointer arithmetic into that dead array seems like undefined
> behavior to me.

While the array itself counts as one object, it also contains other
sub-objects (which it is the 'complete object' for).

The exact same situation happens with structs and classes. So if we
re-use the storage of a class member, does that end the lifetime of the
encompassing class as well?

>>Failing that; 3.8/5 and 3.8/6 only limit operations on objects of
>>'non-POD class type'. So the list of undefined operations doesn't apply
>>to arrays and POD objects regardless.
>>However, you've convinced me that 3.8 is confusing at best. I'm unsure
>>what "reusing storage" actually means.
> Probably reusing the storage in a write-matter other than the object type it
> was created with.

I was forgetting the strict aliasing rules of 3.10/15, which forbid
most of the things you could use the 'old' lvalue for anyway.

>>I don't see how a library specification would overrule the language.
> Well, by clearly contradicting another para? Wouldn't be the first diverging
> wording.

But there's a clear separation in the standard (1.5). I'd really like to
see examples of the library introducing new meaning to the language.

> General question to the language lawyers: Is the wording of 23.2.4/1
> extending the para about pointer arithmetic (by meaning "everything in the
> Standard is correct, so it can't contradict, it must extend"), or is it
> simply using something that yields undefined behavior, thus the para is
> meaningless?

This is a false dichotomy. You're dismissing the possibility of the
identity mapping as saying something about std::vector, not pointer
arithmetic.

>>vector<> is already constrained by iterator invalidation rules and
>>complexity guarantees so that it can only (realistically) be implemented
>>by having it maintain an array privately, with its [] operator returning
>>references to objects inside that private array.
> The fun is I would bet quite a lot most vector implementations do not hold
> internally an array, they create object by object using the allocators
> construct-function. No array.

But how can you construct them object-by-object in a contiguous fashion
unless you already have an array you are constructing them into?

And since 3.8/1 says that the lifetime of [an array] begins as soon as
storage with the proper size and alignment for it is obtained - there's
your array.
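
A minimal sketch of the object-by-object construction being discussed, assuming a C++11 library (`std::allocator_traits` did not exist when this was written) and using the invented name `fill_and_sum`. It shows what a vector-like container typically does with raw storage; whether `first + i` is formally arithmetic within an array is precisely the open question in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Obtains raw storage for `count` ints, constructs the elements one at a
// time, reads them back through pointer arithmetic, then cleans up.
int fill_and_sum(std::size_t count, int value) {
    typedef std::allocator<int> Alloc;
    typedef std::allocator_traits<Alloc> Traits;

    Alloc alloc;
    int* first = Traits::allocate(alloc, count);  // raw, uninitialized storage

    for (std::size_t i = 0; i < count; ++i) {
        Traits::construct(alloc, first + i, value);  // one object per step
    }

    int sum = 0;
    for (int* p = first; p != first + count; ++p) {
        sum += *p;
    }

    for (std::size_t i = 0; i < count; ++i) {
        Traits::destroy(alloc, first + i);
    }
    Traits::deallocate(alloc, first, count);
    return sum;
}
```

No `new int[count]` appears anywhere, which is the "no array" point; the storage is nevertheless sized and aligned for `count` contiguous ints.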

>>It can't terminate and obey the identity. Assume vec is a vector<int>
>>with size() > 0.
>>     int* end = &vec[vec.size()-1] + 1;
>>is valid because we're incrementing a pointer past an object.
> Correct.

>>applying the identity of 23.2.4 to &vec[], and simplifying:
>>     int* end = &vec[0] + vec.size();
>>Which must therefore be valid as well.
> I don't think this is automatically guaranteed - even if the para is 100%
> legal and not already undefined behavior.

Well, if you assume that the paragraph is undefined behaviour, then the
first statement is undefined behaviour as well. If we assume the
paragraph is valid, then the second one must be valid as well.

Remember that this example wasn't to show that the identity mapping
itself is valid, it was to show that IF it is, you also can calculate
the past-the-end pointer in the ordinary fashion. You called that into
doubt as well.

> and the identity guarantees only apply to real objects. And they
> apply only to operator+; nothing says other pointer arithmetic expression
> can be used safely. I am pretty convinced explicit relaxing of 5.7/5 would
> be needed, which is not there.

So, in your view, the identity map doesn't say, for example,

    &vec[0] == &vec[n] - n;      with 0 <= n < vec.size()
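
Spelled out as code, the two subtraction forms look like this. It is a sketch only: `index_of` and `first_from` are names invented for illustration, and the functions assume the identity does extend to `operator-`, which is the point being disputed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Recovers the index of an element from its address by subtracting the
// address of the first element (pointer minus pointer).
std::size_t index_of(const std::vector<int>& vec, const int* elem) {
    assert(!vec.empty());
    std::ptrdiff_t diff = elem - &vec[0];
    assert(diff >= 0 && static_cast<std::size_t>(diff) < vec.size());
    return static_cast<std::size_t>(diff);
}

// Steps back from element n to the first element (pointer minus integer).
const int* first_from(const std::vector<int>& vec, std::size_t n) {
    assert(n < vec.size());
    return &vec[n] - n;
}
```

For a three-element vector, `index_of(v, &v[2])` gives 2 and `first_from(v, 2)` gives `&v[0]`.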

> Anyways, please note that std::vector is only a special case. It is very
> special because it is part of the Standard, but non-Standard libraries such
> as allocators etc. yield IMHO clearly undefined behavior. That is, all the
> manual memory-managements I have ever seen doing some pointer arithmetic are
> undefined behavior, unless someone can point me to a para that says
> otherwise.

But this is C++, I do think it's basically heretical to suggest that the
standard library can do things that other libraries can not.

So I'd spend my easter searching for bunnies, not defects ;)

~Marc

---

Author: "Thomas Mang" <nospam@pop.ucsd.edu>
Date: Sat, 19 Mar 2005 01:29:24 CST Raw View

"Marc Schoolderman" <squell@alumina.nl> schrieb im Newsbeitrag
news:42385F79.6090504@alumina.nl...
> Thomas Mang wrote:
>
> [Is a contiguous series of objects an array?]
> >>I think I've found something. 1.8/1 says "An object is a region of
> >>storage". So, an array object is (using this definition) simply a region
> >>of storage containing contiguously allocated sub-objects.
> >>But anyway, I think this is a moot issue. The only standard way to get a
> >>contiguous series of objects in C++ is by creating an array! Anything
> >>else will at least involve a type cast, which is a story in itself.
> > I am not following you here. std::vector does, internally, most likely
> > never create an array, still the objects are laid out in a contiguous
> > series.
>
> But std::vector is just a class. We are exposed only to its interface,
> and the contract on that interface. A std::vector "isn't" a contiguous
> series of objects, it provides an abstraction for them.

Yes, that's what I said. However, I am pretty sure the intent of the
internal layout of std::vector was to make it compatible with old libraries
dealing with pointers into arrays and performing pointer arithmetic.
But the current Standard says it is undefined behavior doing so, because no
array exists.
Note that basic_string<>::c_str() does return a pointer into an array,
because the description explicitly guarantees it.

>
> >>>new (c + i*sizeof(T)) T;
> >>>arrayAddresses.deleteArrayRangeBecauseMemoryWasOverwritten(c + i*sizeof(T));
> >>I assume you agree this violates all common sense but are looking for
> >>the standard to back you up.
> > Honestly, I think this very^very unlikely to hit in practise, but is it
> > forbidden?
> > Please note I have never said it is a problem in practical code, I am
> > saying it is a problem in the current wording of the standard.
>
> Yes, and at the moment, I agree somewhat.
>
> >>The most convincing answer is that your statement is re-using storage of
> >>sub-objects of the array, not the storage of the array object proper, so
> >>the lifetime of the array itself is not affected.
> > No, the array itself counts as one object. If I reuse only one byte of
> > that object, or all bytes, does not seem to matter. I am reusing the
> > storage to assign it to a newly created object of type T. That ends the
> > lifetime of the array, thus all pointer arithmetic into that dead array
> > seems like undefined behavior to me.
>
> While the array itself counts as one object, it also contains other
> sub-objects (which it is the 'complete object' for).
>
> The exact same situation happens with structs and classes. So if we
> re-use the storage of a class member, does that end the lifetime of the
> encompassing class as well?

I think so, because the storage still belongs to the outermost object.
I am fairly certain one cannot overwrite the storage occupied by the first
member of a std::pair<std::vector<int>, long double> and still claim it to
be a std::pair<>.

>
> >>Failing that; 3.8/5 and 3.8/6 only limit operations on objects of
> >>'non-POD class type'. So the list of undefined operations doesn't apply
> >>to arrays and POD objects regardless.
> >>However, you've convinced me that 3.8 is confusing at best. I'm unsure
> >>what "reusing storage" actually means.
> > Probably reusing the storage in a write-manner other than the object
> > type it was created with.
>
> I was forgetting the strict aliasing rules of 3.10/15, which forbid
> most of the things you could use the 'old' lvalue for anyway.
>
> >>I don't see how a library specification would overrule the language.
> > Well, by clearly contradicting another para? Wouldn't be the first
> > diverging wording.
>
> But there's a clear separation in the standard (1.5). I'd really like to
> see examples of the library introducing new meaning to the language.

Here I agree, at least for vector, and that is what I base my opinion
about a necessary defect report on:
5.7/5 says pointer arithmetic (other than 0/1) requires an array. Nothing in
std::vector specifies it is an array (although it has internally the same
layout), so the para about the identity guarantees is useless to me, because
the expression used therein invokes undefined behavior.

>
> > General question to the language lawyers: Is the wording of 23.2.4/1
> > extending the para about pointer arithmetic (by meaning "everything in
> > the Standard is correct, so it can't contradict, it must extend"), or is
> > it simply using something that yields undefined behavior, thus the para
> > is meaningless?
>
> This is a false dichotomy. You're dismissing the possibility of the
> identity mapping as saying something about std::vector, not pointer
> arithmetic.

Hmm. The para does pointer arithmetic, doesn't it? The para is supposed to
give the guarantee of pointer arithmetic other pre-standard era libraries
can count on, isn't it? Unfortunately, there is 5.7/5.

But I admit I have a problem understanding the priorities here. The issue is:
One para says undefined behavior, the other says certain operations are
guaranteed. Which one has priority? The safe way (undefined behavior), or
the brave one (guaranteed behavior)?

>
> >>vector<> is already constrained by iterator invalidation rules and
> >>complexity guarantees so that it can only (realistically) be implemented
> >>by having it maintain an array privately, with its [] operator returning
> >>references to objects inside that private array.
> > The fun is I would bet quite a lot that most vector implementations do
> > not hold an array internally; they create object by object using the
> > allocator's construct-function. No array.
>
> But how can you construct them object-by-object in a contiguous fashion
> unless you already have an array you are constructing them into?

By allocating raw memory using the allocator, and then creating every object
directly past (that is, sizeof(T) bytes after) the previous one.
The memory layout is the same as an array, but officially it is not an
array.
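
The object-by-object construction described above can be sketched roughly as follows. This is only an illustration, not any particular library's vector implementation: the helper names make_contiguous, destroy_contiguous and is_tightly_packed are made up for this example, and it assumes a C++11 or later compiler for std::allocator_traits. Note that the sketch itself relies on expressions like first + built, so whether it is formally covered by 5.7/5 is exactly the question under discussion; it only shows the layout.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical helper: allocate raw storage for n objects and construct
// them one after another, each directly past the previous one.
template <class T>
T* make_contiguous(std::allocator<T>& alloc, std::size_t n, const T& value)
{
    typedef std::allocator_traits<std::allocator<T> > traits;
    T* first = traits::allocate(alloc, n);   // raw memory, no T objects yet
    std::size_t built = 0;
    try {
        for (; built < n; ++built)
            traits::construct(alloc, first + built, value);
    } catch (...) {
        // Undo the objects built so far, then release the storage.
        while (built > 0)
            traits::destroy(alloc, first + --built);
        traits::deallocate(alloc, first, n);
        throw;
    }
    return first;
}

// Hypothetical helper: destroy the n objects in reverse order and free the storage.
template <class T>
void destroy_contiguous(std::allocator<T>& alloc, T* first, std::size_t n)
{
    typedef std::allocator_traits<std::allocator<T> > traits;
    for (std::size_t i = n; i > 0; --i)
        traits::destroy(alloc, first + (i - 1));
    traits::deallocate(alloc, first, n);
}

// Hypothetical check: every element starts exactly sizeof(T) bytes after
// the previous one, i.e. the layout matches that of an array.
template <class T>
bool is_tightly_packed(const T* first, std::size_t n)
{
    for (std::size_t i = 1; i < n; ++i) {
        const unsigned char* prev =
            reinterpret_cast<const unsigned char*>(first + (i - 1));
        const unsigned char* cur =
            reinterpret_cast<const unsigned char*>(first + i);
        if (cur - prev != static_cast<std::ptrdiff_t>(sizeof(T)))
            return false;
    }
    return true;
}
```

A caller would pair make_contiguous(alloc, n, value) with destroy_contiguous(alloc, p, n); the layout check says nothing about whether the language considers the result an array object.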

> And since 3.8/1 says that the lifetime of [an array] begins as soon as
> storage with the proper size and alignment for it is obtained - there's
> your array.

Here you have found an interesting para. Indeed, I agree, allocating memory
seems to create a char[] (or an int[] or POD[]).
But the problem remains in my opinion; when the memory is assigned to
another object T, the lifetime ends and the array is gone. It's in the same
para, 4 lines later.

>
> >>It can't terminate and obey the identity. Assume vec is a vector<int>
> >>with size() > 0.
> >>     int* end = &vec[vec.size()-1] + 1;
> >>is valid because we're incrementing a pointer past an object.
> > Correct.
>
> >>applying the identity of 23.2.4 to &vec[], and simplifying:
> >>     int* end = &vec[0] + vec.size();
> >>Which must therefore be valid as well.
> > I don't think this is automatically guaranteed - even if the para is
> > 100% legal and not already undefined behavior.
>
> Well, if you assume that the paragraph is undefined behaviour, then the
> first statement is undefined behaviour as well. If we assume the
> paragraph is valid, then the second one must be valid as well.

Here I disagree. There is clearly 5.7/5. If 23.2.4/1 overrules 5.7/5
(repeating myself: IMHO it doesn't because the expression is already
undefined behavior), then in my opinion _only_ for operator+, and only for
n < vec.size(). Forget for a moment implementation stupidity and imagine one
that has fat pointers storing whether they point into a vector-"array" and
aborts on any operation other than operator+. Is that illegal? Is it illegal
if it terminates on calculating the one-past-end pointer?
Personally, I think you read too much into 23.2.4/1, but of course it's also
possible I am not reading enough into it.

>
> Remember that this example wasn't to show that the identity mapping
> itself is valid, it was to show that IF it is, you also can calculate
> the past-the-end pointer in the ordinary fashion. You called that into
> doubt as well.

Yes, indeed I do. I think I have learnt not to read the C++ Standard in a
way "B is guaranteed, so from that it follows automatically C and D are
guaranteed too". This might, of course, sometimes be the case, but in this
particular case I don't think so.

>
> > and the identity guarantees only apply to real objects. And they
> > apply only to operator+; nothing says other pointer arithmetic
> > expression can be used safely. I am pretty convinced explicit relaxing
> > of 5.7/5 would be needed, which is not there.
>
> So, in your view, the identity map doesn't say, for example,
>
>     &vec[0] == &vec[n] - n;      with 0 <= n < vec.size()

Yes, because that violates 5.7/5. And

T* pastEnd = &vec[0] + vec.size();

is IMHO also undefined behavior.

IOW: Can you come up with a para that clearly guarantees these expressions to
be valid, or forbids an implementation from choking on them when used?

>
> > Anyways, please note that std::vector is only a special case. It is very
> > special because it is part of the Standard, but non-Standard libraries
> > such as allocators etc. yield IMHO clearly undefined behavior. That is,
> > all the manual memory-managements I have ever seen doing some pointer
> > arithmetic are undefined behavior, unless someone can point me to a para
> > that says otherwise.
>
> But this is C++, I do think it's basically heretical to suggest that the
> standard library can do things that other libraries can not.

Well, of course it can! Standard Library Simon says, so that's the way it
is:-)

Take a look at any allocator you like that is optimized to perform better than
::operator new. Those will very probably keep internally a block of raw
memory, and assign the memory to objects on request. They do it by pointer
arithmetic, usually by adding n*sizeof(T) to the starting address of that
block of memory. Here, only 5.7/5 applies, so it is undefined behavior
(because no array exists).
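
A minimal sketch of the kind of allocator meant here, for readers who have not seen one. The class name SimplePool and its interface are invented for this example and do not correspond to any real library; it assumes T has no extended alignment requirement (so that storage from ::operator new is suitably aligned), does not check capacity * sizeof(T) for overflow, and never reuses slots. The line marked "start + n * sizeof(T)" is the arithmetic whose formal status is in question.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical fixed-capacity pool: one raw block, slots handed out in order.
template <class T>
class SimplePool
{
public:
    explicit SimplePool(std::size_t capacity)
        : block_(static_cast<unsigned char*>(
              ::operator new(capacity * sizeof(T)))),
          capacity_(capacity),
          used_(0)
    {}

    ~SimplePool() { ::operator delete(block_); }

    // Returns raw storage for one T, or a null pointer once the pool is
    // exhausted. The caller constructs the object with placement new.
    void* allocate()
    {
        if (used_ == capacity_)
            return nullptr;
        return block_ + (used_++) * sizeof(T);   // start + n * sizeof(T)
    }

    std::size_t used() const { return used_; }

private:
    SimplePool(const SimplePool&);              // non-copyable
    SimplePool& operator=(const SimplePool&);

    unsigned char* block_;    // start of the raw block
    std::size_t capacity_;    // number of T-sized slots in the block
    std::size_t used_;        // number of slots handed out so far
};
```

Typical use would be something like `T* p = new (pool.allocate()) T;` after checking the returned pointer for null, with the caller responsible for calling the destructor before the pool goes away.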

I am fairly certain this was not intended (which would make it - as far as I
know - a defect, not a proposal). Fixing std::vector is more or less only a
side-effect.

Thomas

Author: rl.news@tempest-sw.com (Ray Lischner)
Date: Tue, 22 Mar 2005 20:41:04 GMT Raw View

On Saturday 19 March 2005 02:29 am, Thomas Mang wrote:

> I am pretty sure the intent of the
> internal layout of std::vector was to make it compatible with old
> libraries dealing with pointers into arrays and performing pointer
> arithmetic. But the current Standard says it is undefined behavior
> doing so, because no array exists.

You are correct about the intent, but not about the standard. This
oversight was corrected. The current Standard, that is, ISO/IEC
14882:2003(E), says, "The elements of a vector are stored contiguously,
meaning that if v is a vector<T, Allocator> where T is some type other
than bool, then it obeys the identity &v[n] == &v[0] + n for all 0 <= n
< v.size()."
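
The quoted identity can be checked directly, and it is what lets a vector's elements be handed to pointer-based code. The sketch below is illustrative only: obeys_identity and sum_ints are hypothetical names, with sum_ints standing in for an "old library" function that knows nothing about std::vector. T must not be bool, as the quoted wording says.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: checks &v[n] == &v[0] + n for all 0 <= n < v.size().
// For an empty vector the loop body never runs, so v[0] is never evaluated.
template <class T>
bool obeys_identity(const std::vector<T>& v)
{
    for (std::size_t n = 0; n < v.size(); ++n)
        if (&v[n] != &v[0] + n)
            return false;
    return true;
}

// Hypothetical legacy-style function that only understands pointer + count.
inline long sum_ints(const int* first, std::size_t count)
{
    long total = 0;
    for (std::size_t i = 0; i < count; ++i)
        total += first[i];
    return total;
}
```

For a non-empty std::vector<int> v, a call such as sum_ints(&v[0], v.size()) is the usage the contiguity guarantee was meant to support.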
--
Ray Lischner, author of C++ in a Nutshell
http://www.tempest-sw.com/cpp

Author: squell@alumina.nl (Marc Schoolderman)
Date: Wed, 23 Mar 2005 19:47:04 GMT Raw View

Thomas Mang wrote:

[Is a contiguous series of objects an array?]
>>>I am not following you here. std::vector does, internally, most likely
>>>never create an array, still the objects are laid out in a contiguous series.
>>But std::vector is just a class. We are exposed only to its interface,
>>and the contract on that interface. A std::vector "isn't" a contiguous
>>series of objects, it provides an abstraction for them.
> Yes, that's what I said. However, I am pretty sure the intent of the
> internal layout of std::vector was to make it compatible with old libraries
> dealing with pointers into arrays and performing pointer arithmetic.

Yes, and 'low level' libraries in general.

> But the current Standard says it is undefined behavior doing so, because no
> array exists.

But, we agreed vector is just an abstraction. So the fact that we have a
vector does not exclude the possibility of there being an array.

>>The exact same situation happens with structs and classes. So if we
>>re-use the storage of a class member, does that end the lifetime of the
>>encompassing class as well?
> I think so, because the storage still belongs to the outermost object.
> I am fairly certain one cannot overwrite the storage occupied by the first
> member of a std::pair<std::vector<int>, long double> and still claim it to
> be a std::pair<>.

But what if you re-create a std::vector<int> in its place? You're
allowed to do that if the vector had been a complete object; would it be
disallowed to do this inside a std::pair?

I'm going this route because you see a problem with a vector *formally*
creating an array, because of the object lifetime rules. Do you agree
that if you can re-use the storage of a subobject without ending the
lifetime of the complete object, that objection would fall?

> But I admit I have a problem understanding the priorities here. The issue is:
> One para says undefined behavior, the other says certain operations are
> guaranteed. Which one has priority? The safe way (undefined behavior), or
> the brave one (guaranteed behavior)?

I'm repeating myself here, but this is a false dichotomy. There are
alternatives where both statements do not contradict each other.

>>And since 3.8/1 says that the lifetime of [an array] begins as soon as
>>storage with the proper size and alignment for it is obtained - there's
>>your array.
> Here you have found an interesting para. Indeed, I agree, allocating memory
> seems to create a char[] (or an int[] or POD[]).

Or "non-POD[]"! I think this is very significant, because this implies
that the lifetime of a complete object is different from its subobjects.

After all, the array lifetime starts whenever its storage is obtained,
but the lifetimes of its non-POD subobjects only start as soon as their
individual constructor calls have completed.

>>Well, if you assume that the paragraph is undefined behaviour, then the
>>first statement is undefined behaviour as well. If we assume the
>>paragraph is valid, then the second one must be valid as well.
> Here I disagree. There is clearly 5.7/5. If 23.2.4/1 overrules 5.7/5
> (repeating myself: IMHO it doesn't because the expression is already
> undefined behavior),

But if the expression is undefined behaviour, then the first one
(not repeated here) was undefined behaviour as well.

> Personally, I think you read too much into 23.2.4/1, but of course it's also
> possible I am not reading enough into it.

Well, I think the identity mapping is a neat way to formally solve the
issue. If they had explicitly put in wording that &vec[0] points to the
first element of an array, there would be all kinds of loopholes and
questions. How big is the array? Should it contain .capacity() elements?
What effect does .reserve() have on it? And so on.

The trick is that once you have an identity map which is axiomatically
valid, you can apply it to other statements to ascertain if they are valid.

>>So, in your view, the identity map doesn't say, for example,
>>    &vec[0] == &vec[n] - n;      with 0 <= n < vec.size()
> Yes, because that violates 5.7/5. And

And what about (applying the identity)

    &vec[0] == &vec[0] + n - n;     with 0 <= n < vec.size()

>>But this is C++, I do think it's basically heretical to suggest that the
>>standard library can do things that other libraries can not.
> Well, of course it can! Standard Library Simon says, so that's the way it
> is:-)

What section of the Standard is he in? ;)

Anyway, here's the catch. If you assume that the Standard library can
circumvent ordinary language restrictions, then it is also very easy to
assume that it can work some magic and create arrays where you think
that they could ordinarily not be.

This doesn't solve the object lifetime questions (which are more
interesting, IMHO!), but it does provide more than enough room for
23.2.4/1 to be valid, either way.

~Marc

Author: squell@alumina.nl (Marc Schoolderman)
Date: Mon, 28 Feb 2005 22:20:20 GMT Raw View

[Somehow posting to the newsgroup itself doesn't work for me]

Thomas Mang wrote:

> Thanks, missed that. However, incrementing the pointer twice (without
> creating an object) is undefined behavior (even if it points to valid
> memory), and destruction has to appear in the reverse order - otherwise the
> first element will be destroyed first, then the pointer doesn't point to a
> valid object any more. The list of actions that can be done with those
> pointers in 3.8/5 does not include arithmetic operations (although one can
> dereference the pointer), so it's undefined behavior.
> Correct, or did I miss something again?

I believe your assumption that initPointer does not point to a valid
array is incorrect. Consider:

> void * Mem = ::operator new(sizeof(Test) * arraySize);

3.7.3.1/2 "The pointer returned shall be suitably aligned so that it can
be converted to a pointer of any complete object type and then used to
access the object or array in the storage allocated."

So Mem is properly aligned and when converted by the static_cast<>,
points to (storage for) arraySize objects.

You also seem worried that incrementing a pointer is only valid when the
storage it points to contains a valid object. Intuitively, that implies
pointers get dereferenced when you do math on them, which in turn would
mean using past-the-end values would be illegal (since you can't
dereference those).

3.8/5 seems to codify this - it limits using pointers to objects whose
lifetime has not begun yet. Pointer arithmetic doesn't seem to be
restricted.

~Marc

Author: nospam@nospam.ucar.edu ("Thomas Mang")
Date: Thu, 3 Mar 2005 06:47:11 GMT Raw View

I am throwing in another case:


In the 2003 revision, 23.2.4/1 was modified to guarantee vector elements are
stored contiguously, to make vector compatible with older libraries dealing
with pointers into _arrays_.

As far as I read the Standard, whenever those libraries do pointer
arithmetic other than 0 / 1, the behavior is undefined, because again there
is simply no array. The current wording of the Standard however does pointer
arithmetic; but it talks about identity guarantees, not pointer arithmetic
guarantees.
Does this suffice to make the current wording defined behavior? Or does it
make use of an expression that is undefined behavior?

Of course, I am not questioning the intent of that para, I am questioning
the legality of the expression used therein and very likely within many
libraries. std::vector is simply not an array in the technical sense,
therefore pointer arithmetic (other than 0 / 1) means all bets are off.

Unless I am missing something that makes the current situation defined as it
was intended, my impression is that pointer arithmetic should not be
restricted to arrays, but should be allowed for all contiguous series of
objects of the same type.
Your thoughts?


Thomas


Author: "msalters" <Michiel.Salters@logicacmg.com>
Date: Thu, 3 Mar 2005 14:56:02 CST Raw View

"Thomas Mang" wrote:
> Just another issue (where the whole idea came from):
>
>
> Suppose you write a memory pool - one that preallocates some chunk of
> memory and assigns it to objects if they are created (overloading new).
>
> Then my impression is:
>
> situation 1) I use ::operator new to obtain the memory.
>
> Then indexing into the chunk is not possible, except indices 0 and 1
> (1 if there is an element at index 0) because there are no arrays, and
> adding more than 1 is undefined.
> However, I can access the first free spot by always incrementing a
> pointer by one - e.g. jumping from object to the next object (all
> objects represent a "one-element-array", and therefore increment can be
> used). This requires all objects to be of the same type.

No. You have a valid char* pointer, enough memory, so the only concern
is whether the char* is properly aligned for the type you're creating.
This will always be the case if you're creating only objects of a
single type, but that's because they will be objects with the same
sizeof() and thus the same alignment. Any other type with the same
sizeof() can be substituted.

> Casting to char* etc. and using that for indexing is not possible,
> there are formally no char-elements / no char-array.

Huh? I've got no idea what you're talking about. You can use char* for
indexing; a char* can hold any address.

> situation 2) I allocate the memory using char[]
>
> Is indexing into that array [sizeof(T) * index] possible? I don't
> think so, because when the memory is overwritten, the lifetime of the
> char ends (although the bits still are a valid char) - and therefore I
> don't have elements of an array any more.

Wrong, for a number of reasons. To name a few: lifetime doesn't matter
for chars, any memory can be accessed by chars, you're not using the
chars but instead the storage they used to occupy.
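
The claim that any memory can be accessed through chars can be illustrated with a small sketch. The helper name copy_bytes is made up for this example; it reads the object representation of an arbitrary object through an unsigned char pointer, which is the access the aliasing rules of 3.10/15 permit. Copying the bytes back into an object of the same type is only meaningful for POD types, which is an assumption of the usage shown below the code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies the bytes that make up `source` into `out`,
// reading them one at a time through an unsigned char pointer.
// `out` must point to at least sizeof(T) writable bytes.
template <class T>
void copy_bytes(const T& source, unsigned char* out)
{
    const unsigned char* bytes =
        reinterpret_cast<const unsigned char*>(&source);
    for (std::size_t i = 0; i < sizeof(T); ++i)
        out[i] = bytes[i];
}
```

For a POD such as int, filling a buffer with copy_bytes and then using std::memcpy to move the buffer into another int reproduces the original value.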

> Furthermore, an implementation is allowed to add, for a call to the
> allocation function, an array-overhead to all array-new expressions -
> probably for non-PODs / types with destructors with side effects.
> However, there is nothing that forbids adding an overhead for a char[].
> The problem is if this overhead causes an overflow for std::size_t,
> bets are off (the allocation function will probably return less memory
> than needed. Just tried it out, the program crashed nicely, although
> the memory needed to hold n objects of type T did not exceed
> std::numeric_limits<std::size_t>::max() ).

Is this a problem in the real world, where size_t is much larger than
available memory anyway? In theory, it's not an issue because an
implementation can just document the maximum object size such that
new[] won't overflow. (An array counts as one object.)
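
For code that wants to guard against the wrap-around itself rather than rely on implementation limits, one possible approach is to check the multiplication before calling the allocation function. The sketch below is an assumption-laden illustration: allocate_checked is a hypothetical helper, and it calls ::operator new directly, so no array-new overhead is involved at all.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <new>

// Hypothetical helper: allocates raw storage for n objects of elem_size
// bytes each, refusing requests whose total byte count would not fit in
// std::size_t instead of silently wrapping around.
inline void* allocate_checked(std::size_t n, std::size_t elem_size)
{
    const std::size_t limit = std::numeric_limits<std::size_t>::max();
    if (elem_size != 0 && n > limit / elem_size)
        throw std::bad_alloc();          // n * elem_size would overflow
    return ::operator new(n * elem_size);
}
```

The storage is released with ::operator delete, and objects are created in it with placement new.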

HTH,
Michiel Salters

Author: nospam@nospam.ucar.edu ("Thomas Mang")
Date: Sun, 27 Feb 2005 22:02:35 GMT Raw View

Greetings,


Consider the following program (stripped down to the relevant parts):

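#include <cstddef>   // std::size_t
#include <new>       // ::operator new, placement new

struct Test
{
    Test() {}
    ~Test() {}
};

int main()
{
    std::size_t arraySize = 10;
    // Raw, uninitialized storage large enough for arraySize Test objects.
    void * Mem = ::operator new(sizeof(Test) * arraySize);

    Test * initPointer = static_cast<Test*>(Mem);
    for (std::size_t i = 0; i < arraySize; ++i)
    {
        new (initPointer) Test;      // construct a Test in place
        ++initPointer;               // #1
    }
    // cleanup
}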


Ignore destruction of the Test-objects, exception handling, possible usage
of a raw_storage_iterator etc.

What I am interested in: Is this code undefined behavior?

I think it is because of the ++initPointer expression.
++Pointer increments a pointer by one (Pointer += 1), which means Pointer =
Pointer + 1. For the arithmetic operator+, the Standard says in 5.7/5:
"... If the pointer operand points to an element of an array object".

Here, initPointer does not point to an array object, since there is no
array.

Is my conclusion correct and this program yields undefined behavior?


Thank you,

Thomas


Author: kuyper@wizard.net
Date: Sun, 27 Feb 2005 22:16:36 CST Raw View

"Thomas Mang" wrote:
> Greetings,
>
>
> Consider the following program (stripped down to the relevant parts):
>
> struct Test
> {
> Test();
> ~Test();
> };
>
> int main()
> {
> std::size_t arraySize = 10;
> void * Mem = ::operator new(sizeof(Test) * arraySize);
>
> Test * initPointer = static_cast<Test*>(Mem);
> for (std::size_t i = 0; i < arraySize; ++i)
> {
>     new (initPointer) Test;
>     ++initPointer;               // #1
> }
> // cleanup
> }
>
>
> Ignore destruction of the Test-objects, exception handling, possible
> usage of a raw_storage_iterator etc.
>
> What I am interested in: Is this code undefined behavior?
>
> I think it is because of the ++initPointer expression.
> ++Pointer increments a pointer by one (Pointer += 1), which means
> Pointer = Pointer + 1. For the arithmetic operator+, the Standard says
> in 5.7/5:
> "... If the pointer operand points to an element of an array object".
>
> Here, initPointer does not point to an array object, since there is
> no array.
>
> Is my conclusion correct and this program yields undefined behavior?

That is not correct. See 5.7p4: "For the purposes of these operators, a
pointer to a non-array object behaves the same as a pointer to the
first element of an array of length one with the type of the object as
its element type."

It's perfectly legal to increment a pointer to the last element of an
array; the result is called a one-past-the-end pointer. Such a pointer
cannot be safely dereferenced or incremented, but it can be
decremented, compared with other pointers for equality, and it can be
compared with a pointer at the object or one-past-the-end for relative
order.
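
The operations listed above can be shown on a single, non-array object. This is a small sketch; the function name one_past_end_demo is invented for the example, and the int stands in for any object type.

```cpp
#include <cassert>

// Hypothetical demo: a lone int is treated as an array of length one
// for the purposes of pointer arithmetic.
inline bool one_past_end_demo()
{
    int object = 42;
    int* first = &object;
    int* past  = first + 1;          // one-past-the-end: fine to form

    // The past-the-end pointer may be compared, but not dereferenced
    // or incremented again.
    bool ordered = first < past;
    bool differs = first != past;

    --past;                          // decrementing is allowed
    return ordered && differs && past == first && *past == 42;
}
```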

Author: nospam@nospam.ucar.edu ("Thomas Mang")
Date: Mon, 28 Feb 2005 12:53:09 GMT Raw View

<kuyper@wizard.net> wrote in message
news:1109561763.636738.327210@o13g2000cwo.googlegroups.com...
>
> "Thomas Mang" wrote:
> > Greetings,
> >
> >
> > Consider the following program (stripped down to the relevant parts):
> >
> > struct Test
> > {
> > Test();
> > ~Test();
> > };
> >
> > int main()
> > {
> > std::size_t arraySize = 10;
> > void * Mem = ::operator new(sizeof(Test) * arraySize);
> >
> > Test * initPointer = static_cast<Test*>(Mem);
> > for (std::size_t i = 0; i < arraySize; ++i)
> > {
> >     new (initPointer) Test;
> >     ++initPointer;               // #1
> > }
> > // cleanup
> > }
> >
> >
> > Ignore destruction of the Test-objects, exception handling, possible
> > usage of a raw_storage_iterator etc.
> >
> > What I am interested in: Is this code undefined behavior?
> >
> > I think it is because of the ++initPointer expression.
> > ++Pointer increments a pointer by one (Pointer += 1), which means
> > Pointer = Pointer + 1. For the arithmetic operator+, the Standard says
> > in 5.7/5:
> > "... If the pointer operand points to an element of an array object".
> >
> > Here, initPointer does not point to an array object, since there is
> > no array.
> >
> > Is my conclusion correct and this program yields undefined behavior?
>
> That is not correct. See 5.7p4: "For the purposes of these operators, a
> pointer to a non-array object behaves the same as a pointer to the
> first element of an array of length one with the type of the object as
> its element type."


Thanks, missed that. However, incrementing the pointer twice (without
creating an object) is undefined behavior (even if it points to valid
memory), and destruction has to appear in the reverse order - otherwise the
first element will be destroyed first, then the pointer doesn't point to a
valid object any more. The list of actions that can be done with those
pointers in 3.8/5 does not include arithmetic operations (although one can
dereference the pointer), so it's undefined behavior.
Correct, or did I miss something again?
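
For reference, a sketch of the construct-then-destroy sequence being discussed, with destruction in reverse order of construction (the same order delete[] uses for a real array). The names Tracked and build_and_tear_down are invented for this example. The sketch uses first + built and first[--built], which are exactly the expressions whose formal validity is being asked about, so it shows the pattern rather than settling the question.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical type that records the order in which objects are destroyed.
struct Tracked
{
    static std::vector<int>& log()
    {
        static std::vector<int> entries;
        return entries;
    }

    int id;
    explicit Tracked(int i) : id(i) {}
    ~Tracked() { log().push_back(id); }
};

// Hypothetical helper: constructs `count` objects in raw storage, front to
// back, then destroys them back to front and releases the storage.
inline void build_and_tear_down(std::size_t count)
{
    void* mem = ::operator new(sizeof(Tracked) * count);
    Tracked* first = static_cast<Tracked*>(mem);

    std::size_t built = 0;
    for (; built < count; ++built)
        new (first + built) Tracked(static_cast<int>(built));

    while (built > 0)
        first[--built].~Tracked();   // last constructed, first destroyed

    ::operator delete(mem);
}
```

After build_and_tear_down(3), Tracked::log() holds the ids in the order 2, 1, 0.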

Thomas


Author: nospam@nospam.ucar.edu ("Thomas Mang")
Date: Mon, 28 Feb 2005 12:54:15 GMT Raw View

Just another issue (where the whole idea came from):


Suppose you write a memory pool - one that preallocates some chunk of memory
and assigns it to objects if they are created (overloading new).

Then my impression is:

situation 1) I use ::operator new to obtain the memory.

Then indexing into the chunk is not possible, except indices 0 and 1 (1 if
there is an element at index 0) because there are no arrays, and adding more
than 1 is undefined.
However, I can access the first free spot by always incrementing a pointer
by one - e.g. jumping from object to the next object (all objects represent a
"one-element-array", and therefore increment can be used). This requires all
objects to be of the same type.

Casting to char* etc. and using that for indexing is not possible, there are
formally no char-elements / no char-array.


situation 2) I allocate the memory using char[]

Is indexing into that array [sizeof(T) * index] possible? I don't think so,
because when the memory is overwritten, the lifetime of the char ends
(although the bits still are a valid char) - and therefore I don't have
elements of an array any more. At least contiguity is broken, IMO even the
whole array status.

Furthermore, an implementation is allowed to add, for a call to the
allocation function, an array-overhead to all array-new expressions -
probably for non-PODs / types with destructors with side effects. However,
there is nothing that forbids adding an overhead for a char[]. The problem
is that if this overhead causes an overflow for std::size_t, all bets are
off (the allocation function will probably return less memory than needed.
I just tried it out: the program crashed nicely, although the memory needed
to hold n objects of type T did not exceed
std::numeric_limits<std::size_t>::max() ).
I fail to see how to determine the maximum array-overhead:
-) An implementation is not required to document it.
-) it may vary during execution time.

Something like max_align or 2^8 will likely work, but that is not guaranteed.


Is the travelling-forward-one-object-by-one the only portable solution?


Thomas





