Topic: Container swapping and iterator invalidation
Author: Greg Falcon <veloso@verylowsodium.com>
Date: Thu, 3 Sep 2009 13:10:39 CST Raw View
I have a question about the rationale of the C++ spec with regards to
container swapping and iterator invalidation. It turns out the spec
guarantees very slightly less than what I needed, and I'm a bit
surprised by it all.
Here's a simplified example demonstrating the problem I hit. The
following template class that provides an unusual interface for
iterating over standard containers. In particular, consider its
swap() method:
template <typename Container>
class Walker : boost::noncopyable {
Container store;
typename Container::const_iterator position;
public:
typedef typename Container::value_type value_type;
Walker(const Container& c = Container())
: store(c), position(store.begin()) {}
void swap(Walker& rhs) {
store.swap(rhs.store);
std::swap(position, rhs.position);
}
bool at_end() const { return position == store.end(); }
const value_type& next() { return *(position++); }
};
I am swapping two STL containers, and iterators into them, and
blithely assuming things are going to work. I had carelessly written
swap() without even considering this issue, but testing didn't reveal
any problems. I only looked closely after the occasional crashes
revealed themselves.
To be fair, I didn't know about the following clause in C++03 when I
wrote the offending code. But at first glance, it appears to condone
my
behavior:
23.1 [lib.container.requirements] paragraph 10:
no swap() function invalidates any references, pointers, or
iterators referring to the elements of the containers being swapped.
The problem here is when position == store.end(). In that case, the
iterator 'position' does not refer to an element in 'store', and so
I'm not covered by 23.1. The above clause does not protect
past-the-end iterators into a swap()ped container.
Now I would have written this off as a mere oversight in the spec --
surely end() iterators should be covered, too -- but it turns out not
to be so, as this corner case was the cause of the crash. I can
expose it easily enough in g++:
int main() {
Walker<std::set<int> > a, b;
assert(a.at_end() && b.at_end()); // okay
a.swap(b);
assert(a.at_end() && b.at_end()); // ASSERTION FAILURE
}
Now, knowing this one corner case, I can fix swap() to work around it,
but my attempt requires some boilerplate whose motivation will be
completely lost on a reader not attuned to this issue.
So clearly, g++ is complying with the spec here, and the problem here
is due to my own programming error. Furthermore, looking at the
libg++ red-black tree implementation, I understand why end() iterators
don't survive a swap(). But I can't help but feel that this is a very
nasty trap.
Why does the standard not require past-the-end iterators to survive a
swap as well? I'm having trouble imagining this extra requirement
imposing much, if any, of a size or performance burden on containers.
Is there something I'm missing, or is this just an unfortunate
oversight?
Greg F
--
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@netlab.cs.rpi.edu]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ]
Author: David Abrahams <dave@boostpro.com>
Date: Fri, 4 Sep 2009 05:18:41 CST Raw View
on Thu Sep 03 2009, Greg Falcon <veloso-AT-verylowsodium.com> wrote:
> Why does the standard not require past-the-end iterators to survive a
> swap as well? I'm having trouble imagining this extra requirement
> imposing much, if any, of a size or performance burden on containers.
> Is there something I'm missing, or is this just an unfortunate
> oversight?
IIRC it /is/ because of combined size and performance considerations.
The cost of an empty container is hugely important in some domains.
Let's take std::list for example. A typical implementation looks like
this:
+------------------+
| list<T> |
+==================+
| |
| +------------+ | +------------+ +------------+
| |node_base<T>|<----+ +-->| node<T> |--+ +-->| node<T> |<----+
| +============+ | | | +============+ | | +============+ |
| | next o----------+ | next o-------+ | next o----+ |
| +------------+ | | +------------+ | +------------+ | |
| | o prev | | +--------o prev | +--------o prev | | |
| +-|----------+ | +------------+ +------------+ | |
| | ^ | | T | | T | | |
+----|---|---------+ +------------+ +------------+ | |
| | | |
| +----------------------------------------------------------+ |
| |
+-----------------------------------------------------------------+
The past-the-end iterator points to the "partial node" inside the list
object. The rest of the nodes can be swapped with another list, but the
partial node referred to by the end iterator is permanently owned by
this list. To get what you want, the partial node would have to be
allocated outside the list structure.
The most obvious way to do that would be to have the list consist
entirely of a pointer to that partial node, which would be separately
allocated (I believe the Dinkumware implementation does it that way).
However, that would require a dynamic memory allocation even for an
empty list, which is unacceptable for some applications.
Why can't you just use a NULL pointer for an empty list and store NULL
in the iterators returned by begin() and end()? Well, then push_back on
an empty list would *have to* create a new partial node outside the
list, and that would invalidate the end() iterator, which isn't allowed.
Why can't you use NULL for a universal list end() iterator? There can
be no universal list end iterator because the end() iterator has to be
able to be decremented. Which list would it then refer to?
So you could store a pointer to the list inside each iterator, and have
decrement on an end() iterator go to the last node of the list. But now
you've doubled the size of the iterator and made both increment and
decrement a lot more expensive.
So, unless you're willing to dynamically allocate a partial node even
for an empty list (and not everyone is), there are few options with good
performance.
--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com
--
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@netlab.cs.rpi.edu]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ]
Author: CornedBee <wasti.redl@gmx.net>
Date: Fri, 4 Sep 2009 12:43:42 CST Raw View
On Sep 3, 9:10 pm, Greg Falcon <vel...@verylowsodium.com> wrote:
> Why does the standard not require past-the-end iterators to survive a
> swap as well? I'm having trouble imagining this extra requirement
> imposing much, if any, of a size or performance burden on containers.
> Is there something I'm missing, or is this just an unfortunate
> oversight?
The problem is that the end iterator of a linked list must be
decrementable, i.e. it must store a pointer to the last element. For
this reason, you cannot simply use a null pointer as the end iterator.
You have to point to a node.
Because this node must exist even for empty containers, this implies a
heap allocation on default construction of a list, if you want to
allocate this node on the heap. Heap allocations in the default
constructor are very much frowned upon by some people.
The alternative is to make the end node a member of the list itself.
This avoids the heap allocation; however, it means that the end
iterator points into the container, and so a swap of the pointers in
the container doesn't change the container the iterators point to. So
you have the choice between heap allocation in the default
constructor, or a swap that preserves validity of end iterators.
Apparently, the standards committee went for the non-allocating
default constructor.
Sebastian
--
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@netlab.cs.rpi.edu]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ]
Author: Nick Hounsome <nick.hounsome@googlemail.com>
Date: Sat, 5 Sep 2009 11:09:05 CST Raw View
On 3 Sep, 20:10, Greg Falcon <vel...@verylowsodium.com> wrote:
> I have a question about the rationale of the C++ spec with regards to
> container swapping and iterator invalidation. It turns out the spec
> guarantees very slightly less than what I needed, and I'm a bit
> surprised by it all.
>
> Here's a simplified example demonstrating the problem I hit. The
> following template class that provides an unusual interface for
> iterating over standard containers. In particular, consider its
> swap() method:
>
> template <typename Container>
> class Walker : boost::noncopyable {
> Container store;
> typename Container::const_iterator position;
> public:
> typedef typename Container::value_type value_type;
> Walker(const Container& c = Container())
> : store(c), position(store.begin()) {}
> void swap(Walker& rhs) {
> store.swap(rhs.store);
> std::swap(position, rhs.position);
> }
> bool at_end() const { return position == store.end(); }
> const value_type& next() { return *(position++); }
> };
>
> I am swapping two STL containers, and iterators into them, and
> blithely assuming things are going to work. I had carelessly written
> swap() without even considering this issue, but testing didn't reveal
> any problems. I only looked closely after the occasional crashes
> revealed themselves.
>
> To be fair, I didn't know about the following clause in C++03 when I
> wrote the offending code. But at first glance, it appears to condone
> my
> behavior:
>
> 23.1 [lib.container.requirements] paragraph 10:
> no swap() function invalidates any references, pointers, or
> iterators referring to the elements of the containers being swapped.
>
> The problem here is when position == store.end(). In that case, the
> iterator 'position' does not refer to an element in 'store', and so
> I'm not covered by 23.1. The above clause does not protect
> past-the-end iterators into a swap()ped container.
>
> Now I would have written this off as a mere oversight in the spec --
> surely end() iterators should be covered, too -- but it turns out not
> to be so, as this corner case was the cause of the crash. I can
> expose it easily enough in g++:
>
> int main() {
> Walker<std::set<int> > a, b;
> assert(a.at_end() && b.at_end()); // okay
> a.swap(b);
> assert(a.at_end() && b.at_end()); // ASSERTION FAILURE
> }
>
> Now, knowing this one corner case, I can fix swap() to work around it,
> but my attempt requires some boilerplate whose motivation will be
> completely lost on a reader not attuned to this issue.
>
> So clearly, g++ is complying with the spec here, and the problem here
> is due to my own programming error. Furthermore, looking at the
> libg++ red-black tree implementation, I understand why end() iterators
> don't survive a swap(). But I can't help but feel that this is a very
> nasty trap.
>
> Why does the standard not require past-the-end iterators to survive a
> swap as well? I'm having trouble imagining this extra requirement
> imposing much, if any, of a size or performance burden on containers.
> Is there something I'm missing, or is this just an unfortunate
> oversight?
>
> Greg F
When I first read this I was surprised - it's definitely a gotcha -
but on reflection
IMHO the "fault" is in requiring ANY iterator to be valid after a swap
of containers.
--
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@netlab.cs.rpi.edu]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ]