Topic: Definition of valid/invalid iterators (was Defect Report:splicing invalidates iterators)


Author: kanze@gabi-soft.de
Date: 2000/08/07
Raw View
smeyers@aristeia.com (Scott Meyers) writes:

|>  On Tue, 18 Jul 2000 03:28:02 CST, Brian Parker wrote:
|>  > On Sat, 15 Jul 2000 07:01:31 CST, "Bill Wade" <bill.wade@stoner.com=
>
|>  > wrote:
|>  > >You have to be careful about what you read into invalid/valid.  A
|>  > >container class's iterator is valid if it points into some
|>  > >container's [begin,end] range.

|>  > Yes, the key point here being that when one speaks of a valid itera=
tor
|>  > one is implicitly speaking of a valid iterator *into a specified
|>  > container* (it is always well-defined which container the iterator
|>  > currently points into as it is induced by the owning container of t=
he
|>  > element pointed to).

|>  The notion of what it means for an iterator to be "valid" or
|>  "invalid" is a tricky one, in my experience.  Last December, I
|>  started a thread ("Iterator non-invalidation and resized hash
|>  tables") that ultimately led to my current working definition of a
|>  "valid" iterator, which is basically this: a valid iterator points
|>  to some element of a container, and when you dereference that
|>  iterator, you get the value you expect to get. =20

I think more is implied.  Suppose that dereferencing gives you the value
you expect, but ++ causes a core dump.  Or simply restarts at the
beginning of the container.  Consider the iterator for list: deleting an
arbitrary element does not invalidate it.  Suppose I delete the element
immediately before the designated element.  One could argue that the
"expected" results of -- is to designate the element which was seen
before the last ++, but of course, this element is no longer in the
list.  (Which of course means that we have to define what is expected.
I certainly wouldn't expect the iterator to designate an element not in
the list, so expectations based on what was previously seen are probably
not valid.)

Lucky for us that there are no collections in the standard without any
ordering.  Resizing a hash_set, for example, might not "invalidate" the
iterator in a classical sense, but could result in direct iteration
visiting the same element twice, and other elements not at all.  Should
we say that such a resizing invalidates the iterator, or not.

--=20
James Kanze                               mailto:kanze@gabi-soft.de
Conseils en informatique orient=E9e objet/
                   Beratung in objektorientierter Datenverarbeitung
Ziegelh=FCttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]






Author: brianjparker@ozemail.com.au (Brian Parker)
Date: 2000/08/01
Raw View
On Thu, 27 Jul 2000 03:18:46 CST, smeyers@aristeia.com (Scott Meyers)
wrote:

>On Thu, 20 Jul 2000 00:07:04 CST, Brian Parker wrote:
>> might not crash if you dereference it is irrelevant. So, for example,
>> when an element is appended to a vector all iterators are invalidated
>> *by definition*, although it is only if a particular implementation
>
>No, they are invalidated only if capacity() == size() at the time of the
>append operation.  Similarly, if capacity() > size(), insertions into a
>vector invalidate only iterators pointing beyond the insertion point.
>See 23.2.4.3/1.

Oops, you're right of course. In fact, a better (correct) example
would be the second case you describe- an insertion is *defined* to
invalidate all  iterators past the insertion point even though in an
actual implementation the iterators will still point to (different)
elements after the insertion.

Thanks,
Brian Parker
(brianp@research.canon.com.au)

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]






Author: smeyers@aristeia.com (Scott Meyers)
Date: 2000/07/27
Raw View
On Thu, 20 Jul 2000 00:07:04 CST, Brian Parker wrote:
> might not crash if you dereference it is irrelevant. So, for example,
> when an element is appended to a vector all iterators are invalidated
> *by definition*, although it is only if a particular implementation

No, they are invalidated only if capacity() == size() at the time of the
append operation.  Similarly, if capacity() > size(), insertions into a
vector invalidate only iterators pointing beyond the insertion point.
See 23.2.4.3/1.

Scott

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]






Author: "Bill Wade" <bill.wade@stoner.com>
Date: 2000/07/19
Raw View
"Scott Meyers" <smeyers@aristeia.com> wrote

>   I tried to find a description of what it means to invalidate an
iterator,
>   and I was surprised to find that none of the following sources define
the
>   term: [ deleted ]

24.1/5 describes singular and non-singular iterators.

I believe the standard uses 'valid' as a synonym for non-singular.  In
particular an iterator can remain valid and point to a new container (after
vector::swap) or a new value (after list::sort).

It may be that someone intended that when an iterator is not invalidated by
an outside operation, that it point to the previous location (&*iterator
remains unchanged).  AFAICT it is reasonable to implement the
std::Containers' iterators that way.  However, I don't see that as a
requirement.

Consider deque implemented in terms of a vector of pointers to individual T
objects (deque page size is 1).  Deque::iterator is implemented in terms of
a pointer into the vector.  std::sort(deque<T>) is specialized to only move
pointers, not T objects.  Is this legal?  Existing iterators remain
non-singular.  &*iterator changes.  With my definition (a non-singular
iterator is valid) this implementation is legal.



---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]






Author: brianjparker@ozemail.com.au (Brian Parker)
Date: 2000/07/20
Raw View
On Tue, 18 Jul 2000 21:28:46 CST, smeyers@aristeia.com (Scott Meyers)
wrote:

>On Tue, 18 Jul 2000 03:28:02 CST, Brian Parker wrote:
>> On Sat, 15 Jul 2000 07:01:31 CST, "Bill Wade" <bill.wade@stoner.com>
>> wrote:
>> >You have to be careful about what you read into invalid/valid.  A container
>> >class's iterator is valid if it points into some container's [begin,end]
>> >range.
>>
>> Yes, the key point here being that when one speaks of a valid iterator
>> one is implicitly speaking of a valid iterator *into a specified
>> container* (it is always well-defined which container the iterator
>> currently points into as it is induced by the owning container of the
>> element pointed to).
>
>The notion of what it means for an iterator to be "valid" or "invalid" is a
>tricky one, in my experience.  Last December, I started a thread ("Iterator
>non-invalidation and resized hash tables") that ultimately led to my
>current working definition of a "valid" iterator, which is basically this:
>a valid iterator points to some element of a container, and when you
>dereference that iterator, you get the value you expect to get.

My understanding is as follows- a valid iterator points into a
container- it can either point at a container's element (a
dereferenceable valid iterator) or at a container's past-the-end value
(a non-dereferenceable valid iterator).  An iterator that does not
point into a container (i.e. is invalid) is defined to have a singular
value. Therefore, when an iterator is invalidated, it assumes a
singular value by definition.

(Note that it is a matter of *definition* whether an iterator is
invalidated and hence has a singular value. Whether a particular
implementation leaves an iterator pointing at a block of memory that
might not crash if you dereference it is irrelevant. So, for example,
when an element is appended to a vector all iterators are invalidated
*by definition*, although it is only if a particular implementation
decides to reallocate memory at that moment that the iterators truly
cannot be dereferenced, but this implementation detail is irrelevant-
the iterators are still invalid.)

As you say, the standard's wording is a little poor in this area- here
are the definitions from the SGI STL site, which I believe capture the
original intent of the STL designers.

"
Definitions

An iterator is past-the-end if it points beyond the last element of a
container. Past-the-end values are nonsingular and nondereferenceable.

An iterator is valid if it is dereferenceable or past-the-end.

An iterator i is incrementable if there is a "next" iterator, that is,
if ++i is well-defined. Past-the-end iterators are not incrementable.

An Input Iterator j is reachable from an Input Iterator i if, after
applying operator++ to i a finite number of times, i == j. [1]

The notation [i,j) refers to a range of iterators beginning with i and
up to but not including j.

The range [i,j) is a valid range if both i and j are valid iterators,
and j is reachable from i [2].
"

> This
>definition overlooks the valid iterator corresponding to a container's end,

Yes, your definition really only defines a valid dereferenceable
iterator.

>but my definition is designed to appeal to the intuition of new STL
>programmers.  It's also designed to be consistent with Andrew Koenig's
>12/8/99 posting on the topic, which was terse, but, to me, quite helpful:
>
>  In article <MPG.12b7a174f83ea1b989707@news.teleport.com>,
>  Scott Meyers <smeyers@aristeia.com> wrote:
>
>  >  - I don't understand what it means to invalidate an iterator.  Could it
>  >    mean that my iterator continues to point to the same element it did
>  >    before, but all bets are off about what elements I'll see if I continue
>  >    my traversal of the hash table?
>
>  Yes.  But that doesn't invalidate the iterator -- you can still
>  dereference it and get the same element you did before.
>

There are actually two distinct iterator concepts here which are
distinguished in the SGI STL documentation, but not in the standard
documentation.

(1) The simplest concept is what I call a  trivial container iterator
that is used to simply access an element of a container- it
essentially has only * and = defined.
(this is a slightlty stricter definition than the "trivial iterator"
of the SGI STL documentation in that it must point into a container)

(2) The various sequence iterators defined in the STL are, in fact,
refinements of the trivial container iterator concept and add the
"move to next element" operations ++ et al.

All of the iterators in the STL do double duty- sometimes they are
used simply as trivial iterators e.g. to indicate an element to erase
from a list, and at other times they are used to iterate through a
sequence.

So, given these two distinct concepts, there are actually two ways in
which an iterator can be "invalidated"-
(1) an STL iterator may no longer be able to continue iterating
through a sequence i.e. it can only be used as a trivial iterator to
locate an element. There is no defined term for this form of
"invalidation".
(2) an STL iterator may no longer even be useable as a trivial
iterator i.e. it cannot be dereferenced to get a element and is not an
of-the-end iterator for a given container.  This is the meaning of
"invalid" as used in the standard.

(Note, the reason that trivial iterators aren't explicity used as an
additional concept in the STL is that the containers there don't have
a critical need to distinguish them. Examples of data structures where
pure trivial iterators are needed and distinguished include the
Fibonacci heap in the boost library (although there it is given,
confusingly, the name "pointer") and the item concept used thoughout
the LEDA library. Perhaps your hash table problem would have been
clarified if it had used a separate trivial iterator type?).

>As background on the difficulty of trying to nail down the definition of a
>"valid" iterator, let me repeat part of a posting of mine (from 12/10/99):
>
>  I tried to find a description of what it means to invalidate an iterator,
>  and I was surprised to find that none of the following sources define the
>  term:
>
>    Stroustrup/3E (looked in the index under "invalidated" and "iterator,
>                   invalidated") (*)
>    Lippman/Lajoie (ditto)
>    Josuttis (ditto)
>    The ISO Standard (used full-text search for "invalidate")
>
>  The only definition I found was in Austern, but I can't say I agree that
>  the first five chapters do a "good job" of nailing down these terms.
>
>I've since discussed with several people how the notion of an "invalidated"
>iterator is not terribly well defined, yet it's crucial to effective use of
>the STL.  It has been suggested that the lack of a definition in the
>standard is worthy of a defect report.  Comments?

Yes, the standard does not define the term "valid" as applied to
iterators- it only defines valid ranges- which is an unfortunate
omission. The SGI STL definition quoted above-
 "An iterator is valid if it is dereferenceable or past-the-end."
needs to be added to 24.1.

Also, a quick search of the standard reveals that "valid" is sometimes
used where "valid dereferenceable" should be used. e.g.

21.3.5.5/5 should be "Requires: p is a valid dereferenceable iterator
on *this".

and 20.4.4 [lib.specialized.algorithms] should probably have
"no exceptions are thrown from increment, assignment, comparison, or
dereference of valid iterators"

changed to

"no exceptions are thrown from valid increment, assignment,
comparison, or dereference operations on  iterators"
(or something else?)

>
>Scott
>
>(*) Bjarne later posted suggesting that I should have looked under
>"iterator, valid" or "valid iterator", but I've since looked there and
>elsewhere in that book, and I continue to believe that there isn't a decent
>definition of what it means to "invalidate" an iterator in his book.  I
>don't mean to single his book out here, because I haven't found a decent
>definition in any book.  I think this is one of those terms that everybody
>working on standardization ultimately came to understand, but somehow it
>never really got written down anywhere.
>
>
>
>---
>[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
>[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
>[              --- Please see the FAQ before posting. ---               ]
>[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]
>

,Brian Parker
(brianp@research.canon.com.au)


======================================= MODERATOR'S COMMENT:
 Please don't quote moderation banners when posting to comp.std.c++.

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]






Author: brianjparker@ozemail.com.au (Brian Parker)
Date: 2000/07/20
Raw View
On Wed, 19 Jul 2000 15:15:07 CST, "Bill Wade" <bill.wade@stoner.com>
wrote:

>
>"Scott Meyers" <smeyers@aristeia.com> wrote
>
>>   I tried to find a description of what it means to invalidate an
>iterator,
>>   and I was surprised to find that none of the following sources define
>the
>>   term: [ deleted ]
>
>24.1/5 describes singular and non-singular iterators.
>
>I believe the standard uses 'valid' as a synonym for non-singular.  In
>particular an iterator can remain valid and point to a new container (after
>vector::swap) or a new value (after list::sort).

I agree with this- "valid" and "nonsingular" are synonyms, and
iterators can certainly change containers after particular operations
with that semantics (e.g. splice) and still remain valid.

But in the final analysis, an iterator is valid after an operation if
and only if the standard defines it as such.

And the guiding principal for the standard should be that iterators
should only be mandated as being invalidated if the implementation
requires it, or there are semantic reasons for doing so. For example,
after an insertion into a vector, all iterators *must* be invalidated
as the only feasible implemenation of vector may reallocate memory
during an insertion. However, list.splice is a case where the
implementation allows, and the semantics demand, that iterators remain
valid, but unfortunately the standard gratuitously invalidates the
iterators (and vector.swap is a case where the implementation
potentially allows iterators to remain valid, but the semantics
requires, in my opinion, that they be invalidated).

>
>It may be that someone intended that when an iterator is not invalidated by
>an outside operation, that it point to the previous location (&*iterator
>remains unchanged).  AFAICT it is reasonable to implement the
>std::Containers' iterators that way.  However, I don't see that as a
>requirement.
>
>Consider deque implemented in terms of a vector of pointers to individual T
>objects (deque page size is 1).  Deque::iterator is implemented in terms of
>a pointer into the vector.  std::sort(deque<T>) is specialized to only move
>pointers, not T objects.  Is this legal?  Existing iterators remain
>non-singular.  &*iterator changes.  With my definition (a non-singular
>iterator is valid) this implementation is legal.

Yes,  I can't at the moment think of any reason this would be illegal.

,Brian Parker
(brianp@research.canon.com.au)

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]






Author: smeyers@aristeia.com (Scott Meyers)
Date: 2000/07/18
Raw View
On Tue, 18 Jul 2000 03:28:02 CST, Brian Parker wrote:
> On Sat, 15 Jul 2000 07:01:31 CST, "Bill Wade" <bill.wade@stoner.com>
> wrote:
> >You have to be careful about what you read into invalid/valid.  A container
> >class's iterator is valid if it points into some container's [begin,end]
> >range.
>
> Yes, the key point here being that when one speaks of a valid iterator
> one is implicitly speaking of a valid iterator *into a specified
> container* (it is always well-defined which container the iterator
> currently points into as it is induced by the owning container of the
> element pointed to).

The notion of what it means for an iterator to be "valid" or "invalid" is a
tricky one, in my experience.  Last December, I started a thread ("Iterator
non-invalidation and resized hash tables") that ultimately led to my
current working definition of a "valid" iterator, which is basically this:
a valid iterator points to some element of a container, and when you
dereference that iterator, you get the value you expect to get.  This
definition overlooks the valid iterator corresponding to a container's end,
but my definition is designed to appeal to the intuition of new STL
programmers.  It's also designed to be consistent with Andrew Koenig's
12/8/99 posting on the topic, which was terse, but, to me, quite helpful:

  In article <MPG.12b7a174f83ea1b989707@news.teleport.com>,
  Scott Meyers <smeyers@aristeia.com> wrote:

  >  - I don't understand what it means to invalidate an iterator.  Could it
  >    mean that my iterator continues to point to the same element it did
  >    before, but all bets are off about what elements I'll see if I continue
  >    my traversal of the hash table?

  Yes.  But that doesn't invalidate the iterator -- you can still
  dereference it and get the same element you did before.

As background on the difficulty of trying to nail down the definition of a
"valid" iterator, let me repeat part of a posting of mine (from 12/10/99):

  I tried to find a description of what it means to invalidate an iterator,
  and I was surprised to find that none of the following sources define the
  term:

    Stroustrup/3E (looked in the index under "invalidated" and "iterator,
                   invalidated") (*)
    Lippman/Lajoie (ditto)
    Josuttis (ditto)
    The ISO Standard (used full-text search for "invalidate")

  The only definition I found was in Austern, but I can't say I agree that
  the first five chapters do a "good job" of nailing down these terms.

I've since discussed with several people how the notion of an "invalidated"
iterator is not terribly well defined, yet it's crucial to effective use of
the STL.  It has been suggested that the lack of a definition in the
standard is worthy of a defect report.  Comments?

Scott

(*) Bjarne later posted suggesting that I should have looked under
"iterator, valid" or "valid iterator", but I've since looked there and
elsewhere in that book, and I continue to believe that there isn't a decent
definition of what it means to "invalidate" an iterator in his book.  I
don't mean to single his book out here, because I haven't found a decent
definition in any book.  I think this is one of those terms that everybody
working on standardization ultimately came to understand, but somehow it
never really got written down anywhere.



---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]