Topic: Just what /can/ you do with the start of a vector?


Author: pedwards@dmapub.dma.org (Phil Edwards)
Date: 2000/08/25
Raw View
[23.2.4]/1 talks about the complexity requirements of insertions/deletions
at the end of a vector.  Other sequences talk about changes at "the beginning
and end of a <foo>."  Nothing is mentioned about the start of a vector.

The next paragraph goes on to mention that {push,pop}_front are not
supported for vectors.  Okay.

But I can find nothing in the text that would disallow, for example,

    v.erase(v.begin());

which produces no warnings and no errors but occasionally corrupts memory
during the copying of elements.  After reading the first two paragraphs of
[23.2.4] over and over, and comparing the text to the other sequences, the
truth finally dawned on me.  I changed the container type to deque and the
corruption stopped.  (Tested with a few different platform/compiler/library
combinations.)

So... we can insert and delete elements from all but the first position?

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]






Author: "Victor Bazarov" <vAbazarov@dAnai.com>
Date: 2000/08/25
Raw View
"Phil Edwards" <pedwards@dmapub.dma.org> wrote...
>
> [23.2.4]/1 talks about the complexity requirements of
insertions/deletions
> at the end of a vector.  Other sequences talk about changes at "the
beginning
> and end of a <foo>."  Nothing is mentioned about the start of a
vector.
>
> The next paragraph goes on to mention that {push,pop}_front are not
> supported for vectors.  Okay.
>
> But I can find nothing in the text that would disallow, for example,
>
>     v.erase(v.begin());
>
> which produces no warnings and no errors but occasionally corrupts
memory
> during the copying of elements.  After reading the first two
paragraphs of
> [23.2.4] over and over, and comparing the text to the other sequences,
the
> truth finally dawned on me.  I changed the container type to deque and
the
> corruption stopped.  (Tested with a few different
platform/compiler/library
> combinations.)
>
> So... we can insert and delete elements from all but the first
position?

There is a difference between the deque<> and the vector<> that I
learned
recently: an erase operation on deque's either end invalidates only the
iterators to the erased elements, an erase operation on vector's either
end
invalidates ALL iterators and references.

I am not sure whether that's going to help, but could it be a hint in
your
investigation?

Victor
--
Please remove capital A's from my address when replying by mail



---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]






Author: Tom <the_wid@my-deja.com>
Date: 2000/08/25
Raw View
In article <8o3phe$alq$1@dmapub.dma.org>,
  pedwards@dmapub.dma.org (Phil Edwards) wrote:
>
> [23.2.4]/1 talks about the complexity requirements of
insertions/deletions
> at the end of a vector.  Other sequences talk about changes at "the
beginning
> and end of a <foo>."  Nothing is mentioned about the start of a
vector.
>
> The next paragraph goes on to mention that {push,pop}_front are not
> supported for vectors.  Okay.
>
> But I can find nothing in the text that would disallow, for example,
>
>     v.erase(v.begin());

There is nothing to disallow that.

>
> which produces no warnings and no errors but occasionally corrupts
memory
> during the copying of elements.

Sounds like a bug in your code or your library implementation.

v.erase(v.begin()); should simply assign all elements to the previous
one in the sequence, and destruct the last one. It shouldn't cause a
realloc, but all iterators will be invalidated of course.


After reading the first two paragraphs of
> [23.2.4] over and over, and comparing the text to the other
sequences, the
> truth finally dawned on me.  I changed the container type to deque
and the
> corruption stopped.  (Tested with a few different
platform/compiler/library
> combinations.)

I'm surprised that they all exhibited this problem. Perhaps it's in the
SGI implementation. I can't believe such a bug could have survived this
long though.

>
> So... we can insert and delete elements from all but the first
position?

We can erase any valid iterator. We can insert from begin() to end()
inclusive. How long these operations take vary from container to
container and position to position.

Tom


Sent via Deja.com http://www.deja.com/
Before you buy.

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]






Author: pedwards@dmapub.dma.org (Phil Edwards)
Date: 2000/08/26
Raw View
Victor Bazarov <vAbazarov@dAnai.com> wrote:
+ There is a difference between the deque<> and the vector<> that I
+ learned
+ recently: an erase operation on deque's either end invalidates only the
+ iterators to the erased elements, an erase operation on vector's either
+ end
+ invalidates ALL iterators and references.
+
+ I am not sure whether that's going to help, but could it be a hint in
+ your investigation?

I don't /think/so... given a vector 'v' and an associated iterator 'i',
the code in question is essentially:

   while ((v is not empty) && (some other stuff not related to the vector))
   {
       i = v.begin();
       ...
       ... a bunch of stuff with *i
       ...
       if (something)
       {
           v.erase(i);
       }
   }

So, each time through the loop, i is always referring to the first element.
And nothing accesses v after the possible call to erase().  Valid code as
far as I can tell.


Phil

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]






Author: James Kuyper <kuyper@wizard.net>
Date: 2000/08/26
Raw View
Tom wrote:
>
> In article <8o3phe$alq$1@dmapub.dma.org>,
>   pedwards@dmapub.dma.org (Phil Edwards) wrote:
> >
> > [23.2.4]/1 talks about the complexity requirements of
> insertions/deletions
> > at the end of a vector.  Other sequences talk about changes at "the
> beginning
> > and end of a <foo>."  Nothing is mentioned about the start of a
> vector.
...
> After reading the first two paragraphs of
> > [23.2.4] over and over, and comparing the text to the other
> sequences, the
> > truth finally dawned on me.  I changed the container type to deque
> and the

Phil: what was the 'truth' that dawned on you?

> > corruption stopped.  (Tested with a few different
> platform/compiler/library
> > combinations.)
>
> I'm surprised that they all exhibited this problem. Perhaps it's in the
> SGI implementation. I can't believe such a bug could have survived this
> long though.

I just ran a test on an SGI platform, and had no problems. Looking back
at Phil's message, I'm not sure why you singled out SGI - he made no
mention of that particular company.

Maybe Phil can provide some example code that fails?

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]






Author: pedwards@dmapub.dma.org (Phil Edwards)
Date: 2000/08/28
Raw View
James Kuyper  <kuyper@wizard.net> wrote:
+ Tom wrote:
+ > In article <8o3phe$alq$1@dmapub.dma.org>,
+ >   pedwards@dmapub.dma.org (Phil Edwards) wrote:
+ > >
+ > > [23.2.4]/1 talks about the complexity requirements of
+ > insertions/deletions
+ > > at the end of a vector.  Other sequences talk about changes at "the
+ > beginning
+ > > and end of a <foo>."  Nothing is mentioned about the start of a
+ > vector.
+ ...
+ > After reading the first two paragraphs of
+ > > [23.2.4] over and over, and comparing the text to the other
+ > sequences, the
+ > > truth finally dawned on me.  I changed the container type to deque
+ > and the
+
+ Phil: what was the 'truth' that dawned on you?

That in this situation, I shouldn't have been using vector in the first
place.


+ > > corruption stopped.  (Tested with a few different
+ > platform/compiler/library
+ > > combinations.)
+ >
+ > I'm surprised that they all exhibited this problem. Perhaps it's in the
+ > SGI implementation. I can't believe such a bug could have survived this
+ > long though.
+
+ I just ran a test on an SGI platform, and had no problems. Looking back
+ at Phil's message, I'm not sure why you singled out SGI - he made no
+ mention of that particular company.

Well, to be honest, SGI's implementation is used in a lot of different
compilers' standard libraries, especially the "original" STL parts (e.g.,
sequences/iterators/algorithms, if nothing else).  The vector code in use
was, I believe, either SGI's or based directly on it.


+ Maybe Phil can provide some example code that fails?

Not easily.

I'll try to keep this short; I don't want to put the audience to sleep.

The pseudocode loop I posted in another branch of this thread is all there
is of the code that removes things from the vector.  The difficult part --
the thing that initially made me assume it's a bug in my code -- is that
it doesn't always occur.  Part of the looping/deletion conditions in the
pseudocode involves a stochastic process.

We control the seeding, however, so during testing and debugging of this
bug, it was repeatable every time.  It's just that not every seed generates
a group of random numbers which will trigger the bug.  Now you know why
this bothers me so much.  :-)

However, the things that make me think it's a minor bug somewhere in the
implementation are:

  1) The life of the sequence container is this:
        - create with default ctor
        - push_back a number of elements
        - run the loop I posted, which only sometimes calls erase, and
          only on an iterator pointing to the first element
        - when the container is empty, stop
     So, there's no insertions, no sorting, no references into the vector,
     and the only iterator is the one to the beginning.  The only place we
     call erase is the one place that vector isn't specifically optimized
     for.

     The nastiness (described further below) happens a few dozen times out of
     a couple hundred.  It's always the same dozen times, always repeatable.

     After changing from a vector to a deque -- with no other change --
     the nastiness went away.  Over three hundred thousand unique runs
     later, it hadn't crashed once.  Change it back to a vector, and the
     nastiness begins again.

  2) The nature of the nastiness is that the bookkeeping information used
     by malloc/free gets corrupted.  We don't do any major work with free
     store in the body of the loop; in fact, we do very little work with
     raw pointers anywhere in the program.

     The vector member functions occasionally call malloc, which calls
     its own functions.  About five levels down inside malloc is where
     the code would crash.  A little research pointed out that when code
     crashes inside that function, it's either a buggy malloc (doubtful
     in this case) or some user code somewhere else has scribbled on
     the bookkeeping structures at /some/ previous point in the program.
     Reading those structures later on causes malloc's support functions
     to crap out.

     Using a malloc library with memory-flagging features, running the
     program again caused a core dump at the location when those structures
     get overwritten, earlier in the program.[*]  It was one of vector's
     internal helper member functions, during a removal at the front of
     the vector.

Here it was that I read through the intro paragraphs of each sequence's
description in the standard and noticed that vector isn't optimized for
work at the beginning of the vector.  So my paranoid self thought that
either A) there's a bug in vector, or B) vector isn't supposed to handle
this anyhow and it's my fault.  Thus I switched to deque.

By now we were running way behind -- seemingly random crashes inside malloc
take a long time to debug -- and so I didn't investigate any further; I just
stuck with deque and finished the project.  (Since this is turning out to be
weird, I'll try to find time to go back and step through the guts of vector.)

I'd been assuming that using vector in this manner was a user error that
just isn't well described in the current wording of [23.2.4]/1, and that
our code was misusing vector.

After re-reading many times, I didn't know what to think, so I did what
every programmer does when he's not thinking -- I posted to Usenet.  :-)


Phil

[*] Gotta hand it to those Solaris engineers; they came through in a
    big way with watchmalloc.so.

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]






Author: Christopher Eltschka <celtschk@physik.tu-muenchen.de>
Date: 2000/08/28
Raw View
Tom wrote:
>
> In article <8o3phe$alq$1@dmapub.dma.org>,
>   pedwards@dmapub.dma.org (Phil Edwards) wrote:
> >
> > [23.2.4]/1 talks about the complexity requirements of
> insertions/deletions
> > at the end of a vector.  Other sequences talk about changes at "the
> beginning
> > and end of a <foo>."  Nothing is mentioned about the start of a
> vector.
> >
> > The next paragraph goes on to mention that {push,pop}_front are not
> > supported for vectors.  Okay.
> >
> > But I can find nothing in the text that would disallow, for example,
> >
> >     v.erase(v.begin());
>
> There is nothing to disallow that.
>
> >
> > which produces no warnings and no errors but occasionally corrupts
> memory
> > during the copying of elements.
>
> Sounds like a bug in your code or your library implementation.
>
> v.erase(v.begin()); should simply assign all elements to the previous
> one in the sequence, and destruct the last one. It shouldn't cause a
> realloc, but all iterators will be invalidated of course.
>
> After reading the first two paragraphs of
> > [23.2.4] over and over, and comparing the text to the other
> sequences, the
> > truth finally dawned on me.  I changed the container type to deque
> and the
> > corruption stopped.  (Tested with a few different
> platform/compiler/library
> > combinations.)
>
> I'm surprised that they all exhibited this problem. Perhaps it's in the
> SGI implementation. I can't believe such a bug could have survived this
> long though.

I'd first check the assignment operator of the contained class.
Maybe there's a bug in there. Since vector calls it on erasing
the first vector, and deque doesn't, this could explain the
different behaviour between both containers.

Another thing is to check if the iterator was accidentally
decremented (f.ex. by writing *i--, where (*i)-- was intended).
This would explain that the problem only occurs at begin().

[...]

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]






Author: "Ken Bloom" <ken195@bigfoot.com>
Date: 2000/08/28
Raw View
"Phil Edwards" <pedwards@dmapub.dma.org> wrote in message
news:8o3phe$alq$1@dmapub.dma.org...
>
> [23.2.4]/1 talks about the complexity requirements of insertions/deletions
> at the end of a vector.  Other sequences talk about changes at "the
beginning
> and end of a <foo>."  Nothing is mentioned about the start of a vector.
>
> The next paragraph goes on to mention that {push,pop}_front are not
> supported for vectors.  Okay.
>
> But I can find nothing in the text that would disallow, for example,
>
>     v.erase(v.begin());
>
> which produces no warnings and no errors but occasionally corrupts memory
> during the copying of elements.  After reading the first two paragraphs of
> [23.2.4] over and over, and comparing the text to the other sequences, the
> truth finally dawned on me.  I changed the container type to deque and the
> corruption stopped.  (Tested with a few different
platform/compiler/library
> combinations.)
>
> So... we can insert and delete elements from all but the first position?

For the std::deque<>, push_back(), pop_back(), push_front(), pop_front() are
all implemented, meaning they can be implemented in constant time. The
memory address of the first and last element can vary. That is:
&*deq.begin() can have different values before and after a push_front() or a
pop_front();
** So both front and back are special for a std::deque<>.

For the std::vector, the front of the vector only moves when a complete
re-allocation occurs in linear time. Only a push_back() and pop_back() are
implemented because push_front() and pop_front() cannot be implemented in
constant time.
** So only back is special for a std::vector<>

For a std::list, inserts and deletes are constant everywhere.
** So either everything's special or nothing's special, depending on your
viewpoint, on a std::list<>.

The standard allows inserts and deletes at the beginning of a std::vector<>,
but they will be in linear time, so they are not mentioned as a special
case. Your memory corruption is a bug of some kind.


--
Ken Bloom

-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GCS/M/AT/U d- s++:--- a--- C++ UL P++ L+ E----
W+++ N++ ?o ?K w++ !O M++>$ V- PS PE Y-- PGP- t+
5 X++ R--- tv-- b++ DI+ D-- G e- !h r--@ y?
------END GEEK CODE BLOCK------



---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]






Author: James Kuyper <kuyper@wizard.net>
Date: 2000/08/28
Raw View
Phil Edwards wrote:
...
> I don't /think/so... given a vector 'v' and an associated iterator 'i',
> the code in question is essentially:
>
>    while ((v is not empty) && (some other stuff not related to the vector))
>    {
>        i = v.begin();
>        ...
>        ... a bunch of stuff with *i
>        ...
>        if (something)
>        {
>            v.erase(i);
>        }
>    }
>
> So, each time through the loop, i is always referring to the first element.
> And nothing accesses v after the possible call to erase().  Valid code as
> far as I can tell.

Does anything in the loop body insert or erase elements of v, directly
or indirectly?

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]






Author: James Kuyper <kuyper@wizard.net>
Date: 2000/08/29
Raw View
Phil Edwards wrote:
>
> James Kuyper  <kuyper@wizard.net> wrote:
...
> + Phil: what was the 'truth' that dawned on you?
>
> That in this situation, I shouldn't have been using vector in the first
> place.
...
>      Using a malloc library with memory-flagging features, running the
>      program again caused a core dump at the location when those structures
>      get overwritten, earlier in the program.[*]  It was one of vector's
>      internal helper member functions, during a removal at the front of
>      the vector.
>
> Here it was that I read through the intro paragraphs of each sequence's
> description in the standard and noticed that vector isn't optimized for
> work at the beginning of the vector.  So my paranoid self thought that

That's correct. For your application, deque is more appropriate than
vector. However, from what you've said, both should have worked.
erase(begin()) should be perfectly safe on any non-empty standard
sequence.

The obvious place to start, should you find the time to investigate
further, is to figure out what that helper function is supposed to do,
and try to understand why it's doing something wrong. It's still
possible that the problem is in your code, and that it's only producing
it's first symptom in the library template code. However, if you can
simplify the calling code well enough, you should leave the detailed
investigation to your library vendor. They know that helper function
better than you do (hopefully !).

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]






Author: pedwards@dmapub.dma.org (Phil Edwards)
Date: Thu, 31 Aug 2000 17:37:19 GMT
Raw View
Replying to two article at once, to (probably) conclude the thread:


Christopher Eltschka  <celtschk@physik.tu-muenchen.de> wrote:
+ I'd first check the assignment operator of the contained class.
+ Maybe there's a bug in there. Since vector calls it on erasing
+ the first vector, and deque doesn't, this could explain the
+ different behaviour between both containers.

The contained class is an enum.  :-(  Good thought, though.


+ Another thing is to check if the iterator was accidentally
+ decremented (f.ex. by writing *i--, where (*i)-- was intended).
+ This would explain that the problem only occurs at begin().

Nope, the iterator doesn't get moved at all during the loop; only reset
to begin() at the start of each iteration.



James Kuyper  <kuyper@wizard.net> wrote:
+ The obvious place to start, should you find the time to investigate
+ further, is to figure out what that helper function is supposed to do,
+ and try to understand why it's doing something wrong. It's still
+ possible that the problem is in your code, and that it's only producing
+ it's first symptom in the library template code.

Could well be.  I am truly hoping that this is the case, although everybody
who has actually looked over the code can't find any problems.


+ However, if you can
+ simplify the calling code well enough, you should leave the detailed
+ investigation to your library vendor. They know that helper function
+ better than you do (hopefully !).

The library vendor in this case is SGI, via g++ 2.95.2.  Also tested with
Sun Workshop 5.0, which also uses SGI's STL.

One of the reasons I'm posting is that in my Copious Spare Time[tm], I hack
on g++'s next-generation replacement for its current C++ library.  We also
use SGI's STL as back-end code for vector.  SGI has released newer versions
of their STL since then, but a quick diff between the two stl_vector.h's show
mainly additional typecasts inside function bodies in the newer release.
(That could be enough to fix any odd bugs that the older vector might
have had, but if not, I'd /really/ like to track down what's causing this.
If it is in fact in the library, then I feel an ethical responsibility to
find it and fix it.)


Thanks to everyone who had suggestions!

Phil

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]