Topic: Stability of multiset and multimap erase()


Author: Frank Compagner <frank@compagner.com>
Date: Sun, 14 Jul 2002 09:55:55 GMT
Raw View
jpotter@falcon.lhup.edu (John Potter) wrote in
news:3d2f3d60.38283406@newstest2.earthlink.net:

> It depends on what you mean by trivial.  Three lines of code seems
> trivial.  One of them is O(lg(N)) worst case, but amortized O(1).
> That is the complexity consideration referenced above.  It may not
> mean much since the erase operation already has those worst and
> amortized behaviors.  However, it is sufficient to declare it a design
> decision and not a defect.

Ok, i see what you mean now. I intended "trivial" to mean (among other
things) plain O(1), but i obviously hadn't taken into account that a
statement such as it++; isn't as straightforward as it first appears, and
has O(lg(N)) worst case. In that case (If you'll allow me to grumble for
the sake of saving some face), the term "complexity consideration" is
slightly misplaced, and should have been "performance consideration". But i
get the point, and it explains the given rationale.

It also removes my concerns that this issue was concerned with alternative
implementations, so it would seem to strengthen the assumption that the
stability requirement is in fact intended, but has not been spelled out. In
that case, this calls for a defect report, which i'll try to submit
shortly.

Frank.

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: Frank Compagner <fxc@xs4all.nl>
Date: Fri, 12 Jul 2002 18:36:46 GMT
Raw View
jpotter@falcon.lhup.edu (John Potter) wrote in
news:3d25e325.32845140@news.earthlink.net:

> This is a practical answer.  The rebalancing of the non-existent tree
> is done via rotations and does not change the inorder traversal which
> is the sequence presented by the iterators.  A reasonable
> implementation will use the same data structure for all four
> associative containers and will have only one erase(iter) function
> which must be stable.  There is no reordering of elements, there is an
> adjustment of pointers which changes a non-existent tree but not the
> visible sequence.

Ok, so from your comments as well as, for instance, this:

http://groups.google.com/groups?selm=3cac90be%240%2427450%244c41069e%
40reader0.ash.ops.us.uu.net&rnum=1

i gather that the usual way of implementing the containers results in a
stable erase() function, which seems to be the case on the platforms i
tested. This is nice, and means i will probably get away with it for the
forseeable future, but it is not good enough for comp.std.c++.

> The S&L base paper for STL used void.  It was discussed on this
> newsgroup, but returning an iterator was never added to the standard.
> The above is an extension.

P.J.Plauger claims it was present in a draft version, see:

http://groups.google.com/groups?selm=3aae89e8%240%241147440wodc7nh7.
news.uu.net&rnum=3

but that is al rather irrelevant now. One thing that does worry me
however, is LWG issue 130, see:

http://anubis.dkuug.dk/jtc1/sc22/wg21/docs/lwg-closed.html#130

Which is a remark upon the difference in return type between the sequence
and associative containers erase() member function. It is flagged as not-
a-defect, with the following rationale:
"The LWG believes this was an explicit design decision by Alex Stepanov
driven by complexity considerations.  It has been previously discussed
and reaffirmed, so this is not a defect in the current standard. A future
standard may wish to reconsider this issue."

In the "usual" implementation it should be trivial to return the
iterator, so this seems to suggest that there is at least one valid
alternative implementation with somewhat different behaviour. I can't
easily find a reference to other implementations, but it seems to me
that these could well behave differently with respect to the reordering.

I guess what i'm after is some kind of clarification from the comittee on
this issue. Either there is a stability requirement, in which case it
should be stated clearly, or there isn't, in which case i'd argue that at
least a remark would be in order. I'll go over the arguments one more
time, and then probably submit a defect report.

Frank

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: jpotter@falcon.lhup.edu (John Potter)
Date: Fri, 12 Jul 2002 20:57:51 GMT
Raw View
On Fri, 12 Jul 2002 18:36:46 GMT, Frank Compagner <fxc@xs4all.nl>
wrote:

> One thing that does worry me however, is LWG issue 130, see:

> http://anubis.dkuug.dk/jtc1/sc22/wg21/docs/lwg-closed.html#130

> Which is a remark upon the difference in return type between the sequence
> and associative containers erase() member function. It is flagged as not-
> a-defect, with the following rationale:
> "The LWG believes this was an explicit design decision by Alex Stepanov
> driven by complexity considerations.  It has been previously discussed
> and reaffirmed, so this is not a defect in the current standard. A future
> standard may wish to reconsider this issue."

> In the "usual" implementation it should be trivial to return the
> iterator, so this seems to suggest that there is at least one valid
> alternative implementation with somewhat different behaviour.

It depends on what you mean by trivial.  Three lines of code seems
trivial.  One of them is O(lg(N)) worst case, but amortized O(1).
That is the complexity consideration referenced above.  It may not
mean much since the erase operation already has those worst and
amortized behaviors.  However, it is sufficient to declare it a design
decision and not a defect.

John

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: "Carl Daniel" <cpdaniel@pacbell.net>
Date: Fri, 5 Jul 2002 20:03:27 GMT
Raw View
"Frank Compagner" <frank@compagner.com> wrote in message
news:Xns9242CEF43B1Ffrankcompagnercom@194.109.6.74...
> Hi,
> i'm a bit puzzled about the exact working of the multiset and multimap
> erase() member functions. To provide a good context, let me explain the
> problem that got me thinking about this first:
>
> I have a multimap m, and i would like to erase all elements from the map
> with a certain value. A reasonable approach to do this would seem to be:
>
> for (multimap::iterator i = m.begin(); i != m.end();) {
>     if (i->second == val)
>         m.erase(i++);
>     else
>          ++i;
> }
>

Your discourse about stability aside, why not simply do the erase operation
as the STL creators intended:

-------------------------------------------

// assume appropriate includes & using namespace std

typedef multimap<string,string> MyMap;

MyMap myMap;

// values get into myMap somehow...

pair<MyMap::iterator, MyMap::iterator> range =
myMap.equal_range("myValue");
myMap.erase(range.first,range.second);

---------------------------------------------

.... and let the implementation worry about it :)

-cd


---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: jpotter@falcon.lhup.edu (John Potter)
Date: Fri, 5 Jul 2002 22:12:56 GMT
Raw View
On Fri,  5 Jul 2002 20:03:27 GMT, "Carl Daniel" <cpdaniel@pacbell.net>
wrote:

> "Frank Compagner" <frank@compagner.com> wrote in message
> news:Xns9242CEF43B1Ffrankcompagnercom@194.109.6.74...

> >     if (i->second == val)
---------------^^^^^^

> Your discourse about stability aside, why not simply do the erase operation
> as the STL creators intended:

Because the STL creators neglected to write erase_if for any container
other than list where it is misspelled remove_if.  Erase_if seems to
be the highest demand missing algorithm.

John

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: cbarron3@ix.netcom.com (Carl Barron)
Date: Sat, 6 Jul 2002 05:06:23 GMT
Raw View
Frank Compagner <frank@compagner.com> wrote:

> Hi,
> i'm a bit puzzled about the exact working of the multiset and multimap
> erase() member functions. To provide a good context, let me explain the
> problem that got me thinking about this first:
>
> I have a multimap m, and i would like to erase all elements from the map
> with a certain value. A reasonable approach to do this would seem to be:
>
> for (multimap::iterator i = m.begin(); i != m.end();) {
>     if (i->second == val)
>         m.erase(i++);
>     else
>          ++i;
> }
>
> While the standard guarantees that the call to erase will not invalidate
> i (see clause 23.1.2/8), i can find no guarantees that the reordering of
> elements that may take place during the erase operation is stable (i.e.
> keeps equivalent elements in the same relative order). If implementations
> are indeed free to reorder equivalent elements during the erase call, the
> code above will not work correctly, as it might skip elements in the
> following way:
>
   The standard says that only iterators 'pointing to' the erase entries
are invalidated.  The iterators can be written to do their advancing and
backing up based on the state of the tree when the operation is
requested, merely by pointing into the actual tree, or writing an
iterator in the probably intenal class defining the tree and its
operations, that lets opertor *() and operator ->() return values refer
to the value_type of the node.  Rebalancing an RB tree only requires
internal tree pointers to change. No movement of value_type is required,
so that even through the node in the tree pointed to by the iteratator's
internal state changed on erase/insert the data stored in the iterator
itself is not changed namely some pointer to the tree node [not moved
but possibly has internal pointers changed] this is akin to

  struct node
  {
        node *left,*right;
        char balance_data;
        value_tyoe      data;
  };

  class iterator
  {
        node *where;
public:
        iterator & operator ++() {where = where->right;return *this;}
        iterator & operator --() {where = where->left;return *this;}
        value_type & operator *() {return (where->data);}
        value_tyoe * operator ->() {return &(where->data);}
        // ...
  }

  so if i, is an iterator of the map then rebalancing will not modify
i.where, but will modify i.where->left and i.where->right,and
i.where->balance_data, thus since the operator's of i work through a
pointer that does not change during rebalancing it is valid after
rebalancing.  This is why such guarantees are possible and able to be
required by the standard.









---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: "Ken Alverson" <Ken@Alverson.com>
Date: Sat, 6 Jul 2002 06:22:08 GMT
Raw View
"Frank Compagner" <frank@compagner.com> wrote in message
news:Xns9242CEF43B1Ffrankcompagnercom@194.109.6.74...
>
> I have a multimap m, and i would like to erase all elements from the map
> with a certain value. A reasonable approach to do this would seem to be:
>
> for (multimap::iterator i = m.begin(); i != m.end();) {
>     if (i->second == val)
>         m.erase(i++);
>     else
>          ++i;
> }

Sidestepping your actual question, there are two methods that come to mind
that would not only be guaranteed to work, but would do so more efficiently:

1) multimap::erase(key)
2) multimap::erase(begin,end) combined with lower_/upper_bound (or
equal_range)

Ken


---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: jpotter@falcon.lhup.edu (John Potter)
Date: Sat, 6 Jul 2002 06:22:17 GMT
Raw View
On Fri,  5 Jul 2002 15:49:41 GMT, Frank Compagner
<frank@compagner.com> wrote:

> While the standard guarantees that the call to erase will not invalidate
> i (see clause 23.1.2/8), i can find no guarantees that the reordering of
> elements that may take place during the erase operation is stable (i.e.
> keeps equivalent elements in the same relative order). If implementations
> are indeed free to reorder equivalent elements during the erase call, the
> code above will not work correctly

This is a practical answer.  The rebalancing of the non-existent tree
is done via rotations and does not change the inorder traversal which
is the sequence presented by the iterators.  A reasonable
implementation will use the same data structure for all four
associative containers and will have only one erase(iter) function
which must be stable.  There is no reordering of elements, there is an
adjustment of pointers which changes a non-existent tree but not the
visible sequence.

> I can see three distinct possibilities here:

> 1. This is by design, as the ability to reorder equivalent elements
> during the erase() call is an important freedom to give to
> implementers.

Nope, no need.

> 2. I'm overlooking something in the standard, and the stability
> requirement is, in fact, in there. In this case, i would appreciate any
> pointers.

It sure is not explicit.

> 3. The stability requirement is implied, but has been left out through
> some oversight.

It may be deducible from the things that do invalidate iterators, but
that is stretching.

> In addition, the Dinkumware lib supplied with Visual C++ is
> non-standard in that the erase() member functions "[...] return an
> iterator that designates the first element remaining beyond any elements
> removed, or end() if no such element exists." From some other comments i
> gather that the iterator return type (similar to list::erase()) was
> present in earlier drafts of the standard, but it apparently got axed
> somewhere along the way.

The S&L base paper for STL used void.  It was discussed on this
newsgroup, but returning an iterator was never added to the standard.
The above is an extension.

> Is there anybody that can shed some light on this issue? Any comments are
> much appreciated.

Just comments, no proof.

John

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: Frank Compagner <frank@compagner.com>
Date: Sat, 6 Jul 2002 06:22:38 GMT
Raw View
"Carl Daniel" <cpdaniel@pacbell.net> wrote in
news:DzlV8.1311$cs1.147445889@newssvr14.news.prodigy.com:

> "Frank Compagner" <frank@compagner.com> wrote in message
> news:Xns9242CEF43B1Ffrankcompagnercom@194.109.6.74...

>> I have a multimap m, and i would like to erase all elements from the
>> map with a certain value. A reasonable approach to do this would seem
>> to be:
>>
>> for (multimap::iterator i = m.begin(); i != m.end();) {
>>     if (i->second == val)
>>         m.erase(i++);
>>     else
>>          ++i;
>> }
>>
>
> Your discourse about stability aside, why not simply do the erase
> operation as the STL creators intended:
>
> -------------------------------------------
>
> // assume appropriate includes & using namespace std
>
> typedef multimap<string,string> MyMap;
>
> MyMap myMap;
>
> // values get into myMap somehow...
>
> pair<MyMap::iterator, MyMap::iterator> range =
> myMap.equal_range("myValue");
> myMap.erase(range.first,range.second);
>
> ---------------------------------------------
>
> .... and let the implementation worry about it :)
>
> -cd

Well, for one thing, the original problem was about deleting elements
from a multimap based on the _value_ of the element, not the key. But the
problem is more general than that. What i'm after is an efficient way to
erase elements from a multiset/multimap that satisfy a certain condition.
Your approach works if that condition is equivalence with a given key
value, but not otherwise. Another thing is that this approach requires
iterating over the elements twice; first over (nominally) the entire
container to get the equal_range, and then again over the set of elements
to be erased. It would likely be more efficient, and certainly simpler to
use :

myMap.erase("myValue");

Which achieves the same effect as the code above, but still doesn't solve
my problem. Your comments are appreciated, though.

Frank Compagner

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: "Giovanni Bajo" <giovannibajo@REMOVEliberoTHIS.it>
Date: Sat, 6 Jul 2002 06:23:15 GMT
Raw View
"Carl Daniel" <cpdaniel@pacbell.net> ha scritto nel messaggio
news:DzlV8.1311$cs1.147445889@newssvr14.news.prodigy.com...

> Your discourse about stability aside, why not simply do the erase
operation
> as the STL creators intended:

Because he's trying to erase the items whose _value_ matches, not the key. I
don't think there is a good solution for this issue, since [multi]map<>
should be used when you always want to access your element through the key
(they're sorted by key, indexed by key, and so on). The best solution I can
come up with is keeping an external
multimap<value,iterator_within_the_first_map>.

using namespace std;

typedef multimap<key, value> my_map;
typedef multimap<value, my_map::iterator> helper_map;

my_map MyMap;
helper_map HelperMap;

void RemoveByValue(value v)
{
    typedef helper_map::iterator iter;
    pair<iter,iter> range = HelperMap.equal_range(v);

    for (iter i = range.first; i !=range.second;++i)
        MyMap.erase(i->second);

    HelperMap.erase(range.first, range.second);
}

Giovanni Bajo

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: Frank Compagner <frank@compagner.com>
Date: Fri, 5 Jul 2002 15:49:41 GMT
Raw View
Hi,
i'm a bit puzzled about the exact working of the multiset and multimap
erase() member functions. To provide a good context, let me explain the
problem that got me thinking about this first:

I have a multimap m, and i would like to erase all elements from the map
with a certain value. A reasonable approach to do this would seem to be:

for (multimap::iterator i = m.begin(); i != m.end();) {
    if (i->second == val)
        m.erase(i++);
    else
         ++i;
}

While the standard guarantees that the call to erase will not invalidate
i (see clause 23.1.2/8), i can find no guarantees that the reordering of
elements that may take place during the erase operation is stable (i.e.
keeps equivalent elements in the same relative order). If implementations
are indeed free to reorder equivalent elements during the erase call, the
code above will not work correctly, as it might skip elements in the
following way:

Suppose m is a multimap<int,int>, containing the following elements
(key,value) in the indicated order: (1,1), (1,2), (1,1).
If we apply the loop above with val = 1, we would like this to result in
a map containing just the element (1,2). However, consider what might
happen during the execution of the first erase(i++) statement. First,
i.operator++ is executed, so i now points to the elment (1,2). Then, the
element that i pointed to, the first (1,1), is erased. If during this
erase call the (equivalent) remaining elements are reordered, so that m
now consists of (1,1), (1,2), the second (1,1) element will be skipped
and fail to be removed.

I can see three distinct possibilities here:
1. This is by design, as the ability to reorder equivalent elements
during the erase() call is an important freedom to give to
implementers. If this is the case, than the above approach simply will
not work, and i will have to resort to constructing an erase list in a
first pass, and erasing the elements from the list in a second, or
possibly something like using remove_copy_if. Both clearly more
cumbersome and less efficient than the approach above. Although i'm not
quite up to speed on the issues the implementers of multimap etc. face,
it still seems strange to me that such a stability requirement would be
hard to meet.
2. I'm overlooking something in the standard, and the stability
requirement is, in fact, in there. In this case, i would appreciate any
pointers.
3. The stability requirement is implied, but has been left out through
some oversight. Seems unlikely to me, but still a possibility. If this is
the case, the requirement should be added to the C++)x proposal.

I have run some simple tests on a couple of implementations (Visual C++
6, gcc 3.0 and codewarrior 7) and all of them _seem_ to exhibit
stability. In addition, the Dinkumware lib supplied with Visual C++ is
non-standard in that the erase() member functions "[...] return an
iterator that designates the first element remaining beyond any elements
removed, or end() if no such element exists." From some other comments i
gather that the iterator return type (similar to list::erase()) was
present in earlier drafts of the standard, but it apparently got axed
somewhere along the way. I find this puzzling. Even if the ability to
reorder equivalent elements is important to implementers, the iterator
return type would at least have allowed me to erase elements while
stepping through a container in an orderly fashion. (as an aside, the
requirement on the iterator returned by list::erase (23.1.1/7) is: "The
iterator returned from a.erase(q) points to the element immediately
following q prior to the element being erased.", which would not be good
enough in the presence of reordering).

Is there anybody that can shed some light on this issue? Any comments are
much appreciated.

Frank Compagner
frank@compagner.com

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]