Topic: question about std::remove / std::remove_if


Author: hinnant@metrowerks.com (Howard Hinnant)
Date: Mon, 29 Nov 2004 17:57:12 GMT
Raw View
In article <41aa4cd4$0$12130$3b214f66@usenet.univie.ac.at>,
 nospam@nospam.ucar.edu ("Thomas Mang") wrote:

> Hello,
>
> 25.2.7/2 says that all elements referred to by an iterator in the range
> [first, last) for which the certain conditions hold will be "eliminated".
>
> Now this word "eliminates" irritates me.
>
> I checked some popular implementations, and what they do is basically
> shuffle elements not holding the condition to the start of the range, and
> leaving remaining elements as they are - no matter if the condition applies
> or not.
>
> However, this behaviour is not how I interpret the standard. For me
> "eliminates" means basically that after applying the algorithm, no element
> in the range [first, last) holds the condition.
>
>
> My question:
> Am I missing something (e.g. a specification of "eliminate"), or are the
> implementations I checked broken, or is there an unclear wording in the
> Standard with possible Defect?
>
> Note:
> I also understand the implications my interpretation would have: What if all
> elements in the range [first, last) will hold the condition? What values
> should then be used to replace them?

Perhaps there is a better word than "eliminate" to describe the
behavior, but I don't really see a defect here.  I became a std::lib
implementor after the standard was ratified, and had no trouble
interpreting what was meant, and implemented it.

This is an "in place" modifying sequence algorithm which transforms the
range [first, last) into a new range [first, returned_value) where every
element in this new range does not satisfy the predicate (or not equal
to the given value).  Furthermore the transformation is stable in that
elements in the new range have the same relative order that they had in
the old range.  I believe paragraphs 2-4 unambiguously say all of this.

The values left in the range [returned_value, last) (if not an empty
range) are indeterminate.

-Howard

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: "dietmar_kuehl@yahoo.com" <dietmar_kuehl@yahoo.com>
Date: Mon, 29 Nov 2004 21:42:15 CST
Raw View
"Thomas Mang" wrote:
> Hello,
>
> 25.2.7/2 says that all elements referred to by an iterator in the
range
> [first, last) for which the certain conditions hold will be
"eliminated".

You should also read 25.3.7/3 which reads "Returns: The end of the
resulting range." That is, essentially 'remove()' and 'remove_if()'
shuffle the appropriate elements into the first elements and return
a past the end iterator of a subsequence.

> However, this behaviour is not how I interpret the standard. For me
> "eliminates" means basically that after applying the algorithm, no
element
> in the range [first, last) holds the condition.

Nope, this is neither the intention nor the wording: you get a new
sequence which ends with the iterator being returned. This new sequence
only contains elements not matching the condition.
--
<mailto:dietmar_kuehl@yahoo.com> <http://www.dietmar-kuehl.de/>
<http://www.contendix.com> - Software Development & Consulting

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: nospam@nospam.ucar.edu ("Thomas Mang")
Date: Tue, 30 Nov 2004 03:45:55 GMT
Raw View
"Howard Hinnant" <hinnant@metrowerks.com> schrieb im Newsbeitrag
news:hinnant-072E8E.17412528112004@syrcnyrdrs-01-ge0.nyroc.rr.com...
> In article <41aa4cd4$0$12130$3b214f66@usenet.univie.ac.at>,
>  nospam@nospam.ucar.edu ("Thomas Mang") wrote:
>
> > Hello,
> >
> > 25.2.7/2 says that all elements referred to by an iterator in the range
> > [first, last) for which the certain conditions hold will be
"eliminated".
> >
> > Now this word "eliminates" irritates me.
> >
> > I checked some popular implementations, and what they do is basically
> > shuffle elements not holding the condition to the start of the range,
and
> > leaving remaining elements as they are - no matter if the condition
applies
> > or not.
> >
> > However, this behaviour is not how I interpret the standard. For me
> > "eliminates" means basically that after applying the algorithm, no
element
> > in the range [first, last) holds the condition.
> >
> >
> > My question:
> > Am I missing something (e.g. a specification of "eliminate"), or are the
> > implementations I checked broken, or is there an unclear wording in the
> > Standard with possible Defect?
> >
> > Note:
> > I also understand the implications my interpretation would have: What if
all
> > elements in the range [first, last) will hold the condition? What values
> > should then be used to replace them?
>
> Perhaps there is a better word than "eliminate" to describe the
> behavior, but I don't really see a defect here.  I became a std::lib
> implementor after the standard was ratified, and had no trouble
> interpreting what was meant, and implemented it.

Well, if I read books (and I have done so well before I read the Standard
regarding this), it seems authors had no problems figuring out what was
meant. But by no means I can find this backed by the wording of the Standard


>
> This is an "in place" modifying sequence algorithm which transforms the
> range [first, last) into a new range [first, returned_value) where every
> element in this new range does not satisfy the predicate (or not equal
> to the given value).  Furthermore the transformation is stable in that
> elements in the new range have the same relative order that they had in
> the old range.  I believe paragraphs 2-4 unambiguously say all of this.

I disagree.

para 2 says "Eliminates all the elements referred to by iterator i in the
range [first, last) for which...."

Maybe I have a totally wrong feeling about the meaning of the English term
"eliminating" (English is not my mother tongue, so please forgive me; but I
double-checked with a very reliable dictionary and it seems I have a correct
understanding of the term), but in combination with the range bounds I read
this sentence as after applying the algorithm, no element in the range
[first, last) will hold the condition - independent of the return-value
iterator.


Next issue:

I cannot even find a guarantee that elements not holding the condition will
be moved to the beginning and be in contiguous order (Only guarantee is that
the relative order of them remains the same).

Take this sequence:
1, 2, 3, 6, 7, 8, 9

And then remove all elements < 4. Then I believe this resulting sequence
(and returning a copy of the last-iterator parameter) is valid:
6, 6, 6, 6, 7, 8, 9

(I took 6 as substitute value here so the complexity requirements are
fullfilled).



Comments are highly welcomed.
If I am missing something, I'd appreciate any pointers to the right
direction. Otherwise I think a new wording is necessary.


Thomas



---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: iapetus@austin.rr.com (iapetus)
Date: Tue, 30 Nov 2004 18:26:16 GMT
Raw View
> 25.2.7/2 says that all elements referred to by an iterator in the range
> [first, last) for which the certain conditions hold will be "eliminated".
>
> Now this word "eliminates" irritates me.
>
> I checked some popular implementations, and what they do is basically
> shuffle elements not holding the condition to the start of the range, and
> leaving remaining elements as they are - no matter if the condition applies
> or not.

I would agree that the terms are a bit confusing -- the remove
algorithms don't actually remove anything from the container!  But
the point is, the remove algorithms *can't* actually remove an
element from a container -- only the container's member functions
can do so, and the remove algorithms can't call the container member
functions because they are only given iterators to the container, not
the container itself.

The remove algorithms are almost always used in conjunction with the
container's erase member functions.

If you have a vector<int> v and want to truly erase all 0 values, you
should write:

v.erase(remove(v.begin(), v.end(), 0), v.end());

Then there will be no 0's in the range [v.begin(), v.end()] and
v.size() will have decreased by the number of 0's that were
originally in v.

For more details, see Scott Meyer's 'Effective STL', item 32.

-- Mickey Moore

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: kanze@gabi-soft.fr
Date: Wed, 1 Dec 2004 20:23:39 GMT
Raw View
"dietmar_kuehl@yahoo.com" <dietmar_kuehl@yahoo.com> wrote in message
news:<1101738111.632743.244690@f14g2000cwb.googlegroups.com>...

> "Thomas Mang" wrote:

> > 25.2.7/2 says that all elements referred to by an iterator in the
> > range [first, last) for which the certain conditions hold will be
> > "eliminated".

> You should also read 25.3.7/3 which reads "Returns: The end of the
> resulting range." That is, essentially 'remove()' and 'remove_if()'
> shuffle the appropriate elements into the first elements and return a
> past the end iterator of a subsequence.

I think that Thomas knows what the function is supposed to do.  His
problem is that the words in the standard don't describe what we know
should be the behavior.  There's nothing in the standard about "moving"
elements, or reordering anything, just about "eliminating" or
"removing".  And each time there is a mention of "eliminate" or
"remove", the range given is [first,last).  What the current wording
actually says is that after calling remove, I can iterate over
[first,last) and not find any element for which *i == value.  In fact,
of course, this post-condition only holds for [first, returnValue).

> > However, this behaviour is not how I interpret the standard. For me
> > "eliminates" means basically that after applying the algorithm, no
> > element in the range [first, last) holds the condition.

> Nope, this is neither the intention nor the wording: you get a new
> sequence which ends with the iterator being returned. This new
> sequence only contains elements not matching the condition.

Where does it say that?  I think his example was correct.  Given a
sequence:

    first                   last
      |                       |
      V                       V
      1   2   3   6   7   8   9

a function which changes it into

    first                   last
      |                       |
      V                       V
      6   6   6   6   7   8   9

and returns last has met the requirements specified in the standard.

Note that this would be the case even if we "reinterpret" eliminate to
mean something like rearrange so that all of the valid elements are in a
single sequence.  The only thing this implementation does that is
different from current implementations is put the resulting sequence at
the end, rather than at the beginning.  Where is there anything that
forbids that?

--
James Kanze           GABI Software         http://www.gabi-soft.fr
Conseils en informatique orient   e objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place S   mard, 78210 St.-Cyr-l'   cole, France, +33 (0)1 30 23 00 34

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: kanze@gabi-soft.fr
Date: Thu, 2 Dec 2004 05:37:29 GMT
Raw View
nospam@nospam.ucar.edu ("Thomas Mang") wrote in message
news:<41ab73e2$0$8274$3b214f66@usenet.univie.ac.at>...
> "Howard Hinnant" <hinnant@metrowerks.com> schrieb im Newsbeitrag
> news:hinnant-072E8E.17412528112004@syrcnyrdrs-01-ge0.nyroc.rr.com...
> > In article <41aa4cd4$0$12130$3b214f66@usenet.univie.ac.at>,
> >  nospam@nospam.ucar.edu ("Thomas Mang") wrote:

> > > 25.2.7/2 says that all elements referred to by an iterator in the
> > > range [first, last) for which the certain conditions hold will be
> > > "eliminated".

> > > Now this word "eliminates" irritates me.

> > > I checked some popular implementations, and what they do is
> > > basically shuffle elements not holding the condition to the start
> > > of the range, and leaving remaining elements as they are - no
> > > matter if the condition applies or not.

> > > However, this behaviour is not how I interpret the standard. For
> > > me "eliminates" means basically that after applying the algorithm,
> > > no element in the range [first, last) holds the condition.

> > > My question:
> > > Am I missing something (e.g. a specification of "eliminate"), or
> > > are the implementations I checked broken, or is there an unclear
> > > wording in the Standard with possible Defect?

> > > Note:
> > > I also understand the implications my interpretation would have:
> > > What if all elements in the range [first, last) will hold the
> > > condition? What values should then be used to replace them?

> > Perhaps there is a better word than "eliminate" to describe the
> > behavior, but I don't really see a defect here.  I became a std::lib
> > implementor after the standard was ratified, and had no trouble
> > interpreting what was meant, and implemented it.

> Well, if I read books (and I have done so well before I read the
> Standard regarding this), it seems authors had no problems figuring
> out what was meant.  But by no means I can find this backed by the
> wording of the Standard

Most people don't learn about what the functions should do by reading
the standard.  And if you know what it should do...

But I agree with you.  The standard doesn't give any special meaning to
eliminate, and if we apply its everyday meaning, then    25.2.7 basically
says that a post-condition of the function is that for all i in [first,
last), !(*i == value) or !pred(*i).  Which, of course, isn't the intend
(and isn't what anyone implements, nor is it even something that could
reasonably be implemented).

> > This is an "in place" modifying sequence algorithm which transforms
> > the range [first, last) into a new range [first, returned_value)
> > where every element in this new range does not satisfy the predicate
> > (or not equal to the given value).  Furthermore the transformation
> > is stable in that elements in the new range have the same relative
> > order that they had in the old range.  I believe paragraphs 2-4
> > unambiguously say all of this.

> I disagree.

Me too.  Note the words in paragraph 4: "the relative order of the
elements that are not removed [...]".  So do we remove elements, or not?
In practice, we do not remove elements; we cannot (any more than we can
eliminate them -- and why do we "eliminate" in paragraph 2, but "remove"
in paragraph 4?).  And until the actual semantics are defined, paragraph
4 doesn't mean anything.

I'll admit that finding appropriate words (without overspecifying) for
such a wierd function isn't easy.  Intuitively, I"d go with something
like:

    Reorders the elements in the range [first, last) so that after the
    function:
     -- for all elements in [first, return), !pred(*) (or *i != value),
        and
     -- all elements in [first, return) are in their original order.

Note, however, that this gives an added post-condition which isn't
currently present: that the elements removed from the beginning are
copied to the end.  This is sometimes useful, but has added runtime
cost.

At any rate, words like "eliminate" or "remove" have no place in a
description of the behavior of this function.

> para 2 says "Eliminates all the elements referred to by iterator i in
> the range [first, last) for which...."

> Maybe I have a totally wrong feeling about the meaning of the English
> term "eliminating" (English is not my mother tongue, so please forgive
> me; but I double-checked with a very reliable dictionary and it seems
> I have a correct understanding of the term), but in combination with
> the range bounds I read this sentence as after applying the algorithm,
> no element in the range [first, last) will hold the condition -
> independent of the return-value iterator.

In the absense of some special definition in the standard, that's what
it says.  I don't think that there is any way it could be interpreted
otherwise.

> Next issue:

> I cannot even find a guarantee that elements not holding the condition
> will be moved to the beginning and be in contiguous order (Only
> guarantee is that the relative order of them remains the same).

> Take this sequence:
> 1, 2, 3, 6, 7, 8, 9

> And then remove all elements < 4. Then I believe this resulting sequence
> (and returning a copy of the last-iterator parameter) is valid:
> 6, 6, 6, 6, 7, 8, 9

This is partially, I think, because there is no definition of "the
resulting range" referred to in paragraph 3.  Once this is corrected,
the paragraph could easily be modified to read: "the relative order of
the elements in the resulting range is the same as their relative order
in the original range."

And of course, stable does have a special meaning in computer science,
so arguably, once we define what we are talking about, the simple
presence of that word would suffice.

> (I took 6 as substitute value here so the complexity requirements are
> fullfilled).

> Comments are highly welcomed.  If I am missing something, I'd
> appreciate any pointers to the right direction. Otherwise I think a
> new wording is necessary.

There's no doubt in my mind that the current wording doesn't describe
the function correctly.  I find it difficult to find words that do,
however.

--
James Kanze           GABI Software         http://www.gabi-soft.fr
Conseils en informatique orient   e objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place S   mard, 78210 St.-Cyr-l'   cole, France, +33 (0)1 30 23 00 34

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: nospam@nospam.ucar.edu ("Thomas Mang")
Date: Fri, 3 Dec 2004 04:27:07 GMT
Raw View
<kanze@gabi-soft.fr> schrieb im Newsbeitrag >
> > Comments are highly welcomed.  If I am missing something, I'd
> > appreciate any pointers to the right direction. Otherwise I think a
> > new wording is necessary.
>
> There's no doubt in my mind that the current wording doesn't describe
> the function correctly.  I find it difficult to find words that do,
> however.


Just writing my first DR :-)

Moderators: Don't pass it forward to the committee. This is only a sketch.


Me also experiencing difficulties to find the right words - these algorithms
are real beasts to define.

I came up with this regarding remove/remove_if:


change 25.2.7/2 to:
Effect: Places all the elements referred to by iterator i in the range
[first, last) for which the following corresponding conditions hold: *i !=
value, pred(*i) == false into the subrange [first, k) of the original range.
k shall denote an iterator so that the distance between first and k shall be
equal to the number of elements in the original range for which the
corresponding conditions did hold. It is unspecified whether any element in
the resulting subrange [k, last) will hold the corresponding conditions, or
not.


The part "of the original range" might be redundant - I added it for
clarity.
I introduced the new iterator k to describe the function properly and use it
as return value in the next paragraph. (Any wording about a return value in
this paragraph would IMHO introduce a cyclic reference to the paragraph
about the return value).
I added the sentence with the distance so [first, k) cannot also contain
extra copies of an element.In other words, if the original sequence was
this:
1, 2, 3, 4
and the value 3 shall be removed, then this sequence
1, 2, 2, 4
is prohibited explicitly.




change 25.2.7/3 to:
Returns: The iterator k.

change 25.2.7/4 to:
Notes: Stable: the relative order of the elements that are placed into the
subrange [first, return value) shall be the same as their relative order was
in the original range [first, last).


I am not 100% happy with the wording, but I cannot come up with something
better neither overspecifying nor underspecifying the algorithm.
Talented writers and/or native English speakers will have an advantage here.




The next issue:

I am afraid we also need a Defect for unique / unique_if.

The wording contains more or less the same problems as remove / remove_if:
Again the wonderful term "eliminate", and the never specified "resulting
range" as return value. In short, we end up in the same trouble.

There is even an additional problem:
The current wording expresses the condition as "*i == *(i - 1)". As you see,
calculating offset between 2 iterators. Unfortunately, ForwardIterators
don't support this.
I tried to consider this in my suggested wording.


I found finding the right wording even more difficult than for remove /
remove_if. Here is what I came up with:


change 25.2.8/1 to:
Effect: Places the first element from every consecutive group of equal
elements, referred to by the iterator i in the range [first, last), into the
subrange [first, k) of the original range. k shall denote an iterator so
that the distance between first and k shall be equal to the number of groups
of consecutive equal elements of the original range. Two consecutive
elements are considered equal if *j == *i or pred(*j, *i) != false, where j
shall be an iterator referring to the element immediately subsequent to the
element referred to by iterator i.


explanation:
k and the sentence about the distance fulfills the same task as the
corresponding one in remove/remove_if.
I had to completely rewrite the whole paragraph, because I could not come up
with a wording about anything with the meaning of eliminating / removing as
the current wording suggests. A wording about placing certain elements to
the front was the only one that came to my mind and suits the needs.

I tried something like "Places every element which is not followed by an
equal element or every first element from every consecutive group of equal
elements....." but this opened a hole, because the very last element is not
followed by an equal element and thus be represented in the new sequence,
whether it is followed by an equal element or not.

I don't think a wording about the elements remaining in the range [k, last)
is necessary.


change 25.2.8/2 to:
Returns: The iterator k.



Again, I hope I did not overspecify it, underspecify it or made some fatal
errors.



I'd like to hear about any suggestions / improvements / clarifications /
redundancy removements in case you dislike the wording.


Thomas




---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: nospam@nospam.ucar.edu ("Thomas Mang")
Date: Sun, 28 Nov 2004 22:22:55 GMT
Raw View
Hello,

25.2.7/2 says that all elements referred to by an iterator in the range
[first, last) for which the certain conditions hold will be "eliminated".

Now this word "eliminates" irritates me.

I checked some popular implementations, and what they do is basically
shuffle elements not holding the condition to the start of the range, and
leaving remaining elements as they are - no matter if the condition applies
or not.

However, this behaviour is not how I interpret the standard. For me
"eliminates" means basically that after applying the algorithm, no element
in the range [first, last) holds the condition.


My question:
Am I missing something (e.g. a specification of "eliminate"), or are the
implementations I checked broken, or is there an unclear wording in the
Standard with possible Defect?

Note:
I also understand the implications my interpretation would have: What if all
elements in the range [first, last) will hold the condition? What values
should then be used to replace them?


Thomas



---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]