Topic: Defect Report: std::list<>::unique incorrectly specified


Author: nospam@nospam.ucar.edu ("Thomas Mang")
Date: Tue, 14 Dec 2004 21:05:51 GMT
Raw View
"Alberto Barbati" <AlbertoBarbati@libero.it> schrieb im Newsbeitrag
news:dAdvd.19968$Lg7.660660@twister1.libero.it...
> Thomas Mang wrote:
> >
> > 3) Proposed fixes:
> >
> >
> > Change 23.2.2.4 [lib.list.ops], paragraph 19 to:
> >
> > "Effect: Erases all but the first element from every consecutive group
> > of elements, referred to by the iterator i in the range [begin(),
> > end()), for which the following conditions hold: *(i-1) == *i (for the
> > version of unique with no argument) or pred(*(i-1), *i) != false (for
> > the version of unique with a predicate argument)."
> >
> > <snip>
> >
> > a) The new wording was influenced by DR 202 and the resolutions
> > presented there. If DR 202 is resolved in another way, the proposed
> > wording need also additional review.
> > <snip>
> > e) "begin()" subs              titutesfirst,andendsubstituteslast.The
> > range need adjustment from "[first + 1, last)" to "[begin(), end())" to
> > ensure a valid range in case of an empty list.
>
> The "+1" in "first+1" is necessary to make the expression "(i-1)" valid.
> This has been taken into account in DR 202 by keeping the "+1" and
> adding the words "For nonempty ranges". To be consistent, the proposed
> wording might be:


I have thought about it quite a bit, unfortunately I missed explanation in
my DR. First, I was more attracted by the alternative resolution of DR202 [I
read the DR in a way I supposed the alternative resolution is considered to
suit the needs better. I might have been wrong regarding this], and this one
says "first", not "first + 1" [Although it also says (first, last) and not
[first, last) :-)]. I didn't use "begin() + 1" and the wording "For a
nonempty range", for two reasons:

a) I expect everyone reading the condition will immediately see that no
element can't be erased if the condition doesn't apply, and the condition
can certainly only apply (and as such will only be used without invoking UB)
if the range consists of at least 2 elements. See also the proposed
paragraph regarding complexity.
b) I couldn't really figure out what the wording "For a nonempty range" in
DR 202 is really good for. The comparison conditions does IMHO clearly
indicate what I have written in a). Note that such an (implicit) assumption
would also cover the case of a range consisting of only a single element.
The wording "For a nonempty range, however only considers empty ranges, not
ranges of a single element. And then, of course, we can't use the iterators
for comparison and we'd be back where we started.


>
> "Effect: If empty() == false, erases all but the first element from
> every consecutive group of elements, referred to by the iterator i in
> the range [begin() + 1, end()), for which the following conditions hold:
> *(i-1) == *i (for the version of unique with no argument) or
> pred(*(i-1), *i) != false (for the version of unique with a predicate
> argument)."
>
> However, I have an objection on this wording. It looks redundant to me
> to say "the first element of every consecutive group of elements" as the
> algorithm is more clearly and more exactly described by the predicates
> that follows. In fact, it's not clear from that wording whether the
> first element of the sequence is eligible to be the first element of a
> "consecutive group of elements". I would therefore simplify the sentence
> by writing:
>
> "Effect: If empty() == false, erases all elements referred to by the
> iterator i in the range [begin() + 1, end()), for which the following
> conditions hold: *(i-1) == *i (for the version of unique with no
> argument) or pred(*(i-1), *i) != false (for the version of unique with a
> predicate argument)."
>
> Notice that with the additional requirement that pred is an equivalence
> relatione (as DR 202 suggest it should be) there is no ambiguity about
> the predicate being checked before of after erasing the intermediate
> elements.
>
> Is there something I have overlooked?

I fail to see how the problem of size() == 1 is taken into account if the
assumptions my proposed wording implied do not apply. [Not sure if a wording
"If size() > 1, ..." helps, because it could still mean - without implicit
assumptions - that then end() will be dereferenced. But again, notice I am
not sure about - better say I simply don't know - how iterator range bounds
are then to be interpreted.]


Thomas


---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: AlbertoBarbati@libero.it (Alberto Barbati)
Date: Tue, 14 Dec 2004 23:46:02 GMT
Raw View
Thomas Mang wrote:
> "Alberto Barbati" <AlbertoBarbati@libero.it> schrieb im Newsbeitrag
>=20
> I have thought about it quite a bit, unfortunately I missed explanation=
 in
> my DR. First, I was more attracted by the alternative resolution of DR2=
02 [I
> read the DR in a way I supposed the alternative resolution is considere=
d to
> suit the needs better. I might have been wrong regarding this], and thi=
s one
> says "first", not "first + 1" [Although it also says (first, last) and =
not
> [first, last) :-)]. I didn't use "begin() + 1" and the wording "For a
> nonempty range", for two reasons:
>=20
> a) I expect everyone reading the condition will immediately see that no
> element can't be erased if the condition doesn't apply, and the conditi=
on
> can certainly only apply (and as such will only be used without invokin=
g UB)
> if the range consists of at least 2 elements. See also the proposed
> paragraph regarding complexity.

I disagree. Even assuming that we give a meaning to the expression=20
(i-1), in the case i =3D=3D begin() the expression *(i-1) produces undefi=
ned=20
behaviour. It's very different from saying that the expression "doesn't=20
apply" to i. It's plain wrong to require the evaluation of an expression=20
that have undefined value. Moreover, it's not so "immediate" to me that=20
i =3D=3D begin() is outside the scope of the algorithm. Putting a "+1" is=
=20
harmless and clarify the issue, IMHO.

> b) I couldn't really figure out what the wording "For a nonempty range"=
 in
> DR 202 is really good for. The comparison conditions does IMHO clearly
> indicate what I have written in a).

If you don't put the clause "for a nonempty range" then the expression=20
begin()+1 produces undefined behaviour, so you can't even talk about the=20
range [begin()+1, last). So you can't even talk about comparisons=20
conditions as those are logically dependent on the assumption "for all i=20
in the range [begin()+1, last)".

> Note that such an (implicit) assumption
> would also cover the case of a range consisting of only a single elemen=
t.
> The wording "For a nonempty range, however only considers empty ranges,=
 not
> ranges of a single element. And then, of course, we can't use the itera=
tors
> for comparison and we'd be back where we started.

If the list has exactly one element then the range [begin()+1, end()) is=20
empty. Any statement that begins with a "for all x in an empty set" is=20
mathematically always true. So there's no need to make a special case=20
for 1-element ranges.

>>"Effect: If empty() =3D=3D false, erases all elements referred to by th=
e
>>iterator i in the range [begin() + 1, end()), for which the following
>>conditions hold: *(i-1) =3D=3D *i (for the version of unique with no
>>argument) or pred(*(i-1), *i) !=3D false (for the version of unique wit=
h a
>>predicate argument)."
>>
>>Notice that with the additional requirement that pred is an equivalence
>>relatione (as DR 202 suggest it should be) there is no ambiguity about
>>the predicate being checked before of after erasing the intermediate
>>elements.
>>
>>Is there something I have overlooked?
>=20
> I fail to see how the problem of size() =3D=3D 1 is taken into account =
if the
> assumptions my proposed wording implied do not apply. [Not sure if a wo=
rding
> "If size() > 1, ..." helps, because it could still mean - without impli=
cit
> assumptions - that then end() will be dereferenced. But again, notice I=
 am
> not sure about - better say I simply don't know - how iterator range bo=
unds
> are then to be interpreted.]

For 1-element lists, see above. I cannot see any case in which you would=20
need to dereference end(), could you please elaborate?

Iterators bounds are to be interpreted as in mathematics, i.e.: "for all=20
i in range [a, b)" means that i can be a, a+1, a+2, ecc. up to b,=20
*excluding* b. In particular, [a, a) is the empty range so there is no=20
iterator i in range [a, a). It's all clearly stated in =A724.1/7.

Alberto

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: nospam@pop.ucsd.edu ("Thomas Mang")
Date: Wed, 15 Dec 2004 05:03:07 GMT
Raw View
""Thomas Mang"" <nospam@nospam.ucar.edu> schrieb im Newsbeitrag
news:41bf3d19$0$26176$3b214f66@usenet.univie.ac.at...
>
 [Not sure if a wording
> "If size() > 1, ..." helps, because it could still mean - without implicit
> assumptions - that then end() will be dereferenced. But again, notice I am
> not sure about - better say I simply don't know - how iterator range
bounds
> are then to be interpreted.]

Okay, sorry, that's nonsense because it clearly says "the _elements_
referred to by iterator i". And there's also the complexity requirement. I
was a bit tired.....


Thomas


---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: Thomas Mang <a9804814@unet.univie.ac.at>
Date: Wed, 15 Dec 2004 22:58:24 CST
Raw View

Alberto Barbati schrieb:

> Thomas Mang wrote:
> > "Alberto Barbati" <AlbertoBarbati@libero.it> schrieb im Newsbeitrag
> >
> > I have thought about it quite a bit, unfortunately I missed explanation in
> > my DR. First, I was more attracted by the alternative resolution of DR202 [I
> > read the DR in a way I supposed the alternative resolution is considered to
> > suit the needs better. I might have been wrong regarding this], and this one
> > says "first", not "first + 1" [Although it also says (first, last) and not
> > [first, last) :-)]. I didn't use "begin() + 1" and the wording "For a
> > nonempty range", for two reasons:
> >
> > a) I expect everyone reading the condition will immediately see that no
> > element can't be erased if the condition doesn't apply, and the condition
> > can certainly only apply (and as such will only be used without invoking UB)
> > if the range consists of at least 2 elements. See also the proposed
> > paragraph regarding complexity.
>
> I disagree. Even assuming that we give a meaning to the expression
> (i-1), in the case i == begin() the expression *(i-1) produces undefined
> behaviour. It's very different from saying that the expression "doesn't
> apply" to i. It's plain wrong to require the evaluation of an expression
> that have undefined value. Moreover, it's not so "immediate" to me that
> i == begin() is outside the scope of the algorithm. Putting a "+1" is
> harmless and clarify the issue, IMHO.

If it clarifies the issue, then I am certainly pro it (In general, I am of the
opinion it's better to be overly explicit and clear than to leave doubts). In this
particular case, the complexity requirements seemed to be explicit enough in _my_
eyes.

Note that the same issue applies also to my DR about std::unique and the
alternative resolution of DR 202 (which, BTW, does not require the condition to be
an equivalence relation and thus attracted me a bit more :-). I think we both
agree that the exact new wording - in case there will be new wording - will depend
on how the issue of DR 202 is solved. That's why my DR explicitly referred to DR
202.


Thomas

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: Thomas Mang <a9804814@unet.univie.ac.at>
Date: Sun, 12 Dec 2004 22:42:57 +0000 (UTC)
Raw View
Defect Report: std::list<T, Allocator>::unique




In Section 23.2.2.4 [lib.list.ops], paragraphs 19 to 21 describe the
behavior of the std::list<T, Allocator>::unique operation. However, the
current wording is defective for various reasons.




1) Analysis of current wording:


23.2.2.4 [lib.list.ops], paragraph 19:

Current wording says:
"Effects:  Eliminates all but the first element from every consecutive
group of equal elements referred to by the iterator i in the range
[first + 1, last) for which *i == *(i - 1) (for the version of unique
with no argument) or pred(*i, *(i -1)) (for the version of unique with a
predicate argument) holds."


This sentences makes use of the undefined term "Eliminates". Although it
is, to a certain degree, reasonable to consider the term "eliminate"
synonymous with "erase", using "Erase" in the first place, as the
wording of 23.2.2.4 [lib.list.ops], paragraph 15 does, would be clearer.

The range of the elements referred to by iterator i is "[first + 1,
last)". However, neither "first" nor "last" is defined.

The sentence makes three times use of iterator arithmetic expressions (
"first + 1", "*i == *(i - 1)", "pred(*i, *(i -1))" ) which is not
defined for bidirectional iterator [see DR submitted by Thomas Mang
regarding invalid iterator arithmetic expressions].

The same problems as pointed out in DR 202 (equivalence relation / order
of arguments for pred()) apply to this paragraph.



23.2.2.4 [lib.list.ops], paragraph 20:

Current wording says:
"Throws: Nothing unless an exception in thrown by *i == *(i-1) or
pred(*i, *(i - 1))"


The sentence makes two times use of invalid iterator arithmetic
expressions ( "*i == *(i - 1)", "pred(*i, *(i -1))" ).

[Note: Minor typos: "in" / missing dot at end of sentence.]



23.2.2.4 [lib.list.ops], paragraph 21:

Current wording says:
"Complexity: If the range (last - first) is not empty, exactly (last -
first) - 1 applications of the corresponding predicate, otherwise no
application of the predicate.


See DR 315 regarding "(last - first)" not yielding a range.

Invalid iterator arithmetic expression "(last - first) - 1" left .




2) Description of intended behavior:


For the rest of this Defect Report, it is assumed that "eliminate" is
supposed to be synonymous to "erase", that "first" is equivalent to an
iterator obtained by a call to begin(), "last" is equivalent to an
iterator obtained by a call to end(), and that all invalid iterator
arithmetic expressions are resolved as described in DR submitted by
Thomas Mang regarding invalid iterator arithmetic expressions.

Furthermore, the resolutions of DR 202 are considered regarding
equivalence relation and order of arguments for a call to pred.

All implementations known to the author of this Defect Report comply
with these assumptions, apart from the impact of the alternative
resolution of DR 202. Except for the changes implied by the resolutions
of DR 202, no impact on current code is expected.




3) Proposed fixes:


Change 23.2.2.4 [lib.list.ops], paragraph 19 to:

"Effect: Erases all but the first element from every consecutive group
of elements, referred to by the iterator i in the range [begin(),
end()), for which the following conditions hold: *(i-1) == *i (for the
version of unique with no argument) or pred(*(i-1), *i) != false (for
the version of unique with a predicate argument)."


Comments to the new wording:

a) The new wording was influenced by DR 202 and the resolutions
presented there. If DR 202 is resolved in another way, the proposed
wording need also additional review.
b) "Erases" refers in the author's opinion unambiguously to the member
function "erase". In case there is doubt this might not be unamgibuous,
a direct reference to the member function "erase" is suggested [Note:
This would also imply a change of 23.2.2.4 [lib.list.ops], paragraph
15.].
c) The expression "(i - 1)" was left, but is expected that DR submitted
by Thomas Mang regarding invalid iterator arithmetic expressions will
take this into account.
d) The wording "(for the version of unique with no argument)" and "(for
the version of unique with a predicate argument)" was kept consciously
for clarity.
e) "begin()" substitutes "first", and "end()" substitutes "last". The
range need adjustment from "[first + 1, last)" to "[begin(), end())" to
ensure a valid range in case of an empty list.
f) If it is considered that the wording is unclear whether it declares
the element of a group which consists of only a single element
implicitly to be the first element of this group [Note: Such an
interpretation could eventually arise especially in case size() == 1] ,
the following additional sentence is proposed: "If such a group of
elements consists of only a single element, this element is also
considered the first element."



Change 23.2.2.4 [lib.list.ops], paragraph 20 to:

"Throws: Nothing unless an exception is thrown by *(i-1) == *i or
pred(*(i-1), *i)."


Comments to the new wording:

a) The wording regarding the conditions is identical to proposed
23.2.2.4 [lib.list.ops], paragraph 19. If 23.2.2.4 [lib.list.ops],
paragraph 19 is resolved in another way, the proposed wording need also
additional review.
b) The expression "(i - 1)" was left, but is expected that DR submitted
by Thomas Mang regarding invalid iterator arithmetic expressions will
take this into account.
c) Typos fixed.



Change 23.2.2.4 [lib.list.ops], paragraph 21 to:

"Complexity: If empty() == false, exactly size() - 1 applications of the
corresponding predicate, otherwise no applications of the corresponding
predicate."


Comments to the new wording:

a) The new wording is supposed to also replace the proposed resolution
of DR 315, which suffers from the problem of undefined "first" / "last".





5) References to other DRs:


See DR 202.
See DR 239.
See DR 315.
See DR submitted by Thomas Mang regarding invalid iterator arithmetic
expressions.





Thomas Mang
December 12th, 2004



[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: Alberto Barbati <AlbertoBarbati@libero.it>
Date: Mon, 13 Dec 2004 22:28:41 CST
Raw View
Thomas Mang wrote:
>
> 3) Proposed fixes:
>
>
> Change 23.2.2.4 [lib.list.ops], paragraph 19 to:
>
> "Effect: Erases all but the first element from every consecutive group
> of elements, referred to by the iterator i in the range [begin(),
> end()), for which the following conditions hold: *(i-1) == *i (for the
> version of unique with no argument) or pred(*(i-1), *i) != false (for
> the version of unique with a predicate argument)."
>
> <snip>
>
> a) The new wording was influenced by DR 202 and the resolutions
> presented there. If DR 202 is resolved in another way, the proposed
> wording need also additional review.
> <snip>
> e) "begin()" substitutes "first", and "end()" substitutes "last". The
> range need adjustment from "[first + 1, last)" to "[begin(), end())" to
> ensure a valid range in case of an empty list.

The "+1" in "first+1" is necessary to make the expression "(i-1)" valid.
This has been taken into account in DR 202 by keeping the "+1" and
adding the words "For nonempty ranges". To be consistent, the proposed
wording might be:

"Effect: If empty() == false, erases all but the first element from
every consecutive group of elements, referred to by the iterator i in
the range [begin() + 1, end()), for which the following conditions hold:
*(i-1) == *i (for the version of unique with no argument) or
pred(*(i-1), *i) != false (for the version of unique with a predicate
argument)."

However, I have an objection on this wording. It looks redundant to me
to say "the first element of every consecutive group of elements" as the
algorithm is more clearly and more exactly described by the predicates
that follows. In fact, it's not clear from that wording whether the
first element of the sequence is eligible to be the first element of a
"consecutive group of elements". I would therefore simplify the sentence
by writing:

"Effect: If empty() == false, erases all elements referred to by the
iterator i in the range [begin() + 1, end()), for which the following
conditions hold: *(i-1) == *i (for the version of unique with no
argument) or pred(*(i-1), *i) != false (for the version of unique with a
predicate argument)."

Notice that with the additional requirement that pred is an equivalence
relatione (as DR 202 suggest it should be) there is no ambiguity about
the predicate being checked before of after erasing the intermediate
elements.

Is there something I have overlooked?

Alberto

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]