Topic: Defect Report: std::unique wrongly specified
Author: Thomas Mang <a9804814@unet.univie.ac.at>
Date: Sun, 12 Dec 2004 22:42:56 +0000 (UTC) Raw View
Defect Report: std::unique
In Section 25.2.8 [lib.alg.unique], paragraphs 1 to 3 describe the
behavior of the mutating sequence operation std::unique. However, the
wording does not reflect the intended behavior [Note: See definition of
intended behavior below] of these algorithms, as it is known to the C++
community [1].
1) Analysis of current wording:
25.2.8 [lib.alg.unique], paragraph 1:
Current wording says:
"Effects: Eliminates all but the first element from every consecutive
group of equal elements referred to by the iterator i in the range
[first, last) for which the following corresponding conditions hold: *i
== *(i - 1) or pred(*i, *(i -1)) != false"
This sentences expresses specifically that all elements denoted by the
(original) range [first, last) which are not but the first element from
a consecutive group of equal elements (where equality is defined as *i
== *(i - 1) or pred(*i, *(i - 1)) ! = false) [Note: See DR 202], call
them r-elements [Note: r...remove], will be eliminated. Since there is
no formal definition of the term "eliminate" provided, it is undefined
how this "elimination" takes place. But the meaning of "eliminate" in
everyday language seems to disallow explicitly that after application of
the algorithm, any r-element will remain at any position of the range
[first, last) [2].
Another defect in the current wording concerns the iterators used to
compare two elements for equality: The current wording contains the
expression "(i - 1)", which is not covered by 25/9 [Note: See DR
submitted by Thomas Mang regarding invalid iterator arithmetic
expressions].
25.2.8 [lib.alg.unique], paragraph 2:
Current wording says:
"Returns: The end of the resulting range."
The resulting range is not specified. In combination with 25.2.8
[lib.alg.unique], paragraph 1, one reasonable interpretation (in the
author's opinion even the only possible interpretation) of this
so-called resulting range is the range [first, last) - thus returning
always the ForwardIterator 'last' parameter.
2) Description of intended behavior:
For the rest of this Defect Report, it is assumed that the intended
behavior was that all elements denoted by the original range [first,
last) which are the first element from a consecutive group of elements
for which the corresponding conditions: *(i-1) == *i (for the version of
unique without a predicate argument) or pred(*(i-1), *i) ! = false (for
the version of unique with a predicate argument) [Note: If such a group
of elements consists of only a single element, this is also considered
the first element] [Note: See resolutions of DR 202], call them
s-elements [Note: s...stay], will be placed into a contiguous subrange
of [first, last), denoted by the iterators [first, return value). The
number of elements in the resulting range [first, return value) shall be
equal to the number of s-elements in the original range [first, last).
Invalid iterator arithmetic expressions are expected to be resolved as
proposed in DR submitted by Thomas Mang regarding invalid iterator
arithmetic expressions. It is also assumed by the author that the
relative order of the elements in the resulting subrange [first, return
value) shall be the same as the relative order of the corresponding
elements (the s-elements) in the original range [Note: If this was not
intended behavior, the additional proposed paragraph about stable order
will certainly become obsolete].
Furthermore, the resolutions of DR 202 are partially considered.
All implementations known to the author of this Defect Report comply
with this intent [Note: Except possible effects of DR 202]. Since this
intent of the behavior (contrary to the current wording) is also
described in various utility references serving the C++ community [1],
it is not expected that fixing the paragraphs will influence current
code [Note: Except possible effects of DR 202] - unless the code relies
on the behavior as it is described by current wording and the
implementation indeed reflects the current wording, and not the intent.
3) Proposed fixes:
Change 25.2.8 [lib.alg.unique], paragraph 1 to:
"Effect: Places the first element from every consecutive group of
elements, referred to by the iterator i in the range [first, last), for
which the following conditions hold: *(i-1) == *i (for the version of
unique without a predicate argument) or pred(*(i -1), *i) != false (for
the version of unique with a predicate argument), into the subrange
[first, k) of the original range, where k shall denote a value of type
ForwardIterator."
Comments to the new wording:
a) The new wording was influenced by the resolutions of DR 202. If DR
202 is resolved in another way, the proposed wording need also
additional review.
b) "Places" has no special meaning, and the everyday language meaning
should fit.
c) The expression "(i - 1)" was left, but is expected that DR submitted
by Thomas Mang regarding invalid iterator arithmetic expressions will
take this into account.
d) The wording "(for the version of unique without a predicate
argument)" and "(for the version of unique with a predicate argument)"
was added consciously for clarity and is in resemblence with current
23.2.2.4 [lib.list.ops], paragraph 19. It might be considered redundant.
e) The wording "of the original range" might be redundant, since any
subrange starting at first and containing no more elements than the
original range is implicitly a subrange of the original range [first,
last).
f) The iterator k was introduced instead of "return value" in order to
avoid a cyclic dependency on 25.2.8 [lib.alg.unique], paragraph 2. The
wording ", where k shall denote a value of type ForwardIterator" might
be redundant, because it follows implicitly by 25.2.8 [lib.alg.unique],
paragraph 2.
g) "Places" does, in the author's opinion, explicitly forbid duplicating
any s-element in the original range [first, last) within the resulting
range [first, k). If there is doubt this term might be not unambiguous
regarding this, it is suggested that k is specified more closely by the
following wording: "k shall denote a value of type ForwardIterator
[Note: See f)] so that k - first is equal to the number of elements in
the original range [first, last) being the first element from every
consecutive group of elements for which the corresponding condition did
hold". This could also be expressed as a separate paragraph
"Postcondition:".
h) If it is considered that the wording is unclear whether it declares
the element of a group which consists of only a single element
implicitly to be the first element of this group [Note: Such an
interpretation could eventually arise especially in case last - first ==
1] , the following additional sentence is proposed: "If such a group of
elements consists of only a single element, this element is also
considered the first element."
Change 25.2.8 [lib.alg.unique], paragraph 2 to:
"Returns: The iterator k."
Add a separate paragraph "Notes:" as 25.2.8 [lib.alg.unique], paragraph
2a or 3a, or a separate paragraph "Postcondition:" before 25.2.8
[lib.alg.unique], paragraph 2 (wording inside {} shall be eliminated if
the preceding expressions are used, or the preceding expressions shall
be eliminated if wording inside {} is used):
"Notes:{Postcondition:} Stable: the relative order of the elements that
are placed into the subrange [first, return value {k}) shall be the same
as their relative order was in the original range [first, last) prior to
application of the algorithm."
Comments to the new wording:
a) It is assumed by the author that the algorithm was intended to be
stable.
In case this was not the intent, this paragraph becomes certainly
obsolete.
b) The wording "was ... prior to application of the algorithm" is used
to explicitly distinguish the original range not only by means of
iterators, but also by a 'chronological' factor from the resulting range
[first, return value). It might be redundant.
25.2.8 [lib.alg.unique], paragraph 3:
See DR 239.
4) References to other DRs:
See DR 202, but which does not address any of the problems described in
this Defect Report [Note: This DR is supposed to complement DR 202].
See DR 239.
See DR submitted by Thomas Mang regarding invalid iterator arithmetic
expressions.
[1]:
The wording of these references is not always unambiguous, and provided
examples partially contradict verbal description of the algorithms,
because the verbal description resembles the problematic wording of
ISO/IEC 14882:2003.
[2]:
Illustration of conforming implementations according to current wording:
One way the author of this DR considers how this "elimination" could be
achieved by a conforming implementation according to current wording is
by substituting each r-element by _any_ s-element [Note: s...stay; any
non-r-element], since all r-elements are "eliminated".
In case of a sequence consisting of elements being all 'equal' [Note:
See DR 202], substituting each r-element by the single s-element is the
only possible solution according to current wording.
Thomas Mang
December 12th, 2004
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]