Topic: Irregular iterators and the Ranges feature
Author: Nicol Bolas <jmckesson@gmail.com>
Date: Wed, 27 Jun 2018 08:09:43 -0700 (PDT)
Raw View
------=_Part_49456_1940335030.1530112183460
Content-Type: multipart/alternative;
boundary="----=_Part_49457_986095933.1530112183460"
------=_Part_49457_986095933.1530112183460
Content-Type: text/plain; charset="UTF-8"
C++ lays down a number of iterator requirements. The Ranges TS being merged
into C++20 formalizes (and modifies) them into Concepts. Most iterators in
the standard library follow the named requirements, and most of them will
also follow the Range Concepts.
But there are a few iterators that never did follow the named requirements.
Or more specifically, they told a falsehood about the kind of iterator they
were.
The most infamous case is `vector<bool>::iterator`. The ForwardIterator
requirement requires that `reference` must be `value_type&` (or `const
value_type&`), which `vector<bool>::iterator` doesn't fulfill. It is a
proxy iterator; its `reference` type is a object convertible to `bool`.
The thing is, the Range TS provides support for proxy iterators. The
ForwardIterator *Concept* introduced by Ranges makes this a legal iterator.
And if you want to iterate over the elements of a `vector<bool>` and modify
the elements, you do it like this: `for(auto &&b : vecbool)`. That would
work for any kind of modifiable range. This is the common idiom that C++20
will be supporting.
I bring this up because there are a number of other irregular iterators
that *don't* get fixed by the Range TS's concepts. Indeed,
`vector<bool>::iterator` is the only one that wasn't technically a proper
iterator which becomes one.
The simplest case of a non-fixed iterator is `filesystem::path::iterator`.
This is ostensibly a BidirectionalIterator, a more specialized version of
ForwardIterator. But ForwardIterator requires that an iterator provide
multipass capabilities. Specifically, the values in the sequence accessed
through an iterator must be independent of the value of the iterator. If I
get a reference to an element in the sequence, merely changing the iterator
that produce it *must not* affect that reference.
`filesystem::path::iterator` doesn't work that way. Indeed, the standard
specifically calls this out <http://eel.is/c++draft/fs.path.itr>when it
claims that it is a BidirectionalIterator.
Now, why does `path::iterator` not provide multipass? This is because it is
an attempt to fix the problem of proxy iterator by creating a different
type of iterator: a so-called "stashing" iterator. The iterator actually
stores a value_type within itself, so that `operator*` can return a proper
reference (returning values from `operator*` is what makes an iterator a
proxy iterator). But by doing so, you break the multipass requirements of
ForwardIterator.
Because of the long-standing issue with proxy iterators, there are actually
a fair number of stashing iterators in the standard: `path::iterator`,
`regex_[token]_iterator`, and probably others. The stream iterators dodge
this issue because they are only Input/OutputIterators, and those have no
expectation of multipass.
Since proxy iterators are now legal under the Ranges Concepts, it would
make sense to convert these into proxy iterators: their `reference` type is
the same as their `value_type`. Of course, that is a breaking change;
anyone who ever did `auto &val = *it;` will have their code broken. The
Ranges idiom is to use `auto &&` to capture a (possibly modifiable)
reference to an element.
Now, we can easily add proper Range-based iterators for regexes (or more
specifically, add a proper range type for regex iteration that uses the
iterator/sentinel paradigm). But `path::iterator` is a part of `path`'s
natural range interface. I'm not sure it's a good idea to mess with that.
At the same time, because the multipass requirement of the ForwardIterator
range Concept is not something which can be detected by the concept
requirements, there is no way for a piece of code to recognize that
`path::iterator` isn't actually a ForwardIterator. This makes it somewhat
fragile to use.
Then again, since `path::iterator` never really fulfilled the
BidirectionalIterator requirements, maybe it was always fragile to use.
That would mean fixing it would be resolving a defect.
Any ideas?
--
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-proposals@isocpp.org.
To view this discussion on the web visit https://groups.google.com/a/isocpp.org/d/msgid/std-proposals/b74f25d2-4852-40cc-8c5a-b6a2359ad1d0%40isocpp.org.
------=_Part_49457_986095933.1530112183460
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
<div dir=3D"ltr"><div>C++ lays down a number of iterator requirements. The =
Ranges TS being merged into C++20 formalizes (and modifies) them into Conce=
pts. Most iterators in the standard library follow the named requirements, =
and most of them will also follow the Range Concepts.</div><div><br></div><=
div>But there are a few iterators that never did follow the named requireme=
nts. Or more specifically, they told a falsehood about the kind of iterator=
they were.</div><div><br></div><div>The most infamous case is `vector<b=
ool>::iterator`. The ForwardIterator requirement requires that `referenc=
e` must be `value_type&` (or `const value_type&`), which `vector<=
;bool>::iterator` doesn't fulfill. It is a proxy iterator; its `refe=
rence` type is a object convertible to `bool`.</div><div><br></div><div>The=
thing is, the Range TS provides support for proxy iterators. The ForwardIt=
erator <i>Concept</i> introduced by Ranges makes this a legal iterator. And=
if you want to iterate over the elements of a `vector<bool>` and mod=
ify the elements, you do it like this: `for(auto &&b : vecbool)`. T=
hat would work for any kind of modifiable range. This is the common idiom t=
hat C++20 will be supporting.</div><div><br></div><div>I bring this up beca=
use there are a number of other irregular iterators that <i>don't</i> g=
et fixed by the Range TS's concepts. Indeed, `vector<bool>::itera=
tor` is the only one that wasn't technically a proper iterator which be=
comes one.</div><div><br></div><div>The simplest case of a non-fixed iterat=
or is `filesystem::path::iterator`. This is ostensibly a BidirectionalItera=
tor, a more specialized version of ForwardIterator. But ForwardIterator req=
uires that an iterator provide multipass capabilities. Specifically, the va=
lues in the sequence accessed through an iterator must be independent of th=
e value of the iterator. If I get a reference to an element in the sequence=
, merely changing the iterator that produce it <i>must not</i> affect that =
reference.</div><div><br></div><div>`filesystem::path::iterator` doesn'=
t work that way. Indeed, <a href=3D"http://eel.is/c++draft/fs.path.itr">the=
standard specifically calls this out </a>when it claims that it is a Bidir=
ectionalIterator.</div><div><br></div><div>Now, why does `path::iterator` n=
ot provide multipass? This is because it is an attempt to fix the problem o=
f proxy iterator by creating a different type of iterator: a so-called &quo=
t;stashing" iterator. The iterator actually stores a value_type within=
itself, so that `operator*` can return a proper reference (returning value=
s from `operator*` is what makes an iterator a proxy iterator). But by doin=
g so, you break the multipass requirements of ForwardIterator.</div><div><b=
r></div><div>Because of the long-standing issue with proxy iterators, there=
are actually a fair number of stashing iterators in the standard: `path::i=
terator`, `regex_[token]_iterator`, and probably others. The stream iterato=
rs dodge this issue because they are only Input/OutputIterators, and those =
have no expectation of multipass.</div><div><br></div><div>Since proxy iter=
ators are now legal under the Ranges Concepts, it would make sense to conve=
rt these into proxy iterators: their `reference` type is the same as their =
`value_type`. Of course, that is a breaking change; anyone who ever did `au=
to &val =3D *it;` will have their code broken. The Ranges idiom is to u=
se `auto &&` to capture a (possibly modifiable) reference to an ele=
ment.</div><div><br></div><div>Now, we can easily add proper Range-based it=
erators for regexes (or more specifically, add a proper range type
for regex iteration that uses the iterator/sentinel paradigm). But `path::i=
terator` is a=20
part of `path`'s natural range interface. I'm not sure it's a g=
ood idea=20
to mess with that.</div><div><br></div><div>At the same time, because the m=
ultipass requirement of the ForwardIterator range Concept is not something =
which can be detected by the concept requirements, there is no way for a pi=
ece of code to recognize that `path::iterator` isn't actually a Forward=
Iterator. This makes it somewhat fragile to use.</div><div><br></div><div>T=
hen again, since `path::iterator` never really fulfilled the BidirectionalI=
terator requirements, maybe it was always fragile to use. That would mean f=
ixing it would be resolving a defect.</div><div><br></div><div>Any ideas?<b=
r></div></div>
<p></p>
-- <br />
You received this message because you are subscribed to the Google Groups &=
quot;ISO C++ Standard - Future Proposals" group.<br />
To unsubscribe from this group and stop receiving emails from it, send an e=
mail to <a href=3D"mailto:std-proposals+unsubscribe@isocpp.org">std-proposa=
ls+unsubscribe@isocpp.org</a>.<br />
To post to this group, send email to <a href=3D"mailto:std-proposals@isocpp=
..org">std-proposals@isocpp.org</a>.<br />
To view this discussion on the web visit <a href=3D"https://groups.google.c=
om/a/isocpp.org/d/msgid/std-proposals/b74f25d2-4852-40cc-8c5a-b6a2359ad1d0%=
40isocpp.org?utm_medium=3Demail&utm_source=3Dfooter">https://groups.google.=
com/a/isocpp.org/d/msgid/std-proposals/b74f25d2-4852-40cc-8c5a-b6a2359ad1d0=
%40isocpp.org</a>.<br />
------=_Part_49457_986095933.1530112183460--
------=_Part_49456_1940335030.1530112183460--
.