Topic: Using a weaker comparison function with the
Author: Paul Smith <pl.smith.mail@gmail.com>
Date: Fri, 21 Dec 2012 18:57:11 +0200
Raw View
This is an (informal) proposal to add versions of the associative
containers' search operations that use a different comparison function
than the one used by the container.
The associative containers' search operations, namely `find`, `count`,
`lower_bound`, `upper_bound` and `equal_range` all take a search key
of the same type as the container's key, and use the container's
comparison function ([associative.reqmts]). This is in contrast to the
corresponding binary search algorithms, namely `lower_bound`,
`upper_bound`, `equal_range` and `binary_search`, which take a search
key of arbitrary type, and any comparison function that, together with
the search key, induces a valid partition of the range
([alg.binary.search]). The use of such hetrogenous comparison
functions with these algorithms is discussed in LWG issue #270 [1] and
N1313 [2], among others.
In the above references, the main rationale for supporting hetrogenous
comparison is avoiding the need to construct a superflous object when
searching a range of key/value objects, where the key is the only real
comparison parameter. Admittedly, that's not too relevant to the
associative containers - that's what we have `[multi]map` for.
However, there is another reason why we might want to search an
associative container with a different comparsion function: we might
want to use one that induces a weaker ordering, that is, one that
generates coarser equivalence classes. For example, say I have a set
of strings (with the usual lexicographical order). I want to find all
the elements that begin with a certain substring. With the current
interface there's no clean way to do this [3]. Yet, it would have been
easily possible if I could pass an appropriate comparison function to
`equal_range`. Oh sure, I could use the generic algorithm, but since
we're dealing with non random-access iterators, it's orders of
magnitude slower than the container's version (tested with libstdc++
and Dinkumware).
I propose adding versions of the associative containers' search
operations that accept a generic key argument and a hetrogenous
comparison function, and use them to search the container, subject to
similar conditions to those of the binary search algorithms, namely
that [container.begin(), container.end()) is partitioned w.r.t. the
comparison function and the key. Perhaps it's more useful to state
this requirement in terms of the container's comparison function
instead: if `f` is the function formed by binding the search key to
one of the comparison function's (the one passed to the search
function) arguments and `comp` is the container's comparison function,
then the following must hold: `f(x) && !comp(x, y) ==> f(y)`. (In
english: if `x` is before the point of partition and `y <= x` then `y`
is also before the point of partition).
I also propose adding versions of these functions that take a generic
key argument, without a comparison function. In this case, the
container's comparison function is used. This way we can use an
overloaded comparison object that acts both as a strict weak order and
as a set of hetrogenous overloads, and spare the additional argument
to the search functions. It also feels more in-sync with the generic
binary search algorithms.
As for implementability - it's pretty much just a matter of replacing
the use of the container's comparison function with the one passed to
the search function (unless the implementation is very unusual). I
added these overloads to my libstdc++ headers without a problem. I
skimmed through the Dinkumware implementation and it looks about the
same.
Thoughts? Comments?
[1] http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-defects.html#270
[2] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2001/n1313.html
[3] Even if you can come up with some dodgy way to do this using
`lower_bound` and `upper_bound`, that's just missing the point.
Suppose I replaced `string` with a lexicographically-ordered
`vector<C>`, where `C` is a type that has a notion of order, but not
of preceding and succeeding values.
--
Paul Smith
--
.
Author: =?ISO-8859-1?Q?Daniel_Kr=FCgler?= <daniel.kruegler@gmail.com>
Date: Fri, 21 Dec 2012 17:59:53 +0100
Raw View
2012/12/21 Paul Smith <pl.smith.mail@gmail.com>:
> This is an (informal) proposal to add versions of the associative
> containers' search operations that use a different comparison function
> than the one used by the container.
Just a very quick response (and my apologize for not having full read
your contribution).
Can you point out significant differences to
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3465.pdf
?
Thanks,
- Daniel
>
> The associative containers' search operations, namely `find`, `count`,
> `lower_bound`, `upper_bound` and `equal_range` all take a search key
> of the same type as the container's key, and use the container's
> comparison function ([associative.reqmts]). This is in contrast to the
> corresponding binary search algorithms, namely `lower_bound`,
> `upper_bound`, `equal_range` and `binary_search`, which take a search
> key of arbitrary type, and any comparison function that, together with
> the search key, induces a valid partition of the range
> ([alg.binary.search]). The use of such hetrogenous comparison
> functions with these algorithms is discussed in LWG issue #270 [1] and
> N1313 [2], among others.
>
> In the above references, the main rationale for supporting hetrogenous
> comparison is avoiding the need to construct a superflous object when
> searching a range of key/value objects, where the key is the only real
> comparison parameter. Admittedly, that's not too relevant to the
> associative containers - that's what we have `[multi]map` for.
>
> However, there is another reason why we might want to search an
> associative container with a different comparsion function: we might
> want to use one that induces a weaker ordering, that is, one that
> generates coarser equivalence classes. For example, say I have a set
> of strings (with the usual lexicographical order). I want to find all
> the elements that begin with a certain substring. With the current
> interface there's no clean way to do this [3]. Yet, it would have been
> easily possible if I could pass an appropriate comparison function to
> `equal_range`. Oh sure, I could use the generic algorithm, but since
> we're dealing with non random-access iterators, it's orders of
> magnitude slower than the container's version (tested with libstdc++
> and Dinkumware).
>
> I propose adding versions of the associative containers' search
> operations that accept a generic key argument and a hetrogenous
> comparison function, and use them to search the container, subject to
> similar conditions to those of the binary search algorithms, namely
> that [container.begin(), container.end()) is partitioned w.r.t. the
> comparison function and the key. Perhaps it's more useful to state
> this requirement in terms of the container's comparison function
> instead: if `f` is the function formed by binding the search key to
> one of the comparison function's (the one passed to the search
> function) arguments and `comp` is the container's comparison function,
> then the following must hold: `f(x) && !comp(x, y) ==> f(y)`. (In
> english: if `x` is before the point of partition and `y <= x` then `y`
> is also before the point of partition).
>
> I also propose adding versions of these functions that take a generic
> key argument, without a comparison function. In this case, the
> container's comparison function is used. This way we can use an
> overloaded comparison object that acts both as a strict weak order and
> as a set of hetrogenous overloads, and spare the additional argument
> to the search functions. It also feels more in-sync with the generic
> binary search algorithms.
>
> As for implementability - it's pretty much just a matter of replacing
> the use of the container's comparison function with the one passed to
> the search function (unless the implementation is very unusual). I
> added these overloads to my libstdc++ headers without a problem. I
> skimmed through the Dinkumware implementation and it looks about the
> same.
>
>
> Thoughts? Comments?
>
>
> [1] http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-defects.html#270
> [2] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2001/n1313.html
>
> [3] Even if you can come up with some dodgy way to do this using
> `lower_bound` and `upper_bound`, that's just missing the point.
> Suppose I replaced `string` with a lexicographically-ordered
> `vector<C>`, where `C` is a type that has a notion of order, but not
> of preceding and succeeding values.
>
> --
> Paul Smith
>
> --
>
>
>
--
.
Author: Paul Smith <pl.smith.mail@gmail.com>
Date: Fri, 21 Dec 2012 09:51:09 -0800 (PST)
Raw View
------=_Part_375_25162656.1356112269910
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
On Friday, December 21, 2012 6:59:53 PM UTC+2, Daniel Kr=FCgler wrote:
>
> 2012/12/21 Paul Smith <pl.smi...@gmail.com <javascript:>>:=20
> > This is an (informal) proposal to add versions of the associative=20
> > containers' search operations that use a different comparison function=
=20
> > than the one used by the container.=20
>
> Just a very quick response (and my apologize for not having full read=20
> your contribution).=20
> Can you point out significant differences to=20
>
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3465.pdf=20
>
> ?=20
>
>
Oh, this looks like exactly the same thing! I completely missed it
somehow. Would have saved me some typing :)
Please add a +1 from me when this proposal is being discussed.
My only concern is with the suggestion to completely replace the
existing overloads that take a `const key_type&` with the generic
ones. I was thinking about it too, and it definitely fits the bill
in most cases, but I'm slightly worried about currently valid code
where the type of the generic key could not be deduced. For
example, when using an initializer-list as the argument. Perhaps
it's better to keep the current overloads around, even at the cost of
slightly increasing the surface area.=20
--
Paul Smith
--=20
------=_Part_375_25162656.1356112269910
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
<br><br>On Friday, December 21, 2012 6:59:53 PM UTC+2, Daniel Kr=FCgler wro=
te:<blockquote class=3D"gmail_quote" style=3D"margin: 0;margin-left: 0.8ex;=
border-left: 1px #ccc solid;padding-left: 1ex;">2012/12/21 Paul Smith <<=
a href=3D"javascript:" target=3D"_blank" gdf-obfuscated-mailto=3D"jWtuVw4jP=
xgJ">pl.smi...@gmail.com</a>>:
<br>> This is an (informal) proposal to add versions of the associative
<br>> containers' search operations that use a different comparison func=
tion
<br>> than the one used by the container.
<br>
<br>Just a very quick response (and my apologize for not having full read
<br>your contribution).
<br>Can you point out significant differences to
<br>
<br><a href=3D"http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n346=
5.pdf" target=3D"_blank">http://www.open-std.org/jtc1/<wbr>sc22/wg21/docs/p=
apers/2012/<wbr>n3465.pdf</a>
<br>
<br>?
<br><br></blockquote><div><br></div><div>Oh, this looks like exactly the sa=
me thing! I completely missed it</div><div>somehow. Would have saved me som=
e typing :)</div><div>Please add a +1 from me when this proposal is being d=
iscussed.</div><div><br></div><div><br></div><div>My only concern is with t=
he suggestion to completely replace the</div><div>existing overloads that t=
ake a `const key_type&` with the generic</div><div>ones. I was thinking=
about it too, and it definitely fits the bill</div><div>in most cases, but=
I'm slightly worried about currently valid code</div><div>where the&=
nbsp;type of the generic key could not be deduced. For</div><div>example, w=
hen using an initializer-list as the argument. Perhaps</div><div>it's&=
nbsp;better to keep the current overloads around, even at the cost of<=
/div><div>slightly increasing the surface area. </div><div><br></=
div><div>--</div><div>Paul Smith</div>
<p></p>
-- <br />
<br />
<br />
<br />
------=_Part_375_25162656.1356112269910--
.
Author: Nikolay Ivchenkov <tsoae@mail.ru>
Date: Tue, 25 Dec 2012 03:53:02 -0800 (PST)
Raw View
------=_Part_272_27903125.1356436382663
Content-Type: text/plain; charset=ISO-8859-1
On Friday, December 21, 2012 9:51:09 PM UTC+4, Paul Smith wrote:
> My only concern is with the suggestion to completely replace the
> existing overloads that take a `const key_type&` with the generic
> ones. I was thinking about it too, and it definitely fits the bill
> in most cases, but I'm slightly worried about currently valid code
> where the type of the generic key could not be deduced. For
> example, when using an initializer-list as the argument.
>
Such changes may also slow down existing programs. For example, under the
current specification, here
struct X;
struct Y
{
Y(X const &);
};
void f(std::map<Y, std::string> &m, X const &x)
{
auto i = m.find(x);
g(m, i);
}
value of type X is converted to type Y only once. With the proposed
interface such a conversion would take place at every call of the
comparison function. I think that "generic" versions should have different
names - e.g.
erase - erase_for
find - find_for
count - count_for
lower_bound - lower_bound_for
upper_bound - upper_bound_for
equal_range - equal_range_for
operator[] - mapped_for
at - mapped_at
--
------=_Part_272_27903125.1356436382663
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
On Friday, December 21, 2012 9:51:09 PM UTC+4, Paul Smith wrote:<br><blockq=
uote class=3D"gmail_quote" style=3D"margin: 0;margin-left: 0.8ex;border-lef=
t: 1px #ccc solid;padding-left: 1ex;"><div>My only concern is with the sugg=
estion to completely replace the</div><div>existing overloads that take a `=
const key_type&` with the generic</div><div>ones. I was thinking about =
it too, and it definitely fits the bill</div><div>in most cases, but I'm sl=
ightly worried about currently valid code</div><div>where the ty=
pe of the generic key could not be deduced. For</div><div>example, when&nbs=
p;using an initializer-list as the argument.</div></blockquote><div><br>Suc=
h changes may also slow down existing programs. For example, under the curr=
ent specification, here<br><br> struct X;<br><br> &n=
bsp; struct Y<br> {<br> &nbs=
p; Y(X const &);<br> };<br><br> &nbs=
p; void f(std::map<Y, std::string> &m, X const &x)<br>&=
nbsp; {<br> auto i =
=3D m.find(x);<br> g(m, i);<br>&n=
bsp; }<br><br>value of type X is converted to type Y only once.=
With the proposed interface such a conversion would take place at every ca=
ll of the comparison function. I think that "generic" versions should have =
different names - e.g.<br><br>erase - erase_for<br>find - find_for<br>count=
- count_for<br>lower_bound - lower_bound_for<br>upper_bound - upper_bound_=
for<br>equal_range - equal_range_for<br>operator[] - mapped_for<br>at - map=
ped_at<br><br></div>
<p></p>
-- <br />
<br />
<br />
<br />
------=_Part_272_27903125.1356436382663--
.
Author: Paul Smith <pl.smith.mail@gmail.com>
Date: Tue, 25 Dec 2012 06:28:42 -0800 (PST)
Raw View
------=_Part_744_24639726.1356445722175
Content-Type: text/plain; charset=ISO-8859-1
On Tuesday, December 25, 2012 1:53:02 PM UTC+2, Nikolay Ivchenkov wrote:
>
> On Friday, December 21, 2012 9:51:09 PM UTC+4, Paul Smith wrote:
>
>> My only concern is with the suggestion to completely replace the
>> existing overloads that take a `const key_type&` with the generic
>> ones. I was thinking about it too, and it definitely fits the bill
>> in most cases, but I'm slightly worried about currently valid code
>> where the type of the generic key could not be deduced. For
>> example, when using an initializer-list as the argument.
>>
>
> Such changes may also slow down existing programs. For example, under the
> current specification, here
>
> struct X;
>
> struct Y
> {
> Y(X const &);
> };
>
> void f(std::map<Y, std::string> &m, X const &x)
> {
> auto i = m.find(x);
> g(m, i);
> }
>
> value of type X is converted to type Y only once. With the proposed
> interface such a conversion would take place at every call of the
> comparison function. I think that "generic" versions should have different
> names - e.g.
>
> erase - erase_for
> find - find_for
> count - count_for
> lower_bound - lower_bound_for
> upper_bound - upper_bound_for
> equal_range - equal_range_for
> operator[] - mapped_for
> at - mapped_at
>
>
Ha! Just yesterday I mailed Joaquin (the author of N3465) about the
same issue exactly ;)
The poster child for me was searching a set of (std::)strings using a
'const char*'. Note that the proposal mentions the possibility of
replacing the monomorphic 'std::less' with a polymorphic version, in
which case the same code actually benefits, by eliminating the string
temporary completely. So, there are two sides to this coin.
Sadly, I don't see an obvious solution to this (I don't like the idea
of using different names for these functions too much, but then again,
neither do I like any other solution I can think of). I'm not sure how
serious this issue really is in practice. I don't want to throw around
unsubstantiated assessments. I do believe (or at least hope) that it's
not a show-stopper, though.
--
Paul Smith
--
------=_Part_744_24639726.1356445722175
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
<br><br>On Tuesday, December 25, 2012 1:53:02 PM UTC+2, Nikolay Ivchenkov w=
rote:<blockquote class=3D"gmail_quote" style=3D"margin: 0;margin-left: 0.8e=
x;border-left: 1px #ccc solid;padding-left: 1ex;">On Friday, December 21, 2=
012 9:51:09 PM UTC+4, Paul Smith wrote:<br><blockquote class=3D"gmail_quote=
" style=3D"margin:0;margin-left:0.8ex;border-left:1px #ccc solid;padding-le=
ft:1ex"><div>My only concern is with the suggestion to completely replace t=
he</div><div>existing overloads that take a `const key_type&` with the =
generic</div><div>ones. I was thinking about it too, and it definitely fits=
the bill</div><div>in most cases, but I'm slightly worried about cur=
rently valid code</div><div>where the type of the generic key could no=
t be deduced. For</div><div>example, when using an initializer-list as=
the argument.</div></blockquote><div><br>Such changes may also slow down e=
xisting programs. For example, under the current specification, here<br><br=
> struct X;<br><br> struct Y<br> &=
nbsp; {<br> Y(X const &=
);<br> };<br><br> void f(std::map<Y,=
std::string> &m, X const &x)<br> {<br> &=
nbsp; auto i =3D m.find(x);<br> &n=
bsp; g(m, i);<br> }<br><br>value =
of type X is converted to type Y only once. With the proposed interface suc=
h a conversion would take place at every call of the comparison function. I=
think that "generic" versions should have different names - e.g.<br><br>er=
ase - erase_for<br>find - find_for<br>count - count_for<br>lower_bound - lo=
wer_bound_for<br>upper_bound - upper_bound_for<br>equal_range - equal_range=
_for<br>operator[] - mapped_for<br>at - mapped_at<br><br></div></blockquote=
><div><br></div>Ha! Just yesterday I mailed Joaquin (the author of N3465) a=
bout the<br>same issue exactly ;)<br>The poster child for me was searching =
a set of (std::)strings using a<br>'const char*'. Note that the proposal me=
ntions the possibility of<br>replacing the monomorphic 'std::less' with a p=
olymorphic version, in<br>which case the same code actually benefits, by el=
iminating the string<br>temporary completely. So, there are two sides to th=
is coin.<br><br>Sadly, I don't see an obvious solution to this (I don't lik=
e the idea<br>of using different names for these functions too much, but th=
en again,<br>neither do I like any other solution I can think of). I'm not =
sure how<br>serious this issue really is in practice. I don't want to throw=
around<br>unsubstantiated assessments. I do believe (or at least hope) tha=
t it's<br>not a show-stopper, though.<br><div><br></div><div>--</div><div>P=
aul Smith</div>
<p></p>
-- <br />
<br />
<br />
<br />
------=_Part_744_24639726.1356445722175--
.
Author: Nikolay Ivchenkov <tsoae@mail.ru>
Date: Tue, 25 Dec 2012 07:38:40 -0800 (PST)
Raw View
------=_Part_1_437542.1356449920494
Content-Type: text/plain; charset=ISO-8859-1
On Tuesday, December 25, 2012 6:28:42 PM UTC+4, Paul Smith wrote:
> The poster child for me was searching a set of (std::)strings using a
> 'const char*'. Note that the proposal mentions the possibility of
> replacing the monomorphic 'std::less' with a polymorphic version, in
> which case the same code actually benefits, by eliminating the string
> temporary completely. So, there are two sides to this coin.
>
I was going to write a proposal about find_for, count_for, etc, where
comparison between std::string and char const * would be a main motivating
example. It's too late to do it now due to presence of N3465 :-)
> Sadly, I don't see an obvious solution to this (I don't like the idea
> of using different names for these functions too much, but then again,
> neither do I like any other solution I can think of).
IMO, name separation is the best solution here. No ambiguities, no breaking
changes. This approach is already applied to push_back and emplace_back,
for example.
--
------=_Part_1_437542.1356449920494
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
On Tuesday, December 25, 2012 6:28:42 PM UTC+4, Paul Smith wrote:<br><block=
quote class=3D"gmail_quote" style=3D"margin: 0;margin-left: 0.8ex;border-le=
ft: 1px #ccc solid;padding-left: 1ex;">The poster child for me was searchin=
g a set of (std::)strings using a<br>'const char*'. Note that the proposal =
mentions the possibility of<br>replacing the monomorphic 'std::less' with a=
polymorphic version, in<br>which case the same code actually benefits, by =
eliminating the string<br>temporary completely. So, there are two sides to =
this coin.<br></blockquote><div><br>I was going to write a proposal about f=
ind_for, count_for, etc, where=20
comparison between std::string and char const * would be a main=20
motivating example. It's too late to do it now due to presence of N3465 :-)=
<br> <br></div><blockquote class=3D"gmail_quote" style=3D"margin: 0;ma=
rgin-left: 0.8ex;border-left: 1px #ccc solid;padding-left: 1ex;">Sadly, I d=
on't see an obvious solution to this (I don't like the idea<br>of using dif=
ferent names for these functions too much, but then again,<br>neither do I =
like any other solution I can think of).</blockquote><div><br>IMO, name sep=
aration is the best solution here. No ambiguities, no breaking changes. Thi=
s approach is already applied to push_back and emplace_back, for example.</=
div>
<p></p>
-- <br />
<br />
<br />
<br />
------=_Part_1_437542.1356449920494--
.
Author: Paul Smith <pl.smith.mail@gmail.com>
Date: Tue, 25 Dec 2012 09:29:17 -0800 (PST)
Raw View
------=_Part_3_20053284.1356456557503
Content-Type: text/plain; charset=ISO-8859-1
On Tuesday, December 25, 2012 5:38:40 PM UTC+2, Nikolay Ivchenkov wrote:
>
> On Tuesday, December 25, 2012 6:28:42 PM UTC+4, Paul Smith wrote:
>
>> Sadly, I don't see an obvious solution to this (I don't like the idea
>>
> of using different names for these functions too much, but then again,
>> neither do I like any other solution I can think of).
>
>
> IMO, name separation is the best solution here. No ambiguities, no
> breaking changes. This approach is already applied to push_back and
> emplace_back, for example.
>
I think that at least part of the reason why 'emplace_*' got a distinct name
was more "philosophical" than technical, so it might not directly apply
here.
That's a good point, nonetheless. The LWG definitely made the most
conservative choice in that case, so your suggestion might very well be the
most appropriate one.
--
Paul Smith
--
------=_Part_3_20053284.1356456557503
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
<br><br>On Tuesday, December 25, 2012 5:38:40 PM UTC+2, Nikolay Ivchenkov w=
rote:<blockquote class=3D"gmail_quote" style=3D"margin: 0;margin-left: 0.8e=
x;border-left: 1px #ccc solid;padding-left: 1ex;">On Tuesday, December 25, =
2012 6:28:42 PM UTC+4, Paul Smith wrote:<br><blockquote class=3D"gmail_quot=
e" style=3D"margin:0;margin-left:0.8ex;border-left:1px #ccc solid;padding-l=
eft:1ex">Sadly, I don't see an obvious solution to this (I don't like the i=
dea<br></blockquote><blockquote class=3D"gmail_quote" style=3D"margin:0;mar=
gin-left:0.8ex;border-left:1px #ccc solid;padding-left:1ex">of using differ=
ent names for these functions too much, but then again,<br>neither do I lik=
e any other solution I can think of).</blockquote><div><br>IMO, name separa=
tion is the best solution here. No ambiguities, no breaking changes. This a=
pproach is already applied to push_back and emplace_back, for example.</div=
></blockquote><div><br></div><div><br></div><div>I think that at least part=
of the reason why 'emplace_*' got a distinct name</div><div>was more "phil=
osophical" than technical, so it might not directly apply here.</div><div>T=
hat's a good point, nonetheless. The LWG definitely made the most</div><div=
>conservative choice in that case, so your suggestion might very well be th=
e</div><div>most appropriate one.</div><div><br></div><div>--</div><div>Pau=
l Smith</div>
<p></p>
-- <br />
<br />
<br />
<br />
------=_Part_3_20053284.1356456557503--
.
Author: Nikolay Ivchenkov <tsoae@mail.ru>
Date: Tue, 25 Dec 2012 11:50:27 -0800 (PST)
Raw View
------=_Part_474_11225213.1356465027994
Content-Type: text/plain; charset=ISO-8859-1
On Tuesday, December 25, 2012 9:29:17 PM UTC+4, Paul Smith wrote:
>
> I think that at least part of the reason why 'emplace_*' got a distinct
> name
> was more "philosophical" than technical, so it might not directly apply
> here.
>
There are at least 6 basic cases where "perfect forwarding" fails:
#define FORWARD(ref) static_cast<decltype(ref)>(ref)
template <class T>
struct X
{
static void g(T) {}
template <class U>
static void f(U &&x)
{ g(FORWARD(x)); }
};
1) Passing a null-pointer constant of an integer type:
int main()
{
X<int *>::g(0); // OK
X<int *>::f(0); // error
}
2) Passing a function template or a set of overloaded functions/function
templates
template <class T>
void h(T) {}
int main()
{
X<void(int)>::g(h); // OK
X<void(int)>::f(h); // error
}
3) Passing a non-const bit-field lvalue:
struct A
{
signed bitfield : 2;
} a;
int main()
{
X<int>::g(a.bitfield); // OK
X<int>::f(a.bitfield); // error
}
4) Passing an expression whose type includes "pointer to array of unknown
bound of T" or "reference to array of unknown bound of T":
extern int arr[];
int (&ref)[] = arr;
int arr[10];
int main()
{
X<int *>::g(ref); // OK
X<int *>::f(ref); // error
}
5) Passing an undefined object:
struct A
{
static int const undefined = 1;
};
int main()
{
X<int>::g(A::undefined); // OK
X<int>::f(A::undefined); // ODR violation
}
6) Passing a braced-init-list:
int main()
{
X<int>::g({}); // OK
X<int>::f({}); // error
}
If we remove push_back and rename emplace_back to push_back, then valid
C++03 code may fail to compile or silently change the behavior (see cases 1
- 4).
--
------=_Part_474_11225213.1356465027994
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
On Tuesday, December 25, 2012 9:29:17 PM UTC+4, Paul Smith wrote:<blockquot=
e class=3D"gmail_quote" style=3D"margin: 0;margin-left: 0.8ex;border-left: =
1px #ccc solid;padding-left: 1ex;">I think that at least part of the reason=
why 'emplace_*' got a distinct name<div>was more "philosophical" than tech=
nical, so it might not directly apply here.</div></blockquote><div><br>Ther=
e are at least 6 basic cases where "perfect forwarding" fails:<br><br> =
; #define FORWARD(ref) static_cast<decltype(ref)>(ref)<br=
><br> template <class T><br>  =
; struct X<br> {<br> &=
nbsp; static void g(T) {}<br><br> =
template <class U><br> &nbs=
p; static void f(U &&x)<br>&nbs=
p; &=
nbsp; { g(FORWARD(x)); }<br> };<br><br>1) Passing a=
null-pointer constant of an integer type:<br><br> int ma=
in()<br> {<br> =
X<int *>::g(0); // OK<br> X=
<int *>::f(0); // error<br> }<br><br>2) Passing a f=
unction template or a set of overloaded functions/function templates<br><br=
> template <class T><br> &nb=
sp; void h(T) {}<br><br> int main()<br> =
{<br> X<void(int)=
>::g(h); // OK<br> X<void(i=
nt)>::f(h); // error<br> }<br><br>3) Passing a non-con=
st bit-field lvalue:<br><br> struct A<br> &nbs=
p; {<br> signed bitfield : 2;<br>=
} a;<br><br> int main()<br>  =
; {<br> X<int>::g(a.b=
itfield); // OK<br> X<int>:=
:f(a.bitfield); // error<br> }<br><br>4) Passing an expre=
ssion whose type includes "pointer to array of unknown bound of T" or "refe=
rence to array of unknown bound of T":<br><br> extern int=
arr[];<br> int (&ref)[] =3D arr;<br> &nbs=
p; int arr[10];<br><br> int main()<br> =
{<br> X<int *>::g(ref); // =
OK<br> X<int *>::f(ref); //=
error<br> }<br><br>5) Passing an undefined object:<br><b=
r> struct A<br> {<br> =
static int const undefined =3D 1;<br> &=
nbsp; };<br><br> int main()<br> {<br>&n=
bsp; X<int>::g(A::undefined); // =
OK<br> X<int>::f(A::undefin=
ed); // ODR violation<br> }<br><br>6) Passing a braced-in=
it-list:<br><br> int main()<br> {<br>&n=
bsp; X<int>::g({}); // OK<br>&nbs=
p; X<int>::f({}); // error<br>&nb=
sp; } <br><br>If we remove push_back and rename emplace_back to=
push_back, then valid C++03 code may fail to compile or silently change th=
e behavior (see cases 1 - 4).</div>
<p></p>
-- <br />
<br />
<br />
<br />
------=_Part_474_11225213.1356465027994--
.
Author: Paul Smith <pl.smith.mail@gmail.com>
Date: Wed, 26 Dec 2012 09:17:59 -0800 (PST)
Raw View
------=_Part_339_23353427.1356542279114
Content-Type: text/plain; charset=ISO-8859-1
On Tuesday, December 25, 2012 9:50:27 PM UTC+2, Nikolay Ivchenkov wrote:
>
> On Tuesday, December 25, 2012 9:29:17 PM UTC+4, Paul Smith wrote:
>>
>> I think that at least part of the reason why 'emplace_*' got a distinct
>> name
>> was more "philosophical" than technical, so it might not directly apply
>> here.
>>
>
> There are at least 6 basic cases where "perfect forwarding" fails:
>
> [snip cases]
>
> If we remove push_back and rename emplace_back to push_back, then valid
> C++03 code may fail to compile or silently change the behavior (see cases 1
> - 4).
>
Actually, I was referring to the fact that (from my reading of the emplace
proposal
I get the impression that) there was (also) a conceptual/pedagogical motive
behind
the decision to use a different name, and that this motive doesn't
necessarily apply
here.
--
Paul Smith
--
------=_Part_339_23353427.1356542279114
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
<br><br>On Tuesday, December 25, 2012 9:50:27 PM UTC+2, Nikolay Ivchenkov w=
rote:<blockquote class=3D"gmail_quote" style=3D"margin: 0;margin-left: 0.8e=
x;border-left: 1px #ccc solid;padding-left: 1ex;">On Tuesday, December 25, =
2012 9:29:17 PM UTC+4, Paul Smith wrote:<blockquote class=3D"gmail_quote" s=
tyle=3D"margin:0;margin-left:0.8ex;border-left:1px #ccc solid;padding-left:=
1ex">I think that at least part of the reason why 'emplace_*' got a distinc=
t name<div>was more "philosophical" than technical, so it might not directl=
y apply here.</div></blockquote><div><br>There are at least 6 basic cases w=
here "perfect forwarding" fails:<br><br>[snip cases]<br><br>If we remove pu=
sh_back and rename emplace_back to push_back, then valid C++03 code may fai=
l to compile or silently change the behavior (see cases 1 - 4).</div></bloc=
kquote><div><br></div><div><br></div><div>Actually, I was referring to the =
fact that (from my reading of the emplace proposal</div><div>I get the impr=
ession that) there was (also) a conceptual/pedagogical motive behind</div><=
div>the decision to use a different name, and that this motive doesn't=
necessarily apply</div><div>here. </div><div><br></div><div>--</div><=
div>Paul Smith</div>
<p></p>
-- <br />
<br />
<br />
<br />
------=_Part_339_23353427.1356542279114--
.
Author: joaquinlopezmunoz@gmail.com
Date: Sun, 6 Jan 2013 10:07:59 -0800 (PST)
Raw View
------=_Part_172_1050703.1357495679035
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
On Tuesday, December 25, 2012 6:29:17 PM UTC+1, Paul Smith wrote:
>
>
>
> On Tuesday, December 25, 2012 5:38:40 PM UTC+2, Nikolay Ivchenkov wrote:
>>
>> On Tuesday, December 25, 2012 6:28:42 PM UTC+4, Paul Smith wrote:
>>
>>> Sadly, I don't see an obvious solution to this (I don't like the idea
>>>
>> of using different names for these functions too much, but then again,
>>> neither do I like any other solution I can think of).
>>
>>
>> IMO, name separation is the best solution here. No ambiguities, no=20
>> breaking changes. This approach is already applied to push_back and=20
>> emplace_back, for example.
>>
>
>
> I think that at least part of the reason why 'emplace_*' got a distinct=
=20
> name
> was more "philosophical" than technical, so it might not directly apply=
=20
> here.
> That's a good point, nonetheless. The LWG definitely made the most
> conservative choice in that case, so your suggestion might very well be t=
he
> most appropriate one.
>
Considering that N3241 (to the best of my knowledge) has been approved:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3421.htm=20
we might consider having associative containers take less<> as their=20
default comaprison predicate:
template <class Key, class Compare =3D less<>, class Allocator =3D=20
allocator<Key>>=20
class set;=20
so that the const char* / string problem immediately goes away (and results=
=20
automatically in better performance for legacy code using set<string> and=
=20
such.) I don't know whether the perfect forwarding pointed out by Nikolay=
=20
in connection with emplace_back apply here.
Joaqu=EDn M=AA L=F3pez Mu=F1oz
Telef=F3nica Digital
--=20
------=_Part_172_1050703.1357495679035
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
<br><br>On Tuesday, December 25, 2012 6:29:17 PM UTC+1, Paul Smith wrote:<b=
lockquote class=3D"gmail_quote" style=3D"margin: 0;margin-left: 0.8ex;borde=
r-left: 1px #ccc solid;padding-left: 1ex;"><br><br>On Tuesday, December 25,=
2012 5:38:40 PM UTC+2, Nikolay Ivchenkov wrote:<blockquote class=3D"gmail_=
quote" style=3D"margin:0;margin-left:0.8ex;border-left:1px #ccc solid;paddi=
ng-left:1ex">On Tuesday, December 25, 2012 6:28:42 PM UTC+4, Paul Smith wro=
te:<br><blockquote class=3D"gmail_quote" style=3D"margin:0;margin-left:0.8e=
x;border-left:1px #ccc solid;padding-left:1ex">Sadly, I don't see an obviou=
s solution to this (I don't like the idea<br></blockquote><blockquote class=
=3D"gmail_quote" style=3D"margin:0;margin-left:0.8ex;border-left:1px #ccc s=
olid;padding-left:1ex">of using different names for these functions too muc=
h, but then again,<br>neither do I like any other solution I can think of).=
</blockquote><div><br>IMO, name separation is the best solution here. No am=
biguities, no breaking changes. This approach is already applied to push_ba=
ck and emplace_back, for example.</div></blockquote><div><br></div><div><br=
></div><div>I think that at least part of the reason why 'emplace_*' got a =
distinct name</div><div>was more "philosophical" than technical, so it migh=
t not directly apply here.</div><div>That's a good point, nonetheless. The =
LWG definitely made the most</div><div>conservative choice in that case, so=
your suggestion might very well be the</div><div>most appropriate one.</di=
v></blockquote><div><br> Considering that N3241 (to the best of my kno=
wledge) has been approved:<br><br><br><a class=3D"moz-txt-link-freetext" hr=
ef=3D"http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3421.htm">ht=
tp://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3421.htm</a>
<br><br>we might consider having associative containers take less<> a=
s their default comaprison predicate:<br><br>template <class Key, class =
Compare =3D less<>, class Allocator =3D allocator<Key>>
<br>class set;
<br><br>so that the const char* / string problem immediately goes away (and=
=20
results automatically in better performance for legacy code using=20
set<string> and such.) I don't know whether the perfect forwarding po=
inted out by Nikolay in connection with emplace_back apply here.<br><br>Joa=
qu=EDn
M=AA L=F3pez Mu=F1oz<br>Telef=F3nica Digital<br></div>
<p></p>
-- <br />
<br />
<br />
<br />
------=_Part_172_1050703.1357495679035--
.
Author: Paul Smith <pl.smith.mail@gmail.com>
Date: Mon, 7 Jan 2013 21:14:44 +0200
Raw View
On Sun, Jan 6, 2013 at 8:07 PM, <joaquinlopezmunoz@gmail.com> wrote:
>
> Considering that N3241 (to the best of my knowledge) has been approved:
>
>
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3421.htm
>
> we might consider having associative containers take less<> as their defa=
ult
> comaprison predicate:
>
> template <class Key, class Compare =3D less<>, class Allocator =3D
> allocator<Key>>
> class set;
>
> so that the const char* / string problem immediately goes away (and resul=
ts
> automatically in better performance for legacy code using set<string> and
> such.)
It's good to know that N3241 is in (no surprises here), it definitely
paves the way towards making std::less<> the default. Note that N3465
is probably the only real impetus to actually make this change ATM,
and given the discussed issues, it's very desirable to make
std::less<> the default in conjuction with it. So, perhaps the
importance of this should be stressed a little more - maybe even
provide the necessary wording (it's such a trivial change after all).
But let's not pretend that this solves the problem completely. I agree
that in most cases users don't specify a custom comparator (so they
get an automatic "upgrade"), and in those cases, I'd like to belive
that a polymorphic std::less<> does solve the issue (but it need not
necessarily be so). Yet, I'm sure one or two people out there sort
their container using std::greater<key_type>, or
case_insensitive_string_compare, etc'... and these are far from being
corner cases.
An interesting question is: do we consider a silent performance
degradation (but otherwise identical semantics) a better thing or
worse than a hard break? I mean, legacy code keeps working, and people
can gradually adapt over time. What I dislike the most about the
alternative solutions is that they inherently make new code feel like
a second-class citizen. That's not my idea of moving forward (in
utopia-land :).
To be more constructive, here's a brief survey of possible options
(assuming we keep the existing overloads and use std::less<>):
1. Add the new overloads as suggested.
Pros: Most elegant. Straight-forward. In some cases can benefit existing co=
de.
Cons: In some cases, can break or degrade the performance of existing code.
Conservative options:
2. Only add the two-argument version. If the user wants to use
hetrogenous comparison, they must provide an explicit comparison
function (which can be 'container.key_comp()').
Pros: Most straightforward option. Smallest surface area. Doesn't
affect existing code. Doesn't pollute the class name-space.
Cons: Redundant and verbose when using the container's comparison
function. Existing code can't benefit.
3. Use a different name for the single-argument version, and possibly
for the two-argument version too.
Pros: Straightforward. Not as verbose or redundant as option #2.
Doesn't affect existing code.
Cons: Name separation for purely technical reasons. Pollutes the class
name-space (we're talking at least 5 different functions here).
Existing code can't benefit.
4. Use some sort of tagging for hetrogenous lookup.
The emplace proposal suggests that at some point an alternative
approach was considered. Instead of 'v.emplace_back(x, y, z)' use:
'v.push_back(emplace(x, y, z))'. Likewise, we can use:
's.find(hetrogenous(key))'. It looks like at that time people were
concerned about the implementability of this in an efficient manner,
but note that our case is much more simple: it doesn't involve
variadics or argument forwarding. All we need is a simple
reference-wrapper, no bells and whistles. There are other ways to
acheive this too, for example: 's.find(hetrogenous, key)'.
Pros: More uniform than #3. Not redundant like #2. Doesn't pollute the
class name-space. Doesn't affect existing code.
Cons: Not as straightforward as the rest. Verbose. Users who 'use
namespace std;' will be tempted to use an unqualified function call to
'hetrogenous', so theoretically ADL can find an unrelated function
(long shot, though, because it will have to beat a perfect match and
to be well-formed in this context). Existing code can't benefit. Open
ended: should the two-argument overloads accept 'hetrogenous'? Should
they require it in order to invoke hetrogenous lookup?
Provocative options:
5. Come up with a condition that enables/disables the hetrogenous
overloads based on the container's key type, the lookup key type and
the comparison function type.
This can be very tricky. I doubt that it's possible to hit the perfect
key with the current language. Will probably require user assistance
in the form of a traits class or whatever, and be a best-effort guess
otherwise.
Pros: Attempts to seamlessly balance between backward compatibility
and forward thinking. Existing code might be able to benefit.
Cons: Imperfect. Intricate. Hard to specify and to comprehend. Can
open all sorts of pandora boxes (ODR violations, etc'...)
I apologize for the length of this post. It does feel a little bit
like bike-shedding, but I think that the core issues here are relevant
not only to this proposal but to future proposals that will run into
the same dilemmas as well.
> I don't know whether the perfect forwarding pointed out by Nikolay in
> connection with emplace_back apply here.
#1 is probably the only concerning case, and was also mentioned in the
emplace proposal. It occurs, for example, when someone tries to look
up 0 in a set of pointers (will become ill-formed, because the
constant-expression-ness of 0 is lost and integers can't be otherwise
converted to pointers). C++11 solves this with nullptr, though not
retroactively. As for NULL, the standard only says that it is an
implementation-defined null pointer constant. That is, it can be
nullptr, but also 0, and given the long history of NULL, I doubt that
implementers are going to define it as nullptr any time soon.
#3 is specific to 'T&&' parameters while we're dealing with 'const T&'
parameters.
As for #5, using undefined static-const members in such contexts is so
fragile, that I consider such code badly written even if it's
currently well-formed. In other words, I don't care too much about
breaking it.
The rest of the cases are deduction failures, which are easily
addressed by keeping the existing overloads.
>
> Joaqu=EDn M=AA L=F3pez Mu=F1oz
> Telef=F3nica Digital
--=20
.