Topic: Proposal: improve insertion interface of std::map and std::unordered_map
Author: tkoeppe@google.com
Date: Tue, 7 Jan 2014 13:39:44 -0800 (PST)
Raw View
------=_Part_830_17621246.1389130784151
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable
I would like to propose a small addition to the standard library "map" and=
=20
"unordered_map" containers to overcome some limitiations and relax some=20
unnecessary restrictions.
Here is a small, motivating example (all in namespace std, for brevity):
*// Setup*
*map<string, unique_ptr<Foo>> m;*
*static_cast<void>(m["foo"]);*
*// The problem*
*unique_ptr<Foo> p(new Foo);*
*auto it =3D m.emplace(**"foo"**, move(p));*
Question: What is the state of *p*? Has it been moved from or not?
The answer is in fact that it depends on your library implementation.=20
Specifically, it may depend on something like whether internal helper=20
functions use deducing templates or "value_type &&" arguments -- in fact,=
=20
GCC 4.6.3 and GCC 4.8.2 give different answers (and it also differs between=
=20
ordered and unordered map).
*The problem* is that insert() may fail and not perform an insertion (if=20
the key already exists), but in that case it is *unspecified* whether=20
arguments are moved-from.
*The solution*, after deliberation and discussion, is to propose additional=
=20
interfaces. Those will turn out to have further benefits besides solving=20
the above problem, which are discussed below:
Add the following interfaces to std::map and std::unordered_map:
*pair<iterator, bool>*
*stable_insert(key_type const & k, mapped_type const & v)*
*pair<iterator, bool>*
*stable_insert(key_type const & k, mapped_type && v)*
*template <typename ...Args>*
*pair<iterator, bool>*
*stable_emplace(key_type const & k, Args &&... args)*
*Description:* If the key *k* already exists in the map, does nothing.=20
Otherwise, inserts the element constructed from the arguments into the map.=
=20
Returns [the same as insert/emplace].
*Additional guarantee:* If the key already exists, no dynamic allocations=
=20
are performed and no exceptions are thrown.
Also add the following:
*pair<iterator, bool>*
*always_insert(value_type const & x)*
*pair<iterator, bool>*
*always_insert(value_type && x)*
*template <typename ...Args>*
*pair<iterator, bool>*
*always_emplace(Args &&... args)*
Description: Inserts the element constructed from the arguments at the key=
=20
determined by that element. Returns: The iterator to the inserted element=
=20
and a boolean indicating whether the key had already existed prior to the=
=20
insertion. Note: This is an improved version of =93*m[k] =3D v;*=94 which d=
oes=20
not require unnecessary default construction and returns a more informative=
=20
result.
*A non-solution: *I would like to stress that neither of the following=20
"alternatives" works:
-=20
- m.emplace(k, move(v))
m.insert(make_pair(k, move(v))) // or similar
This may create a pair object irrespective of whether the key already=20
exists. The problem here is that this takes ownership of move-only objec=
ts=20
without guaranteeing that the acquired object is stored or destroyed=20
immediately.
=20
- m.insert(pair<K &&, T &&>(move(k), move(v))
Likewise, this may or may not create a pair object even when the key=20
already exists. This one is even more insidious, since it may look like =
the=20
caller retains ownership if the insertion does not take place, but this =
is=20
in fact not guaranteed. (The implementation may internally create a new =
pair<K=20
const, T> from the arguments before checking whether the insertion is=20
needed.)
*A bit of discussion.* Several real-world codebases include helper=20
functions to perform "unconditional insertions" ("insert-or-update") into=
=20
maps. All of them require boilerplate, some are inefficient, and none are=
=20
elegant. Here are a few possible implementations:
- m[k] =3D v; or m[k] =3D move(v);
This always default-constructs an element, even though it is immediately=
=20
assigned-to. (Actually, this *requires* default-constructbility in the=
=20
first place, which isn't even necessary otherwise.)
- auto it =3D find(k) followed by insert() or *it =3D move(v)
Not terrible, but it performs the map search (tree walk or hashing)=20
twice.
- auto it =3D lower_bound(k) followed by hinted insertion
Near-optimal for std::map, but specific to the container (i.e. does not=
=20
work for unordered maps), and few people remember this one. Moreover, th=
ere=20
is no analogue for std::unordered_map that would avoid repeated hashing.
The proposed new stable-insert/always-insert interfaces would remedy all=20
this. They're as efficient as possible. Implementation is a=20
straight-forward modification (at least as far as I can tell from=20
libstdc++). Usage is straight-forward, too, and fairly self-explanatory=20
(modulu figuring out good names for the functions). Finally, the return=20
value adds useful information in the case of always-insert which isn't=20
available with the operator[]-approach, and which is essentially free to=20
obtain.
What sets this interface apart from the other standard library containers=
=20
is that every other interface operates only in terms of the container's=20
*value_type*. For this proposal, it is necessary to separate out the=20
*key_type* and the *mapped_type*. Perhaps this departure from the norm is=
=20
why such an interface has been missing till now, and at the same time I=20
would hope that this is an argument in its favour. It would essentially go=
=20
under a "specialized algorithms" section.
Many thanks, and I would appreaciate your feedback greatly!
Thomas
--=20
---=20
You received this message because you are subscribed to the Google Groups "=
ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an e=
mail to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-proposals@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/std-proposa=
ls/.
------=_Part_830_17621246.1389130784151
Content-Type: text/html; charset=windows-1252
Content-Transfer-Encoding: quoted-printable
<div dir=3D"ltr">I would like to propose a small addition to the standard l=
ibrary "map" and "unordered_map" containers to overcome some limitiations a=
nd relax some unnecessary restrictions.<div><br></div><div>Here is a small,=
motivating example (all in namespace std, for brevity):</div><div><br></di=
v><blockquote style=3D"margin: 0 0 0 40px; border: none; padding: 0px;"><di=
v><font face=3D"courier new, monospace" color=3D"#660000"><b>// Setup</b></=
font></div><div><div><font face=3D"courier new, monospace" color=3D"#660000=
"><b>map<string, unique_ptr<Foo>> m;</b></font></div></div><div=
><div><font face=3D"courier new, monospace" color=3D"#660000"><b>static_cas=
t<void>(m["foo"]);</b></font></div></div><div><div><font face=3D"cour=
ier new, monospace" color=3D"#660000"><b><br></b></font></div></div><div><d=
iv><font face=3D"courier new, monospace" color=3D"#660000"><b>// The proble=
m</b></font></div></div><div><div><font face=3D"courier new, monospace" col=
or=3D"#660000"><b>unique_ptr<Foo> p(new Foo);</b></font></div></div><=
div><div><font face=3D"courier new, monospace" color=3D"#660000"><b>auto it=
=3D m.emplace(</b></font><b style=3D"color: rgb(102, 0, 0); font-family: '=
courier new', monospace;">"foo"</b><b style=3D"color: rgb(102, 0, 0); font-=
family: 'courier new', monospace;">, move(p));</b></div></div></blockquote>=
<div><br></div><div>Question: What is the state of <font face=3D"courier ne=
w, monospace"><b>p</b></font>? Has it been moved from or not?</div><div><br=
></div><div>The answer is in fact that it depends on your library implement=
ation. Specifically, it may depend on something like whether internal helpe=
r functions use deducing templates or "value_type &&" arguments -- =
in fact, GCC 4.6.3 and GCC 4.8.2 give different answers (and it also differ=
s between ordered and unordered map).</div><div><br></div><div><b>The probl=
em</b> is that insert() may fail and not perform an insertion (if the key a=
lready exists), but in that case it is <i>unspecified</i> whether arguments=
are moved-from.</div><div><br></div><div><b>The solution</b>, after delibe=
ration and discussion, is to propose additional interfaces. Those will turn=
out to have further benefits besides solving the above problem, which are =
discussed below:</div><div><br></div><blockquote style=3D"margin: 0 0 0 40p=
x; border: none; padding: 0px;"><div><div><font size=3D"2" color=3D"#274e13=
">Add the following interfaces to std::map and std::unordered_map:</font></=
div></div><div><div><font size=3D"2" color=3D"#274e13"><br></font></div></d=
iv></blockquote><blockquote style=3D"margin: 0 0 0 40px; border: none; padd=
ing: 0px;"><blockquote style=3D"margin: 0 0 0 40px; border: none; padding: =
0px;"><div><div><font size=3D"2" color=3D"#274e13" face=3D"courier new, mon=
ospace"><b>pair<iterator, bool></b></font></div></div></blockquote><b=
lockquote style=3D"margin: 0 0 0 40px; border: none; padding: 0px;"><div><d=
iv><font size=3D"2" color=3D"#274e13" face=3D"courier new, monospace"><b>st=
able_insert(key_type const & k, mapped_type const & v)</b></font></=
div></div></blockquote><blockquote style=3D"margin: 0 0 0 40px; border: non=
e; padding: 0px;"><div><div><font size=3D"2" color=3D"#274e13" face=3D"cour=
ier new, monospace"><b><br></b></font></div></div></blockquote><blockquote =
style=3D"margin: 0 0 0 40px; border: none; padding: 0px;"><div><div><font s=
ize=3D"2" color=3D"#274e13" face=3D"courier new, monospace"><b>pair<iter=
ator, bool></b></font></div></div></blockquote><blockquote style=3D"marg=
in: 0 0 0 40px; border: none; padding: 0px;"><div><div><font size=3D"2" col=
or=3D"#274e13" face=3D"courier new, monospace"><b>stable_insert(key_type co=
nst & k, mapped_type && v)</b></font></div></div></blockquote><=
blockquote style=3D"margin: 0 0 0 40px; border: none; padding: 0px;"><div><=
div><font size=3D"2" color=3D"#274e13" face=3D"courier new, monospace"><b><=
br></b></font></div></div></blockquote><blockquote style=3D"margin: 0 0 0 4=
0px; border: none; padding: 0px;"><div><div><font size=3D"2" color=3D"#274e=
13" face=3D"courier new, monospace"><b>template <typename ...Args></b=
></font></div></div></blockquote><blockquote style=3D"margin: 0 0 0 40px; b=
order: none; padding: 0px;"><div><div><font size=3D"2" color=3D"#274e13" fa=
ce=3D"courier new, monospace"><b>pair<iterator, bool></b></font></div=
></div></blockquote><blockquote style=3D"margin: 0 0 0 40px; border: none; =
padding: 0px;"><div><div><font size=3D"2" color=3D"#274e13" face=3D"courier=
new, monospace"><b>stable_emplace(key_type const & k, Args &&.=
... args)</b></font></div></div></blockquote></blockquote><blockquote style=
=3D"margin: 0 0 0 40px; border: none; padding: 0px;"><div><div><font size=
=3D"2" color=3D"#274e13"><br></font></div></div><div><div><font size=3D"2" =
color=3D"#274e13"><b>Description:</b> If the key <b><font face=3D"courier n=
ew, monospace">k</font></b> already exists in the map, does nothing. Otherw=
ise, inserts the element constructed from the arguments into the map. Retur=
ns [the same as insert/emplace].</font></div></div><div><div><font size=3D"=
2" color=3D"#274e13"><br></font></div></div><div><div><font size=3D"2" colo=
r=3D"#274e13"><b>Additional guarantee:</b> If the key already exists, no dy=
namic allocations are performed and no exceptions are thrown.</font></div><=
/div><div><div><font size=3D"2" color=3D"#274e13"><br></font></div></div><d=
iv><div><font size=3D"2" color=3D"#274e13">Also add the following:</font></=
div></div><div><div><font size=3D"2" color=3D"#274e13"><br></font></div></d=
iv></blockquote><blockquote style=3D"margin: 0 0 0 40px; border: none; padd=
ing: 0px;"><blockquote style=3D"margin: 0 0 0 40px; border: none; padding: =
0px;"><div><div><font size=3D"2" color=3D"#274e13" face=3D"courier new, mon=
ospace"><b>pair<iterator, bool></b></font></div></div></blockquote><b=
lockquote style=3D"margin: 0 0 0 40px; border: none; padding: 0px;"><div><d=
iv><font size=3D"2" color=3D"#274e13" face=3D"courier new, monospace"><b>al=
ways_insert(value_type const & x)</b></font></div></div></blockquote><b=
lockquote style=3D"margin: 0 0 0 40px; border: none; padding: 0px;"><div><d=
iv><font size=3D"2" color=3D"#274e13" face=3D"courier new, monospace"><b><b=
r></b></font></div></div></blockquote><blockquote style=3D"margin: 0 0 0 40=
px; border: none; padding: 0px;"><div><div><font size=3D"2" color=3D"#274e1=
3" face=3D"courier new, monospace"><b>pair<iterator, bool></b></font>=
</div></div></blockquote><blockquote style=3D"margin: 0 0 0 40px; border: n=
one; padding: 0px;"><div><div><font size=3D"2" color=3D"#274e13" face=3D"co=
urier new, monospace"><b>always_insert(value_type && x)</b></font><=
/div></div></blockquote><blockquote style=3D"margin: 0 0 0 40px; border: no=
ne; padding: 0px;"><div><div><font size=3D"2" color=3D"#274e13" face=3D"cou=
rier new, monospace"><b><br></b></font></div></div></blockquote><blockquote=
style=3D"margin: 0 0 0 40px; border: none; padding: 0px;"><div><div><font =
size=3D"2" color=3D"#274e13" face=3D"courier new, monospace"><b>template &l=
t;typename ...Args></b></font></div></div></blockquote><blockquote style=
=3D"margin: 0 0 0 40px; border: none; padding: 0px;"><div><div><font size=
=3D"2" color=3D"#274e13" face=3D"courier new, monospace"><b>pair<iterato=
r, bool></b></font></div></div></blockquote><blockquote style=3D"margin:=
0 0 0 40px; border: none; padding: 0px;"><div><div><font size=3D"2" color=
=3D"#274e13" face=3D"courier new, monospace"><b>always_emplace(Args &&a=
mp;... args)</b></font></div></div></blockquote></blockquote><blockquote st=
yle=3D"margin: 0 0 0 40px; border: none; padding: 0px;"><div><div><font siz=
e=3D"2" color=3D"#274e13"><br></font></div></div><div><div><font size=3D"2"=
color=3D"#274e13">Description: Inserts the element constructed from the ar=
guments at the key determined by that element. Returns: The iterator to the=
inserted element and a boolean indicating whether the key had already exis=
ted prior to the insertion. Note: This is an improved version of =93<b><fon=
t face=3D"courier new, monospace">m[k] =3D v;</font></b>=94 which does not =
require unnecessary default construction and returns a more informative res=
ult.</font></div></div></blockquote><div><br></div><div><b>A non-solution: =
</b>I would like to stress that neither of the following "alternatives" wor=
ks:</div><div><ul><li></li><li><font face=3D"courier new, monospace"><span =
style=3D"line-height: normal;">m.emplace(k, move(v))<br></span><span style=
=3D"line-height: normal;">m.insert(make_pair(k, move(v))) // or simi=
lar<br></span></font><span style=3D"line-height: normal;">This may create a=
pair object irrespective of whether the key already exists. The problem he=
re is that this takes ownership of move-only objects without guaranteeing t=
hat the acquired object is stored or destroyed immediately.<br><br></span><=
/li><li><span style=3D"line-height: normal;"><font face=3D"courier new, mon=
ospace">m.insert(pair<K &&, T &&>(move(k), move(v))</=
font><br></span><span style=3D"line-height: normal;">Likewise, this may or =
may not create a pair object even when the key already exists. This one is =
even more insidious, since it may look like the caller retains ownership if=
the insertion does not take place, but this is in fact not guaranteed. (Th=
e implementation may internally create a new <font face=3D"courier new, mon=
ospace">pair<K const, T></font> from the arguments before checking wh=
ether the insertion is needed.)</span></li></ul></div><div><b>A bit of disc=
ussion.</b> Several real-world codebases include helper functions to p=
erform "unconditional insertions" ("insert-or-update") into maps. All of th=
em require boilerplate, some are inefficient, and none are elegant. Here ar=
e a few possible implementations:</div><div><ul><li><font face=3D"courier n=
ew, monospace" style=3D"line-height: normal;">m[k] =3D v;</font> or <font f=
ace=3D"courier new, monospace"><span style=3D"line-height: normal;">m[k] =
=3D move(v);</span></font><br><span style=3D"line-height: normal;">This alw=
ays default-constructs an element, even though it is immediately assigned-t=
o. (Actually, this <i>requires</i> default-constructbility in the first pla=
ce, which isn't even necessary otherwise.)</span></li><li><font face=3D"cou=
rier new, monospace"><span style=3D"line-height: normal;">auto it =3D find(=
k)</span></font> followed by <font face=3D"courier new, monospace"><span st=
yle=3D"line-height: normal;">insert()</span></font> or <font face=3D"courie=
r new, monospace"><span style=3D"line-height: normal;">*it =3D move(v)</spa=
n></font><br>Not terrible, but it performs the map search (tree walk or has=
hing) twice.</li><li><font face=3D"courier new, monospace">auto it =3D lowe=
r_bound(k)</font> followed by hinted insertion<br>Near-optimal for std::map=
, but specific to the container (i.e. does not work for unordered maps), an=
d few people remember this one. Moreover, there is no analogue for std::uno=
rdered_map that would avoid repeated hashing.</li></ul><div><span style=3D"=
line-height: 17px;">The proposed new stable-insert/always-insert interfaces=
would remedy all this. They're as efficient as possible. Implementation is=
a straight-forward modification (at least as far as I can tell from libstd=
c++). Usage is straight-forward, too, and fairly self-explanatory (modulu f=
iguring out good names for the functions). Finally, the return value adds u=
seful information in the case of always-insert which isn't available with t=
he operator[]-approach, and which is essentially free to obtain.</span></di=
v></div><div><span style=3D"line-height: 17px;"><br></span></div><div><span=
style=3D"line-height: 17px;">What sets this interface apart from the other=
standard library containers is that every other interface operates only in=
terms of the container's <font face=3D"courier new, monospace"><b>value_ty=
pe</b></font>. For this proposal, it is necessary to separate out the <font=
face=3D"courier new, monospace"><b>key_type</b></font> and the <font face=
=3D"courier new, monospace"><b>mapped_type</b></font>. Perhaps this departu=
re from the norm is why such an interface has been missing till now, and at=
the same time I would hope that this is an argument in its favour. It woul=
d essentially go under a "specialized algorithms" section.</span></div><div=
><span style=3D"line-height: 17px;"><br></span></div><div><span style=3D"li=
ne-height: 17px;">Many thanks, and I would appreaciate your feedback greatl=
y!</span></div><div><span style=3D"line-height: 17px;"><br></span></div><di=
v><span style=3D"line-height: 17px;">Thomas</span></div></div>
<p></p>
-- <br />
<br />
--- <br />
You received this message because you are subscribed to the Google Groups &=
quot;ISO C++ Standard - Future Proposals" group.<br />
To unsubscribe from this group and stop receiving emails from it, send an e=
mail to std-proposals+unsubscribe@isocpp.org.<br />
To post to this group, send email to std-proposals@isocpp.org.<br />
Visit this group at <a href=3D"http://groups.google.com/a/isocpp.org/group/=
std-proposals/">http://groups.google.com/a/isocpp.org/group/std-proposals/<=
/a>.<br />
------=_Part_830_17621246.1389130784151--
.