Topic: Allow .data() to be called on reserved, but empty, vectors


Author: stevemk14ebr@gmail.com
Date: Fri, 21 Jul 2017 19:40:25 -0700 (PDT)

Sometimes when using vectors in low-level embedded land it is nice to know
where the underlying buffer of a vector is before any data exists in the
vector. As it stands, the standard offers no guaranteed way to do this, even
after calling reserve().

Look at the following code:
std::vector<uint8_t> vec;
vec.reserve(32);
uint64_t bufStart = ...
Assume now that I want to use the vector to hold assembly instructions.
The encodings of various instructions are relative to their location in
memory, so in order to encode them properly one must know where they are
about to live inside the vector. A naive user might try one of the following:

0) bufStart = &vec[0]
1) bufStart = &vec.at(0)
2) bufStart = &vec.front()
3) bufStart = vec.data()

None of these is guaranteed to work:
0) Will compile and may even work properly on some compilers, but
operator[] is U.B. when the vector is empty.
1) Will compile, but at() checks the index against size() and throws
std::out_of_range at run time.
2) Will compile, but front() on an empty vector is U.B.
3) Will compile, and even worse is sometimes optimized to return 0 by gcc.
The call itself is well defined, but nothing requires the result to point
at the reserved storage.
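
Option 1 is the one case whose failure can be observed portably, because
at() compares the index against size() rather than capacity(). A minimal
sketch, assuming exceptions are enabled; the helper name
at_throws_on_reserved_empty is made up for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper: reserves n bytes, then reports whether at(0) throws.
bool at_throws_on_reserved_empty(std::size_t n) {
    std::vector<std::uint8_t> vec;
    vec.reserve(n);            // capacity() >= n, but size() is still 0
    try {
        (void)&vec.at(0);      // bounds check uses size(), so index 0 is out of range
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

Calling at_throws_on_reserved_empty(32) returns true: reserving storage does
not make any element accessible.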

Number 3 is the one I propose to change. The standard currently words it
as follows:

23.3.6.4 [vector.data]
|
|    T* data() noexcept;
|    const T* data() const noexcept;
|
|    Returns: A pointer such that [data(),data() + size()) is a valid
|    range. *For a non-empty vector, data() == &front()*.
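
To make the current guarantee concrete, here is a small sketch of the only
equality this wording promises. The helper name data_matches_front is
hypothetical, and the check deliberately refuses to touch front() on an
empty vector:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: checks the one equality [vector.data] promises.
bool data_matches_front(const std::vector<std::uint8_t>& v) {
    // Only meaningful for non-empty vectors; front() on an empty one is undefined.
    return !v.empty() && v.data() == &v.front();
}
```

For a reserved but empty vector the helper returns false without inspecting
the pointer; after a single push_back it returns true.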

Proposed wording:

23.3.6.4 [vector.data]
|
|    T* data() noexcept;
|    const T* data() const noexcept;
|
|    Returns: A pointer such that [data(),data() + size()) is a valid
|    range. For a vector of capacity() > 0, data() must point to the
|    underlying buffer.

This will allow people to grab a pointer to the underlying buffer of a
vector if and only if they have reserved space first. I suspect many people
incorrectly expect one of the methods listed above to work as intended.
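
For comparison, one way to stay within the current wording is to give the
vector a nonzero size() first (for example by constructing it with a size or
calling resize()), so that data() == &front() applies. The sketch below is
only an illustration under stated assumptions: an x86 target, a little-endian
host, and a hypothetical helper encode_jmp_rel32 that is not part of any
library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: writes an x86 `jmp rel32` (opcode 0xE9) at `offset`
// inside `buf`, jumping to the absolute address `target`.
// Precondition: offset + 5 <= buf.size().
void encode_jmp_rel32(std::vector<std::uint8_t>& buf, std::size_t offset,
                      std::uint64_t target) {
    // buf is non-empty here, so data() is guaranteed to equal &front().
    const std::uint64_t instr =
        reinterpret_cast<std::uintptr_t>(buf.data()) + offset;
    // The displacement is measured from the end of the 5-byte instruction.
    const std::int64_t diff = static_cast<std::int64_t>(target - (instr + 5));
    const std::int32_t rel = static_cast<std::int32_t>(diff);
    assert(diff == rel);  // the target must be reachable with a 32-bit displacement
    buf[offset] = 0xE9;
    std::memcpy(&buf[offset + 1], &rel, sizeof rel);  // little-endian host assumed
}
```

The cost of this workaround is that size() no longer tracks how many bytes
have been emitted, and the elements are value-initialized up front, which is
the overhead the proposed wording would avoid.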
