Topic: memlaunder, an (incomplete) proposal to loosen the


Author: amluto@gmail.com
Date: Thu, 16 May 2013 11:55:22 -0700 (PDT)

There are a handful of decent reasons to want to engage in type-punning.
The improved C99/C++11 union rules help a bit, but some cases aren't
covered.  Here's an example:

void WriteSomething(const void *data, size_t len)
{
  assert(len % 4 == 0);
  auto d = reinterpret_cast<const uint32_t *>(data);
  for (size_t i = 0; i < len/4; i++)
    Write4(d[i]);
}

void ReadSomething(void *data, size_t len)
{
  assert(len % 4 == 0);
  auto d = reinterpret_cast<uint32_t *>(data);
  for (size_t i = 0; i < len/4; i++)
    d[i] = Read4();
}

This code is non-portable (it assumes 8-bit bytes, for one thing), but it
ought to work.  This use case results in undefined behavior, though:

void func()
{
  struct A { uint16_t x, y; };
  A a{1,2};
  WriteSomething(&a, sizeof(a));
  ReadSomething(&a, sizeof(a));
  std::cout << a.x << ' ' << a.y << std::endl;
}

The actual numbers passed to Write4 are implementation-dependent, but
there's a worse problem: a.x and a.y have been accessed through a glvalue
of type uint32_t, in violation of [basic.lval].10.

This case is nasty -- the ReadSomething and WriteSomething functions don't
know the types of the stored objects that they are accessing, so all they
can safely do is to access them through char or unsigned char types, which
is awkward at best.
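
For comparison, here is a rough sketch of what the char-based fallback looks like today. It moves each 4-byte group through a local uint32_t with std::memcpy, which is well-defined for trivially copyable objects. Write4 and Read4 are not defined in this message, so the versions below are hypothetical stand-ins backed by a std::vector; the names WriteSomethingBytes and ReadSomethingBytes are likewise made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

static_assert(sizeof(std::uint32_t) == 4, "sketch assumes 8-bit bytes");

// Hypothetical stand-ins for Write4/Read4: a simple in-memory FIFO.
static std::vector<std::uint32_t> g_words;
static std::size_t g_read_pos = 0;

void Write4(std::uint32_t v) { g_words.push_back(v); }
std::uint32_t Read4() { return g_words.at(g_read_pos++); }

void WriteSomethingBytes(const void *data, std::size_t len)
{
  assert(len % 4 == 0);
  auto p = static_cast<const unsigned char *>(data);
  for (std::size_t i = 0; i < len; i += 4) {
    std::uint32_t word;
    std::memcpy(&word, p + i, sizeof word);  // byte-wise read, no uint32_t glvalue on the object
    Write4(word);
  }
}

void ReadSomethingBytes(void *data, std::size_t len)
{
  assert(len % 4 == 0);
  auto p = static_cast<unsigned char *>(data);
  for (std::size_t i = 0; i < len; i += 4) {
    std::uint32_t word = Read4();
    std::memcpy(p + i, &word, sizeof word);  // byte-wise write into the original object
  }
}
```

This avoids the aliasing problem, but every access has to go through an intermediate copy, which is the awkwardness referred to above.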

I propose a fix that should require only a small library change and (maybe)
no or only minimal core language changes.  Define a new function memlaunder:

void memlaunder(void *dest, const void *src, size_t len);

If (dest != src), then memlaunder has no effect.  Otherwise memlaunder
behaves as if it copies len bytes, starting at src, to temporary storage,
and then copies the same bytes back to memory starting at dest.
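
A minimal sketch of the byte-level behavior just described, using only the standard library. This is not an implementation of the proposal: an ordinary function like this only moves bytes, and the lifetime effects memlaunder is meant to have would have to come from the standard's wording or from the compiler. The name memlaunder_sketch is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

void memlaunder_sketch(void *dest, const void *src, std::size_t len)
{
  // As written in the text: no effect when dest != src.
  // The len == 0 check is an addition here, to avoid memcpy on an empty buffer.
  if (dest != src || len == 0)
    return;

  // "Copies len bytes, starting at src, to temporary storage..."
  std::vector<unsigned char> tmp(len);
  std::memcpy(tmp.data(), src, len);

  // "...and then copies the same bytes back to memory starting at dest."
  std::memcpy(dest, tmp.data(), len);
}
```

At the byte level the call leaves memory unchanged; the interesting part of the proposal is what the abstract machine is allowed to conclude about object lifetimes across the call.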

Now one could write:

void ReadSomething(void *data, size_t len)
{
  assert(len % 4 == 0);
  auto d = reinterpret_cast<uint32_t *>(data);

  /* The lifetime of the objects pointed to by data is now over -- the
     storage is being reused to store an array of type uint32_t -- see
     [basic.life].1-2. */

  for (size_t i = 0; i < len/4; i++)
    d[i] = Read4();

  memlaunder(data, data, len);

  /* The lifetime of the uint32_t objects ends after the first "copy" in
     memlaunder.  The lifetime of whatever type of objects data originally
     pointed to begins when the bytes are written. */
}

So long as (a) data points to trivially copyable objects and (b) whatever
bytes get written are a valid representation, this should not invoke
undefined behavior.
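
Condition (a) can at least be checked at the call site. The sketch below is a hypothetical typed wrapper (ReadInto is not part of the proposal) that refuses to compile for types that are not trivially copyable before handing the object to a type-erased reader such as ReadSomething. Condition (b) depends on the bytes that arrive at run time, so nothing here checks it.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical helper: forwards &obj and sizeof(T) to a type-erased reader,
// but only for types that satisfy condition (a).
template <typename T, typename Reader>
void ReadInto(T &obj, Reader read_bytes)
{
  static_assert(std::is_trivially_copyable<T>::value,
                "condition (a): T must be trivially copyable");
  static_assert(sizeof(T) % 4 == 0,
                "the readers in this message work in 4-byte units");
  read_bytes(static_cast<void *>(&obj), sizeof(T));
}
```

A call might look like ReadInto(a, ReadSomething) for the struct A from the earlier example.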

WriteSomething needs something fancier, and I don't know how to do that
without violating the rules that pointers that compare equal point to the
same object and that const objects shouldn't be written to or have their
lifetime end early.


Any thoughts?  memlaunder should be straightforward to implement on
existing compilers, since any uninlined function with that signature should
already have that effect.

--Andy
