Topic: memory_order_signal
Author: Giovanni Piero Deretta <gpderetta@gmail.com>
Date: Wed, 30 Sep 2015 01:46:14 -0700 (PDT)
Hi all,
As a small addition to the C++ memory model, it would be nice to have a way
to request compiler-only synchronization for all atomic<> ops, like the one
imposed by atomic_signal_fence. Either as a memory_order_signal flag to be
'or'ed to the other memory_order_* flags or a full family of
memory_order_signal_{relaxed,release,acquire,acq_rel,seq_cst}.
These new memory model flags would use the same wording as for
atomic_signal_fence:
"Effects: memory_order_signal_* is equivalent to the corresponding
memory_order_*, except that the resulting ordering constraints
are established only between a thread and a signal handler executed in the
same thread."
"Note: compiler optimizations and reorderings of loads and stores are
inhibited in the same way as with memory_order_* operations,
but the hardware fence instructions that memory_order_* would have inserted
are not emitted."
The expectation is that for x86, for example, memory_order_signal RMW
operations would map to the underlying atomic RMW *without* the lock
prefix.
My use case is synchronization between threads pinned to the same core.
-- gpd
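
For context, here is a minimal sketch of what atomic_signal_fence already provides today: ordering between a thread and a signal handler running in that same thread, with no hardware fence emitted. The names payload, ready, on_signal and publish_and_raise are hypothetical, and SIGINT is chosen only because the standard guarantees that it exists. The proposed memory_order_signal_* would presumably attach the same kind of constraint directly to the atomic operation.

```cpp
#include <atomic>
#include <csignal>

// Hypothetical names throughout; a sketch, not part of the proposal.
static int payload = 0;                  // plain data handed to the handler
static std::atomic<bool> ready{false};   // assumed lock-free on the target
static int seen_by_handler = -1;         // what the handler observed

extern "C" void on_signal(int) {
    if (ready.load(std::memory_order_relaxed)) {
        // Compiler-only acquire: keeps the payload read after the flag read.
        std::atomic_signal_fence(std::memory_order_acquire);
        seen_by_handler = payload;
    }
}

int publish_and_raise() {
    std::signal(SIGINT, on_signal);
    payload = 42;
    // Compiler-only release: keeps the payload write before the flag write.
    std::atomic_signal_fence(std::memory_order_release);
    ready.store(true, std::memory_order_relaxed);
    std::raise(SIGINT);                  // handler runs in this same thread
    std::signal(SIGINT, SIG_DFL);
    return seen_by_handler;
}
```

Calling publish_and_raise() installs the handler, publishes the payload and raises the signal synchronously, so the handler sees the flag and copies the payload before the function returns.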
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-proposals@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/std-proposals/.
.
Author: Ed Schouten <ed@nuxi.nl>
Date: Wed, 30 Sep 2015 16:17:26 +0200
Hi Giovanni,
2015-09-30 10:46 GMT+02:00 Giovanni Piero Deretta <gpderetta@gmail.com>:
> "Effects: memory_order_signal_* is equivalent to the corresponding
> memory_order_*, except that the resulting ordering constraints
> are established only between a thread and a signal handler executed in the
> same thread."
How would this be different from using memory_order_relaxed?
--
Ed Schouten <ed@nuxi.nl>
Nuxi, 's-Hertogenbosch, the Netherlands
KvK-nr.: 62051717
.
Author: Giovanni Piero Deretta <gpderetta@gmail.com>
Date: Wed, 30 Sep 2015 07:58:06 -0700 (PDT)
On Wednesday, September 30, 2015 at 3:17:29 PM UTC+1, Ed Schouten wrote:
>
> Hi Giovanni,
>
> 2015-09-30 10:46 GMT+02:00 Giovanni Piero Deretta <gpde...@gmail.com>:
> > "Effects: memory_order_signal_* is equivalent to the corresponding
> > memory_order_*, except that the resulting ordering constraints
> > are established only between a thread and a signal handler executed in
> > the same thread."
>
> How would this be different from using memory_order_relaxed?
>
memory_order_relaxed guarantees atomicity of RMW operations across threads,
even threads running on other CPUs. memory_order_signal would only guarantee
atomicity between an RMW done in a thread and one done in a signal handler
running on that thread.
Conversely, memory_order_relaxed allows the compiler almost complete freedom
to reorder accesses, while a memory_order_signal_seq_cst would not.
In fact, the "proposed wording" I wrote is not enough, as it only deals
with synchronisation; something must also be said about atomicity (probably
saying that concurrent writes, or mixed read/write accesses, where at least
one of them is memory_order_signal constitute a data race).
-- gpd
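
To make the first half of that distinction concrete, the following sketch shows the cross-thread atomicity that memory_order_relaxed already guarantees. The helper name count_with_two_threads is hypothetical; it only illustrates existing standard behaviour and says nothing about how a memory_order_signal variant would be specified.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: two threads bump one counter with relaxed RMWs.
int count_with_two_threads(int per_thread) {
    std::atomic<int> counter{0};
    auto work = [&counter, per_thread] {
        for (int i = 0; i < per_thread; ++i) {
            // Relaxed imposes no ordering on surrounding accesses, but the
            // increment itself is still indivisible across threads and CPUs.
            counter.fetch_add(1, std::memory_order_relaxed);
        }
    };
    std::thread a(work);
    std::thread b(work);
    a.join();
    b.join();
    return counter.load(std::memory_order_relaxed);
}
```

No increment is lost even when the two threads run on different cores; this is the guarantee that a signal-only variant would give up in exchange for cheaper code.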
.
Author: Andrey Semashev <andrey.semashev@gmail.com>
Date: Wed, 30 Sep 2015 18:15:39 +0300
On 30.09.2015 11:46, Giovanni Piero Deretta wrote:
> Hi all,
>
> As a small addition to the C++ memory model, it would be nice to have a
> way to request compiler-only synchronization for all atomic<> ops, like
> the one imposed by atomic_signal_fence. Either as a memory_order_signal
> flag to be 'or'ed to the other memory_order_* flags or a full family of
> memory_order_signal_{relaxed,release,acquire,acq_rel,seq_cst}.
>
> These new memory model flags would use the same wording as for
> atomic_signal_fence:
>
> "Effects: memory_order_signal_* is equivalent to the corresponding
> memory_order_*, except that the resulting ordering constraints
> are established only between a thread and a signal handler executed in
> the same thread."
> "Note: compiler optimizations and reorderings of loads and stores are
> inhibited in the same way as with memory_order_* operations,
> but the hardware fence instructions that memory_order_* would have
> inserted are not emitted."
>
>
> The expectation is that for x86, for example, memory_order_signal RMW
> operations would map to the underlying atomic RMW *without* the lock
> prefix.
>
> My use case is synchronization between threads pinned to the same core.
I think what you propose is very close to marking a regular variable
volatile. It doesn't guarantee atomicity wrt signal handlers in all
cases, but the enhanced atomic<> as you propose it won't be able to
provide any stronger guarantees. In particular, I'm thinking about the
case when atomic<> has to emulate atomicity with a lock pool. In this
case operations on the atomic value are not actually atomic and a signal
handler can be invoked in the middle of an operation.
If atomic<> is lock-free (i.e. natively supported by the hardware), then
I see no reason for the compiler to make volatile accesses to the same
type non-atomic (again, in the scope of a single thread).
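
The lock-pool concern can be checked up front. The sketch below, with the hypothetical names int_atomic_is_lock_free and int_atomic_always_lock_free, queries whether std::atomic<int> is lock-free on the current target, using only the standard is_lock_free() member and the ATOMIC_INT_LOCK_FREE macro.

```cpp
#include <atomic>

// Hypothetical helper: true when std::atomic<int> operations on this
// target never take a hidden lock.
bool int_atomic_is_lock_free() {
    std::atomic<int> probe{0};
    return probe.is_lock_free();
}

// ATOMIC_INT_LOCK_FREE is 2 when atomic<int> is always lock-free, 1 when
// it is sometimes lock-free, and 0 when it never is.
constexpr bool int_atomic_always_lock_free = (ATOMIC_INT_LOCK_FREE == 2);
```

Code meant to touch an atomic from a signal handler would likely want to assert one of these before relying on it, since a lock-based fallback can be interrupted while the lock is held.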
.
Author: Giovanni Piero Deretta <gpderetta@gmail.com>
Date: Wed, 30 Sep 2015 09:13:22 -0700 (PDT)
On Wednesday, September 30, 2015 at 4:15:43 PM UTC+1, Andrey Semashev wrote:
>
> On 30.09.2015 11:46, Giovanni Piero Deretta wrote:
> > Hi all,
> >
> > As a small addition to the C++ memory model, it would be nice to have a
> > way to request compiler-only synchronization for all atomic<> ops, like
> > the one imposed by atomic_signal_fence. Either as a memory_order_signal
> > flag to be 'or'ed to the other memory_order_* flags or a full family of
> > memory_order_signal_{relaxed,release,acquire,acq_rel,seq_cst}.
> >
> > These new memory model flags would use the same wording as for
> > atomic_signal_fence:
> >
> > "Effects: memory_order_signal_* is equivalent to the corresponding
> > memory_order_*, except that the resulting ordering constraints
> > are established only between a thread and a signal handler executed in
> > the same thread."
> > "Note: compiler optimizations and reorderings of loads and stores are
> > inhibited in the same way as with memory_order_* operations,
> > but the hardware fence instructions that memory_order_* would have
> > inserted are not emitted."
> >
> >
> > The expectation is that for x86, for example, memory_order_signal RMW
> > operations would map to the underlying atomic RMW *without* the lock
> > prefix.
> >
> > My use case is synchronization between threads pinned to the same core.
>
> I think, what you propose is very close to marking a regular variable
> volatile. It doesn't guarantee atomicity wrt signal handlers in all
> cases, but the enhanced atomic<> as you propose it won't be able to
> provide any stronger guarantees. In particular, I'm thinking about the
> case when atomic<> has to emulate atomicity with a lock pool. In this
> case operations on the atomic value are not actually atomic and a signal
> handler can be invoked in the middle of an operation.
>
> If atomic<> is lock-free (i.e. natively supported by the hardware), then
> I see no reason for the compiler to make volatile accesses to the same
> type non-atomic (again, in the scope of a single thread).
>
>
volatile loads and stores are very likely already atomic, yes, but what I
need is for compare_exchange, exchange and friends to be atomic as well, but
only with regard to signal handlers. Note that this is a weaker guarantee
than that of normal atomic accesses, and on some architectures it can be
implemented with significantly less expensive code sequences.
Of course, if the atomic is not lock-free, you can easily deadlock, but then
again that is the case even with the existing atomic memory orderings.
I could get the same functionality by writing inline asm, but I would
prefer a standard implementation. As we already have atomic_signal_fence,
the need has already been recognised.
-- gpd
.
Author: Andrey Semashev <andrey.semashev@gmail.com>
Date: Wed, 30 Sep 2015 19:38:00 +0300
On 30.09.2015 19:13, Giovanni Piero Deretta wrote:
> On Wednesday, September 30, 2015 at 4:15:43 PM UTC+1, Andrey Semashev wrote:
>
> I think, what you propose is very close to marking a regular variable
> volatile. It doesn't guarantee atomicity wrt signal handlers in all
> cases, but the enhanced atomic<> as you propose it won't be able to
> provide any stronger guarantees. In particular, I'm thinking about the
> case when atomic<> has to emulate atomicity with a lock pool. In this
> case operations on the atomic value are not actually atomic and a
> signal
> handler can be invoked in the middle of an operation.
>
> If atomic<> is lock-free (i.e. natively supported by the hardware),
> then
> I see no reason for the compiler to make volatile accesses to the same
> type non-atomic (again, in the scope of a single thread).
>
>
> volatile load and stores are very likely already atomic, yes, but what I
> need is compare_exchange, exchange and friends to be atomic as well, but
> only with regard to signal handlers. Note that this is a weaker
> guarantee than normal accesses, that, on some architectures can be
> implemented with significantly less expensive code sequences.
On some architectures there are no dedicated CAS or exchange
instructions. The atomic operations can be implemented in terms of an LL/SC
pair. If you translate LL/SC to normal loads and stores with your
proposed memory model arguments, then the operations won't be atomic wrt
a signal handler. If you don't, then you don't win anything.
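
As an illustration of the LL/SC point, portable code usually expresses such read-modify-write operations as a compare_exchange_weak retry loop, which a compiler for an LL/SC machine can map to a load-linked/store-conditional pair. The helper fetch_max_relaxed below is hypothetical and uses only existing memory orders; it is a sketch of the pattern, not of the proposed extension.

```cpp
#include <atomic>

// Hypothetical helper: atomically raise `target` to at least `candidate`
// and return the value seen before the update.
int fetch_max_relaxed(std::atomic<int>& target, int candidate) {
    int observed = target.load(std::memory_order_relaxed);
    // compare_exchange_weak may fail spuriously, which lets an LL/SC
    // implementation map it to a single load-linked/store-conditional
    // attempt; on failure `observed` is refreshed and the loop retries.
    while (observed < candidate &&
           !target.compare_exchange_weak(observed, candidate,
                                         std::memory_order_relaxed,
                                         std::memory_order_relaxed)) {
    }
    return observed;
}
```

If the store-conditional were replaced by a plain store, a signal handler running between the load and the store could have its own update silently overwritten, which is the loss of atomicity described above.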
> I could get the same functionality by writing inline asm, but I would
> prefer a standard implementation. As we already have
> atomic_signal_fence, the need has already been recognised.
atomic_signal_fence does not constitute an operation, it simply acts as
a compiler barrier. Your proposal is significantly different as it
requires the operations to be still atomic.
If you are targeting a specific architecture that has CAS and exchange
instructions then you should probably write in asm/intrinsics or exploit
additional guarantees atomic<> de-facto provides on that architecture.
Naturally, such code won't be portable to other architectures.
.
Author: Giovanni Piero Deretta <gpderetta@gmail.com>
Date: Wed, 30 Sep 2015 10:00:29 -0700 (PDT)
On Wednesday, September 30, 2015 at 5:38:05 PM UTC+1, Andrey Semashev wrote:
>
> On 30.09.2015 19:13, Giovanni Piero Deretta wrote:
> > On Wednesday, September 30, 2015 at 4:15:43 PM UTC+1, Andrey Semashev
> wrote:
> >
> > I think, what you propose is very close to marking a regular
> variable
> > volatile. It doesn't guarantee atomicity wrt signal handlers in all
> > cases, but the enhanced atomic<> as you propose it won't be able to
> > provide any stronger guarantees. In particular, I'm thinking about
> the
> > case when atomic<> has to emulate atomicity with a lock pool. In
> this
> > case operations on the atomic value are not actually atomic and a
> > signal
> > handler can be invoked in the middle of an operation.
> >
> > If atomic<> is lock-free (i.e. natively supported by the hardware),
> > then
> > I see no reason for the compiler to make volatile accesses to the
> same
> > type non-atomic (again, in the scope of a single thread).
> >
> >
> > volatile load and stores are very likely already atomic, yes, but what I
> > need is compare_exchange, exchange and friends to be atomic as well, but
> > only with regard to signal handlers. Note that this is a weaker
> > guarantee than normal accesses, that, on some architectures can be
> > implemented with significantly less expensive code sequences.
>
> On some architectures there are no dedicated CAS or exchange
> instructions. The atomic operations can be implemented in terms of LL/SC
> pair. If you translate LL/SC to normal loads and stores with your
> proposed memory model arguments then the operations won't be atomic wrt
> a signal handler. If not then you don't win anything.
>
>
Mapping memory_order_signal_* to plain memory_order_* would be a
conforming implementation although suboptimal: I would expect at the very
least that on LL/SC architectures the signal variants could be implemented
by omitting the processor memory barriers required to implement stricter
ordering than consume.
> > I could get the same functionality by writing inline asm, but I would
> > prefer a standard implementation. As we already have
> > atomic_signal_fence, the need has already been recognised.
>
> atomic_signal_fence does not constitute an operation, it simply acts as
> a compiler barrier.
I did not say it is an operation, but it has very specific semantics that
show that the designers of the C++ memory model did acknowledge the need for
more exotic synchronization scenarios (the intended use case of
atomic_signal_fence is not really synchronization with signals).
-- gpd
.
Author: Thiago Macieira <thiago@macieira.org>
Date: Wed, 30 Sep 2015 10:16:22 -0700
On Wednesday 30 September 2015 10:00:29 Giovanni Piero Deretta wrote:
> I did not say it is an operation, but it has very specific semantics that
> show that the designers of the C++ memory model did acknowledge the need of
> more exotic synchronization scenarios (the intended use case of
> atomic_signal_fence is not really synchronization with signals).
It's not? What is it for? And why does it have "signal" in the name?
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
PGP/GPG: 0x6EF45358; fingerprint:
E067 918B B660 DBD1 105C 966C 33F5 F005 6EF4 5358
.
Author: Andrey Semashev <andrey.semashev@gmail.com>
Date: Wed, 30 Sep 2015 21:16:14 +0300
On 30.09.2015 20:00, Giovanni Piero Deretta wrote:
> On Wednesday, September 30, 2015 at 5:38:05 PM UTC+1, Andrey Semashev wrote:
>
> On 30.09.2015 19:13, Giovanni Piero Deretta wrote:
> > On Wednesday, September 30, 2015 at 4:15:43 PM UTC+1, Andrey
> Semashev wrote:
> >
> > I think, what you propose is very close to marking a regular
> variable
> > volatile. It doesn't guarantee atomicity wrt signal handlers
> in all
> > cases, but the enhanced atomic<> as you propose it won't be
> able to
> > provide any stronger guarantees. In particular, I'm thinking
> about the
> > case when atomic<> has to emulate atomicity with a lock pool.
> In this
> > case operations on the atomic value are not actually atomic
> and a
> > signal
> > handler can be invoked in the middle of an operation.
> >
> > If atomic<> is lock-free (i.e. natively supported by the
> hardware),
> > then
> > I see no reason for the compiler to make volatile accesses to
> the same
> > type non-atomic (again, in the scope of a single thread).
> >
> >
> > volatile load and stores are very likely already atomic, yes, but
> what I
> > need is compare_exchange, exchange and friends to be atomic as
> well, but
> > only with regard to signal handlers. Note that this is a weaker
> > guarantee than normal accesses, that, on some architectures can be
> > implemented with significantly less expensive code sequences.
>
> On some architectures there are no dedicated CAS or exchange
> instructions. The atomic operations can be implemented in terms of
> LL/SC
> pair. If you translate LL/SC to normal loads and stores with your
> proposed memory model arguments then the operations won't be atomic wrt
> a signal handler. If not then you don't win anything.
>
>
> Mapping memory_order_signal_* to plain memory_order_* would be a
> conforming implementation although suboptimal: I would expect at the
> very least that on LL/SC architectures the signal variants could be
> implemented by omitting the processor memory barriers required to
> implement stricter ordering than consume.
Ok, so you could save fence instructions. That could be useful, I guess.
For the record you could achieve the same with memory_order_relaxed and
atomic_signal_fences around the operation.
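A minimal sketch of that suggestion, assuming a lock-free std::atomic<int>. The helper name fetch_add_signal_ordered is hypothetical and not part of any library. The relaxed RMW requests no hardware ordering, and the two atomic_signal_fence calls constrain only the compiler (and ordering with respect to a signal handler run on the same thread):

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper, for illustration only: an atomic add whose ordering
// is enforced against compiler reordering but not against other cores.
inline int fetch_add_signal_ordered(std::atomic<int>& counter, int delta) {
    // Earlier accesses may not be moved below this point by the compiler.
    std::atomic_signal_fence(std::memory_order_seq_cst);
    // The read-modify-write itself is still atomic, with relaxed ordering.
    int previous = counter.fetch_add(delta, std::memory_order_relaxed);
    // Later accesses may not be moved above this point by the compiler.
    std::atomic_signal_fence(std::memory_order_seq_cst);
    return previous;
}
```

Calling fetch_add_signal_ordered(c, 3) on a counter holding 5 returns 5 and leaves 8 in the counter. This removes only the fences a stronger memory order would add. Any barrier implied by the RMW instruction itself on a given target remains.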
> > I could get the same functionality by writing inline asm, but I
> would
> > prefer a standard implementation. As we already have
> > atomic_signal_fence, the need has already been recognised.
>
> atomic_signal_fence does not constitute an operation, it simply acts as
> a compiler barrier.
>
>
> I did not say it is an operation, but it has very specific semantics
> that show that the designers of the C++ memory model did acknowledge the
> need of more exotic synchronization scenarios (the intended use case of
> atomic_signal_fence is not really synchronization with signals).
The acknowledgement may be there but my point was that
atomic_signal_fence is very different from what you propose. The
function has a well-formed behavior regardless of any factors, like
lock-free property of atomic<>. Your proposed change does not have that
quality.
.
Author: Giovanni Piero Deretta <gpderetta@gmail.com>
Date: Thu, 1 Oct 2015 00:56:38 +0100
Raw View
On 30 Sep 2015 7:16 pm, "Andrey Semashev" <andrey.semashev@gmail.com> wrote:
>
> On 30.09.2015 20:00, Giovanni Piero Deretta wrote:
>>
>> On Wednesday, September 30, 2015 at 5:38:05 PM UTC+1, Andrey Semashev wrote:
>>
>> On 30.09.2015 19:13, Giovanni Piero Deretta wrote:
>> > On Wednesday, September 30, 2015 at 4:15:43 PM UTC+1, Andrey
>> Semashev wrote:
>> >
>> > I think, what you propose is very close to marking a regular
>> variable
>> > volatile. It doesn't guarantee atomicity wrt signal handlers
>> in all
>> > cases, but the enhanced atomic<> as you propose it won't be
>> able to
>> > provide any stronger guarantees. In particular, I'm thinking
>> about the
>> > case when atomic<> has to emulate atomicity with a lock pool.
>> In this
>> > case operations on the atomic value are not actually atomic
>> and a
>> > signal
>> > handler can be invoked in the middle of an operation.
>> >
>> > If atomic<> is lock-free (i.e. natively supported by the
>> hardware),
>> > then
>> > I see no reason for the compiler to make volatile accesses to
>> the same
>> > type non-atomic (again, in the scope of a single thread).
>> >
>> >
>> > volatile load and stores are very likely already atomic, yes, but
>> what I
>> > need is compare_exchange, exchange and friends to be atomic as
>> well, but
>> > only with regard to signal handlers. Note that this is a weaker
>> > guarantee than normal accesses, that, on some architectures can be
>> > implemented with significantly less expensive code sequences.
>>
>> On some architectures there are no dedicated CAS or exchange
>> instructions. The atomic operations can be implemented in terms of
>> LL/SC
>> pair. If you translate LL/SC to normal loads and stores with your
>> proposed memory model arguments then the operations won't be atomic wrt
>> a signal handler. If not then you don't win anything.
>>
>>
>> Mapping memory_order_signal_* to plain memory_order_* would be a
>> conforming implementation although suboptimal: I would expect at the
>> very least that on LL/SC architectures the signal variants could be
>> implemented by omitting the processor memory barriers required to
>> implement stricter ordering than consume.
>
>
> Ok, so you could save fence instructions. That could be useful, I guess.
> For the record you could achieve the same with memory_order_relaxed and
> atomic_signal_fences around the operation.
>
On some architectures all RMW operations imply a full barrier, so relaxed
doesn't help. Even on those architectures where relaxed is really unfenced,
an even lighter-weight implementation that doesn't need LL/SC is possible
with OS assistance.
Also, using explicit fences is, at least for me, always awkward.
-- gpd
.
Author: Giovanni Piero Deretta <gpderetta@gmail.com>
Date: Thu, 1 Oct 2015 01:05:20 +0100
Raw View
On 30 Sep 2015 6:16 pm, "Thiago Macieira" <thiago@macieira.org> wrote:
>
> On Wednesday 30 September 2015 10:00:29 Giovanni Piero Deretta wrote:
> > I did not say it is an operation, but it has very specific semantics that
> > show that the designers of the C++ memory model did acknowledge the need of
> > more exotic synchronization scenarios (the intended use case of
> > atomic_signal_fence is not really synchronization with signals).
>
> It's not? What is it for? And why does it have "signal" in the name?
It is really a compiler memory barrier. But in the C++ memory model there
is no notion of compiler reordering distinct from CPU-related reordering.
The signal semantics are a brilliant trick to describe the desired property
within the confines of the C++MM.
While it can genuinely be used to synchronise with signal handlers, it is
often used in asymmetric synchronisation algorithms like some variants of
RCU, lock reservation, work stealing, etc.
It is not a coincidence that the inventor of RCU has contributed to the
definition of the C++MM.
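As a rough sketch of the literal signal-handler use, assuming std::atomic<int> is lock-free on the target (all names here are hypothetical and chosen for illustration): the thread writes plain data, issues a release signal fence, then sets a relaxed flag. A handler running on the same thread reads the flag, issues an acquire signal fence, and may then read the data. No hardware fence is requested anywhere.

```cpp
#include <atomic>
#include <cassert>
#include <csignal>

// Hypothetical names, for illustration only.
int payload = 0;                 // plain data published to the handler
std::atomic<int> ready{0};       // assumed lock-free, hence usable in a handler
std::atomic<int> observed{-1};   // what the handler saw

extern "C" void on_signal(int) {
    if (ready.load(std::memory_order_relaxed)) {
        // Compiler-only acquire: the payload read may not be hoisted above
        // the flag load.
        std::atomic_signal_fence(std::memory_order_acquire);
        observed.store(payload, std::memory_order_relaxed);
    }
}

int publish_and_raise(int value) {
    std::signal(SIGINT, on_signal);
    payload = value;
    // Compiler-only release: the payload write may not sink below the
    // flag store.
    std::atomic_signal_fence(std::memory_order_release);
    ready.store(1, std::memory_order_relaxed);
    std::raise(SIGINT);              // handler runs on this same thread
    std::signal(SIGINT, SIG_DFL);    // restore the default disposition
    return observed.load(std::memory_order_relaxed);
}
```

With the handler installed as above, publish_and_raise(42) returns 42. The asymmetric algorithms mentioned above use the same fence on their fast path, but pair it with a heavier mechanism on the other side, which this sketch does not attempt to show.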
-- gpd
.