Thread

Topic: Guarantee of side-effect free assignment

Author: dave@boost-consulting.com (David Abrahams)
Date: Mon, 15 Oct 2007 18:17:25 GMT

on Sun Oct 07 2007, alfps-AT-start.no ("Alf P. Steinbach") wrote:

> * James Dennett:
>> Alf P. Steinbach wrote:
>>> I can find no such guarantee in the standard.  It
>>> seems the compiler is free to rewrite
>>>
>>>   p = new S();
>>>
>>> as
>>>
>>>   p = operator new( sizeof( S ) );
>>>   new( p ) S();
>>
>> What would grant the compiler freedom to deviate in such an
>> observable way from the semantics of the abstract machine (in
>> which, I hope it is clear, the rhs of an assignment is evaluated
>> before its result -- if there is one -- is assumed to be known).
>
> That rule is not only not clear, it seems to be non-existent.
>
> Nobody has so far been able to come up with chapter and verse.

And nobody will.  It's like expecting to find specific language that
says this function does not return zero:

     int f()
     {
         return 1;
     }

The standard says what the language must do, and there is no need to
mark any specific deviation from that behavior as illegal.  If there
were, the standard would be infinitely large.

I hate the term "rewrite" to describe the kinds of optimizations being
considered here.  "Rewrites" are only allowed inasmuch as they are not
observable from the point of view of the language specification, which
essentially means the language has to do what the standard says, but
don't look at the generated assembly language and expect to see it
doing that in the way you expect.  Anyone who says something of the
form "the compiler is allowed to take this C++ code over here and
rewrite it as that C++ code over there" is lying unless the two pieces
of code actually are specified to have precisely identical effects.

--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: "Alf P. Steinbach" <alfps@start.no>
Date: Tue, 9 Oct 2007 21:26:06 CST

* Jiri Palecek:
>
> I think one of the problems with your solution with a randomly throwing

Non-provably not-throwing, not randomly throwing. ;-)


> constructor could be a compiler can still rewrite it like this:
>
>    Singleton* Singleton::instance()
>    {
>        if (pInstance == 0)
>        {
>            Lock lock;
>            if (pInstance == 0)
>            {
>              // we know pInstance is 0 here
>                pInstance =                      // Step 3
>                operator new(sizeof(Singleton)); // Step 1
>              try {
>                new (pInstance) Singleton;       // Step 2
>              } catch(...) {
>                operator delete(pInstance);
>                pInstance=0;
>                throw;
>              }
>            }
>        }
>        return pInstance;
>    }
>
> And this transformation is legal (at least I think) as long as
> Singleton::Singleton() (even in case of an exception) doesn't call
> Singleton::instance() (and the allocation and deallocation function
> likewise). You could only see the pointer to the non-constructed from an
> async signal (but that's OK, because even the pointer is not sig_atomic_t,
> so can be anything in an async signal) or from a different thread, but the
> standard doesn't say anything about threads.

If pInstance is a non-local static it can be inspected from within the
Singleton constructor.  The rewrite will then have changed the effect of
a normal single-threaded program.  So at least in that case, it's not
allowed (assuming Greg's logic holds, which I think it does).

Which means that the conclusion that the assignment has to happen after
the constructor, is incompatible with the Meyers/Alexandrescu conclusion
that "there is no way to express this constraint in C or C++".
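
A minimal sketch of the single-threaded observation described above, assuming a global pointer p and a type S as in the original example. The flag ctor_saw_null and the function assignment_followed_construction are hypothetical names introduced only for illustration. The constructor inspects p while it runs, so an assignment performed before construction would be visible to an ordinary program. Under the sequencing rules adopted from C++11 onward the function should return true; whether the C++03 wording guarantees it is exactly what this thread disputes.

```cpp
#include <cassert>

struct S;
S* p = nullptr;               // non-local pointer, visible to the constructor
bool ctor_saw_null = false;   // hypothetical flag: what the constructor observed

struct S {
    // Inspect the global pointer while construction is still in progress.
    S() { ctor_saw_null = (p == nullptr); }
};

// Returns true if the constructor ran while p still held its old (null) value.
bool assignment_followed_construction() {
    p = nullptr;
    ctor_saw_null = false;
    p = new S();              // the statement under discussion
    assert(p != nullptr);
    bool result = ctor_saw_null;
    delete p;
    p = nullptr;
    return result;
}
```

If a compiler stored the result of operator new into p before running the constructor, ctor_saw_null would be false, which is the observable difference Alf refers to.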


Cheers,

- Alf

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?


Author: James Kanze <james.kanze@gmail.com>
Date: Wed, 10 Oct 2007 10:05:16 CST

On Oct 9, 4:55 pm, jdenn...@acm.org (James Dennett) wrote:
> James Kanze wrote:

    [...]
> >>> The construction of the object is a side effect.

> >> Can you justify that claim?

> > What else can it be?

> Part of the evaluation of the expression, used to determine
> the result of the expression (as indeed it does).

That's not the usual definition.  An expression has a value and
side effects.  The value is what it returns (a pointer, in the
case of a new expression); the side effects are any changes in
the global program state (writes to memory, etc.).  Thus, in an
expression like "i = 42", the value is 42 (converted to the type
of i, if necessary); the write to i itself is a side effect.
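
The value/side-effect distinction for "i = 42" can be sketched as follows. Result and assign_and_report are hypothetical names used only for this illustration; the sketch shows the two results once the full expression has completed and says nothing about their relative order within it.

```cpp
#include <cassert>

// Hypothetical holder for the two things the expression produces.
struct Result {
    int value;    // what the assignment expression yielded
    int stored;   // what ended up in i (the side effect)
};

Result assign_and_report() {
    int i = 0;
    int v = (i = 42);   // v is initialized from the expression's value;
                        // the write to i is the expression's side effect
    return Result{v, i};
}
```

Both members are 42 after the statement, since the end of the full expression is a sequence point.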

> > Unless it's trivial (which doesn't really
> > concern us here), it writes to memory, etc.  Those are side
> > effects; the "value" of an expression has no side effects.

> >> Alf's example illustrates that calling the constructor is
> >> needed in order to know whether the expression has a value.
> >> The value can't be assigned from if it does not exist.

> > The "value" of a new expression is the pointer returned from the
> > allocator function.

> No; it is the address of a newly created object, according
> to the standard.  If a constructor throws, there is no such
> object, and the new expression does not have a value.

That's an argument I can follow.  It would be nice if the
standard actually said so, of course.  But I can't find it.
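
A small sketch of the case Dennett describes, assuming a hypothetical type Thrower whose constructor always throws. When the constructor exits by exception the new-expression yields no value, so on this reading there is nothing to assign and p keeps its previous value. That is the behaviour specified from C++11 onward; the sketch does not settle what the C++03 text requires.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical type whose construction always fails.
struct Thrower {
    Thrower() { throw std::runtime_error("construction failed"); }
};

// Returns true if the exception propagated and p was left untouched.
bool pointer_unchanged_after_throw() {
    Thrower* p = nullptr;
    bool caught = false;
    try {
        p = new Thrower();   // storage is released automatically on throw
    } catch (const std::runtime_error&) {
        caught = true;
    }
    return caught && p == nullptr;
}
```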

> > The compiler needs to know this in order to
> > call the constructor.

> > It is in every way like the expression ++i.

> I've talked about the differences.  Claiming that they don't
> exist doesn't make it so.

There are certainly differences, but not from the standard point
of view.  You've talked about the differences we all know exist,
but you've not presented anything in the standard which
recognizes them.

> > The modification of
> > i is a side effect; the value of the expression is the value
> > which will be written, and is available before the side effect
> > takes place (and must be available, for the side effect to take
> > place).

> >> The
> >> call to the constructor is *not* a side-effect of evaluating
> >> the expression; it's an inherent part of determining the value
> >> of that expression.

> > How can that be?  A constructor doesn't return a value.

> It doesn't need to return a value to affect the result of
> evaluating an expression.  There are plenty of ways in which
> even a void-returning function can affect the value of an
> expression.

Of the expression in which they are called?  Not without
additional sequence points.

> >     [...]
> > You don't need the "as if" rule.  The standard explicitly states
> > that "side effects" can take place in any order, not necessarily
> > the order in which the sub-expressions which cause them are
> > evaluated.  And that applies to the abstract machine; the "as
> > if" rule is not necessary.

> Such an extended interpretation of the freedom to rearrange
> code would be most problematic; I can see why people are concerned,
> if they think implementors would really do such things.

I actually think that it is problematic.  In more cases than
just this.  But the committee doesn't seem to share my concerns.

(I'd like to see all freedom to reorder removed from the
abstract machine.)

> > The real question, of course, is whether calling the constructor
> > is a side effect.  To be frank, I don't really see how it can be
> > considered anything else, given the usual meaning of side
> > effect.  Could you elaborate why it isn't a "side effect".

> I believe I've tried to do so: it is *impossible* to determine
> the result of a new expression while ignorant of the body of
> a constructor which is used by that new expression.

The result of a new expression is a pointer.  You don't need the
constructor to get a valid pointer.  (You can't do much with the
pointer until the constructor has run, but it is a valid
pointer.)
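
The point that the pointer exists before the constructor runs can be sketched by splitting the new-expression by hand into an allocation call and a placement new. The type S with a self member and the function pointer_known_before_construction are assumptions made for the illustration, not anything from the standard.

```cpp
#include <cassert>
#include <new>

// Hypothetical type that records the address it was constructed at.
struct S {
    const void* self;
    S() : self(this) {}
};

// Returns true if the address handed to the constructor is the one
// operator new returned before any construction took place.
bool pointer_known_before_construction() {
    void* raw = operator new(sizeof(S));   // step 1: storage, address now known
    S* s = new (raw) S;                    // step 2: construct in that storage
    bool same = (static_cast<void*>(s) == raw) && (s->self == raw);
    s->~S();                               // undo step 2
    operator delete(raw);                  // undo step 1
    return same;
}
```

This only shows that the address is available early; it does not decide whether that address counts as the "value" of the new-expression before construction has finished.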

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34



Author: James Kanze <james.kanze@gmail.com>
Date: Wed, 10 Oct 2007 10:04:49 CST

On Oct 9, 4:53 pm, jdenn...@acm.org (James Dennett) wrote:
> James Kanze wrote:
> > On Oct 7, 9:36 pm, James Dennett <jdenn...@acm.org> wrote:
> >> Alf P. Steinbach wrote:
> >>> * James Dennett:
> >>>> Alf P. Steinbach wrote:
> >>>>> I can find no such guarantee in the standard.  It
> >>>>> seems the compiler is free to rewrite

> >>>>>   p = new S();

> >>>>> as

> >>>>>   p = operator new( sizeof( S ) );
> >>>>>   new( p ) S();

> >>>> What would grant the compiler freedom to deviate in such an
> >>>> observable way from the semantics of the abstract machine (in
> >>>> which, I hope it is clear, the rhs of an assignment is evaluated
> >>>> before its result -- if there is one -- is assumed to be known).

> >>> That rule is not only not clear, it seems to be non-existent.

> >> It seems to follow simple logic.  The value of an expression
> >> is determined by evaluating that expression.

> > The same simple logic says that in "j = ++i;", the ++i must
> > precede the assignment.  The standard quite clearly says that
> > this is *not* the case; that an expression has both a value and
> > side effects, and that the two are more or less independent.

> The standard explicitly says that inspecting i to see if it
> has been incremented (without an intervening sequence point)
> is undefined.  It is *that* which grants it latitude to
> deviate from the order which *is* specified.

Even if i and j have types "int volatile", and are observable
from the outside?  (The "traditional" interpretation of the C
standard has been that you shouldn't make more than one volatile
access per statement, since the ordering of the accesses within
the statement is not specified.)

It may not be what one wants to hear, but I think that the
standard is quite clear: "Except where noted, the order of
evaluation of operands of individual operators and
subexpressions of individual expressions, AND THE ORDER IN WHICH
SIDE EFFECTS TAKE PLACE, is unspecified."  Is there any other
reasonable interpretation in which side effects may take place
after the value of the subexpression has been used?

> > "At certain specified points in the execution sequence called sequence
> > points, all side effects of previous evaluations shall be complete and
> > no side effects of subsequent evaluations shall have taken place."
> > [intro.execution/7]

> > we can conclude that - at the point when the new-expression makes its
> > function call - the assignment to p must a) either be over or b) must
> > have not yet begun. Well, since the value assigned to "p" is dependent
> > on the value returned by the new-expression, the only possibility is
> > that the assignment to "p" must fall in the evaluations "not-yet-
> > started" category. Therefore we are assured that when the function
> > call is made, p will still have its last-assigned value (the null

> >> In this case, there are sequence points as part of the
> >> evaluation of the new expression (in particular, at the start
> >> and end of the call to operator new, and at the start and end
> >> of the call to a constructor).

> > The call to the allocator function definitely introduces a
> > sequence point.  The call to the allocator function must also
> > precede the assignment, since the compiler can have no way of
> > knowing what the value of the expression is until this call has
> > occurred.  The "call" to the "constructor" of a built-in type
> > (e.g. "new int(42)") does NOT introduce a sequence point, and at
> > any rate, the standard does not clearly say that the call to the
> > constructor is part of the "value" of the expression; it would
> > seem logically to be a side effect, so its sequence points
> > aren't relevant.

> That's a separate case.

Not really.  It's only a special case because we know that the
initialization cannot raise an exception.

> The specified semantics still require
> that what is returned by new int(42) is a pointer to the newly
> created object (an int with value 42), but there is no way for
> conforming (single-threaded) code to see whether the assignment
> of the result in p = new int(42) occurs before or after the
> initialization of the int.  (That changes when C++ adds support
> for multi-threading, if p is accessible from other threads.)

Your argumentation continues to be based on the idea that the
reordering only takes place because of the as-if rule.  The
standard, however, specifically says that it may take place,
i.e. that the abstract machine may reorder as well.  (Otherwise,
the statement about the order in which side effects take place
would be irrelevant.)

> The example using a constructor still demonstrates that -- in
> principle -- initialization is an inherent part of evaluating
> a new-expression, not a mere side-effect.

What does "inherent part" mean?  I'd certainly say that the
modification of the left hand side in an assignment is an
"inherent part" of the expression, but it's still a side effect.
It may only be a note (and thus non-normative), but I think that
the first paragraph of section 5 makes the intent very clear: "An
expression can result in a value and can cause side effects."
Any modification of the global state of the system is a side
effect, at least according to the definitions I'm aware of.

> >> So the issue is whether the assignment side-effect can be
> >> moved to before those sequence points.

> >> We agree that such a move is observable.  I claim that it
> >> violates the notion of evaluation of an expression (a notion
> >> so fundamental that it's not specified by 14882, as the
> >> relevant aspects of it are common to an entire field).

> > And how does this not apply to the case of "j = ++ i;"

> Because of explicit latitude granted by the standard to
> implementations in this case, making it impossible for
> conforming code to tell.  Abstractly the increment happens
> first; however, the standard mandates that it's undetectable
> if the implementation chooses to defer the actual increment.

That statement is neither supported by the actual words in the
standard ("the order in which side effects take place is
unspecified"), nor by any of the traditional interpretations of
the C standard.  If i and j have type "int volatile", the order
is observable.

> > Whether
> > the modification of i occurs before or after the assignment to j
> > is potentially observable, e.g. through an asynchronous signal.

> Standard C++ doesn't support asynchronous signals.

Standard C certainly does; it doesn't provide a standard means
of generating them, and it doesn't require an implementation to
ever generate them, but it definitely recognizes their
existence, e.g. (§5.1.2.3/4): "When the processing of the
abstract machine is interrupted by receipt of a signal, only the
values of objects as of the previous sequence point may be
relied on. Objects that may be modified between the previous
sequence point and the next sequence point need not have
received their correct values yet."

I think standard C++ more or less includes this by reference.
The intent is definitely that "volatile" have the same meaning
as in C.
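
For reference, a hedged sketch of the one kind of object a signal handler can portably write: a volatile std::sig_atomic_t. The names signal_seen, on_signal and handler_sets_flag are hypothetical. The sketch uses std::raise, which delivers the signal synchronously, so it does not demonstrate the asynchronous interruption between sequence points discussed above; it only shows the mechanism that discussion relies on.

```cpp
#include <cassert>
#include <csignal>

// The only kind of object a handler may portably modify.
volatile std::sig_atomic_t signal_seen = 0;

// Hypothetical handler: records that the signal arrived.
extern "C" void on_signal(int) { signal_seen = 1; }

// Installs the handler, raises SIGINT in this thread, and reports
// whether the handler's write is visible afterwards.
bool handler_sets_flag() {
    signal_seen = 0;
    std::signal(SIGINT, on_signal);
    std::raise(SIGINT);             // returns only after the handler has run
    std::signal(SIGINT, SIG_DFL);   // restore the default disposition
    return signal_seen == 1;
}
```

An ordinary pointer such as p or pInstance has no such guarantee when inspected from a handler, which is why the signal case does not by itself constrain the ordering of the assignment.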

> > An expression can result in a value, and can cause side effects.
> > As far as I can tell, all that is required (of the abstract
> > machine) is that those side effects occur before the next
> > sequence point.  That is certainly the traditional
> > interpretation.

> That's a C++-specific definition of "side-effect".

It's actually more or less the definition from C.

> Which begs the question: is construction a mere "side effect"?

Which is really the issue, isn't it?  The usual definition of a
side effect is something which modifies the program state.  A
constructor certainly does that.  So does the allocator
function, for that matter.  The difference is that the allocator
function (and the standard speaks of it as being a function)
must be called before the value of the expression can be
determined, and calling the function introduces the necessary
sequence points so that any change in global state made within
the function must take place before the return from the
function.  If there were any way for the compiler to know what
the value of the new expression was before the allocator
function was called, it could also do that assignment before
calling the allocator function.  Since the result of the new
expression *is* the return value of the allocator function
converted to the target type, however, the allocator function
must be called before the value is used.  (I don't think the
wording in the original C or C++ standards even guaranteed this.
But basic causation does.)

I'll have a closer look at the wording which will replace this
in the next version of the standard.

> > The question is, of course, whether the call to the
> > constructor is a side effect, or whether it is a necessary
> > part of evaluating the value.  The most intuitive
> > interpretation would be that it is a side effect.

> I find that interpretation deeply counterintuitive, and indeed
> a violation of reason: the value does not even exist if the
> constructor throws.  Knowing the value is impossible without
> knowing whether the constructor throws: therefore, execution
> of the constructor is essential to evaluating the expression.

The "value" of a new expression is simply a pointer.  It must
"exist" for the compiler to call the constructor.

    [...]
> > Again, I refer to the expression "j = ++ i".  If i and j are
> > initially 0, the above analysis would mean that an asynchronous
> > signal could see: i==0 && j==0, i==1 && j==0 or i==1 && j==1,
> > but never i==0 && j==1.  This is contrary to the traditional
> > interpretation of what is allowed in C.

> But not to anything written in the standard, I think.

See above.  There are at least two relevant statements.  The
first, in the definition of expressions, is basically identical
in both standards: "The order [...] of side effects is
unspecified".  The second, at least in the C standard, makes it
clear that even volatile doesn't affect this.  The reordering is
definitely legal.

> >     [...]
> >> In other words, evaluation of expressions in C++ has two
> >> aspects: (1) determining the value of that expression, and
> >> (2) side-effects (which may modify state, or otherwise have
> >> consequences).  The side-effects can be reordered so long
> >> as the semantics of the abstract machine (i.e., the specification
> >> of what things mean in C++) is not violated.  Reordering an
> >> assignment to before the value to assign is evaluated is a
> >> violation of these semantics.

> > I think the above is somewhat of a misstatement.

> We disagree, of course.

> >     5/4 says
> > quite clearly: "Except where noted, the order of evaluation of
> > operands of individual operators and subexpressions of
> > individual expressions, AND THE ORDER IN WHICH SIDE-EFFECTS TAKE
> > PLACE, is unspecified."

> Indeed, and that's the backdrop for what I've been trying to
> explain.  The "Except where noted" is key: the definition of
> assignment is, to me, quite explicit in noting the order of
> operations.  Evidently it wasn't clear enough though.

I don't think it says anything about the actual order.  At
least, no more than is said for ++.  It says what a new
expression does, i.e. it defines 1) the value of the expression
(type T*, etc.), and 2) the side effects of the expression
(allocator called, memory initialized).  It doesn't say anywhere
that those side effects must be complete before the value is
considered available, nor even that the side effects don't obey
the usual rule that allows reordering.

An interesting question:

    int *volatile p = NULL ;
    int volatile  i = 42 ;
    int volatile  j ;
    p = new int( j = i ) ;

Is any order imposed for the writes to p and j?

> > This refers to the abstract machine:
> > side effects are not required to take place in the same order
> > the corresponding sub-expressions are evaluated, except where
> > noted.  What Alf and I can't find is where this is noted for the
> > side effect of calling a constructor (or executing the
> > initialization of a built-in type) in a new expression.

> Probably an oversight -- one of many things that seemed so
> obvious as to not need explicit text, but that experience
> has shown to benefit from more clarity.

Agreed.  It may have been clear in the minds of the authors, but
in a standard, every i must be dotted.

> >     [...]
> >> Looking over a current draft would be useful then, to check
> >> that the changes more clearly express what we want.

> > I'll do so.  It's possible that the issue has been addressed;
> > it's (vaguely) related to the problem of double checked locking.
> > (Except that if it is addressed in that context, the tendency
> > would be to say that it isn't guaranteed.)

> In the presence of threads, much more becomes observable,
> and the issue is hugely more complicated.  Hopefully the work
> done on the C++0x memory model will be sufficient for the MT
> case, and hence clearly sufficient for the single threaded
> situation.

Except that in the multi-threaded model, it is explicit that
other threads may see writes in a different order unless some
synchronization primitives intervene.  In other words, even if
the compiler generates the call to the constructor (and the
writes it contains) before the write to the pointer in the
assignment, there is no guarantee that another thread will see
them in that order.  (This is a well known problem, see the
issues surrounding double checked locking.  And why it doesn't
work, even in Java, where the order is rigorously guaranteed.)
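
For comparison, the synchronization primitives that C++11 later standardized (std::atomic and std::mutex, neither of which existed when this thread was written) allow the constraint to be stated explicitly. The following is a sketch of double-checked locking under those assumptions; the member names pInstance_, mutex_ and count_ and the constructions() helper are hypothetical, and the instance is deliberately never destroyed.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

class Singleton {
public:
    static Singleton* instance();
    // Hypothetical helper for the illustration: number of constructor runs.
    static int constructions() { return count_; }
private:
    Singleton() { ++count_; }   // only ever runs with mutex_ held
    static std::atomic<Singleton*> pInstance_;
    static std::mutex mutex_;
    static int count_;
};

std::atomic<Singleton*> Singleton::pInstance_{nullptr};
std::mutex Singleton::mutex_;
int Singleton::count_ = 0;

Singleton* Singleton::instance() {
    // Acquire load: a non-null result also makes the constructor's
    // writes visible to this thread.
    Singleton* tmp = pInstance_.load(std::memory_order_acquire);
    if (tmp == nullptr) {
        std::lock_guard<std::mutex> lock(mutex_);
        tmp = pInstance_.load(std::memory_order_relaxed);
        if (tmp == nullptr) {
            tmp = new Singleton;                               // steps 1 and 2
            pInstance_.store(tmp, std::memory_order_release);  // step 3
        }
    }
    return tmp;
}
```

The release store paired with the acquire load is what orders construction before publication for other threads. A plain pointer assignment gives no such cross-thread guarantee regardless of the order in which the compiler emits the writes.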

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


Author: James Kanze <james.kanze@gmail.com>
Date: Wed, 10 Oct 2007 10:04:39 CST

On Oct 8, 9:26 pm, "Alf P. Steinbach" <al...@start.no> wrote:
> * Greg Herlihy:

> > The program reaches plenty of sequence points between the evaluation
> > of the new-expression and the completion of the assignment operation.
> > As the Standard itself points out, a new expression makes a function
> > call (actually, several of them). And a C++ program arrives at a
> > sequence point whenever a called  function is entered - and arrives at
> > another upon exit. So, based on the Standard's definition of a
> > sequence point:

> > "At certain specified points in the execution sequence called sequence
> > points, all side effects of previous evaluations shall be complete and
> > no side effects of subsequent evaluations shall have taken place."
> > [intro.execution/7]

> > we can conclude that - at the point when the new-expression makes its
> > function call - the assignment to p must a) either be over or b) must
> > have not yet begun. Well, since the value assigned to "p" is dependent
> > on the value returned by the new-expression, the only possibility is
> > that the assignment to "p" must fall in the evaluations "not-yet-
> > started" category. Therefore we are assured that when the function
> > call is made, p will still have its last-assigned value (the null
> > pointer constant).

> Here the dependency "on the value returned by the new-expression" is, as
> I understand it, really a dependency on the possible
> not-returning-a-value by a throwing constructor, and otherwise the
> dependency is only on the allocator function call (permitting the
> reordering, your option (a) for that function call).

I think that there is really only one issue: is calling the
constructor a side effect, or is it necessary to determine the
"value" of the expression.  While my gut feeling is that this is
very much like something like ++i, where the "value" is
independent of the write, some argument could be made that the
"value" is a pointer to a fully constructed object, and so
doesn't exist until the constructor has finished.  There's also
James Dennett's arguments, that (roughly speaking), the standard
doesn't really mean what it says when it says that side effects
can be re-ordered; that this is really a consequence of the fact
that no legal C++ code can detect the reordering, *except* here,
and so the freedom to re-order---a consequence of the "as-if"
rule---doesn't apply here.

> Assuming the above argument holds, then, the Meyers/Alexandrescu
> assumption[1] also holds, that the reordering shown below is permitted
> when the compiler can prove that the constructor doesn't throw.

>    Singleton* Singleton::instance()
>    {
>        if (pInstance == 0)
>        {
>            Lock lock;
>            if (pInstance == 0)
>            {
>                pInstance =                      // Step 3
>                operator new(sizeof(Singleton)); // Step 1
>                new (pInstance) Singleton;       // Step 2
>            }
>        }
>        return pInstance;
>    }

>    <quote>
>    there are conditions under which this transformation is legitimate.
>    Perhaps the simplest such condition is when a compiler can prove that
>    the Singleton constructor cannot throw (e.g., via post-inlining flow
>    analysis), but that is not the only condition. Some constructors that
>    throw can also have their instructions reordered such that this
>    problem arises.
>    </quote>

I'd be very cautious about bringing the Meyers/Alexandrescu
discussion in here.  They were interested solely in threading
issues.

> I'm now quoting in full what the article actually says because earlier
> in the thread I erred by paraphrasing the above quote, saying that it
> stated that the rewrite can "only" occur when the constructor is
> provably non-throwing, which is less permissive than the actual text.

What they're interested in, in that article, is the order of
operations as seen from another thread.  Even if the compiler
doesn't reorder, another thread is not guaranteed to see the
writes in the same order as they occur in the thread making
them.  That's a whole different can of worms, which the
standards committee is currently addressing.

> So that seems to leave an interesting possibility of safe
> double-checked locking pattern using a constructor that can't
> be proven by the compiler to not throw  --  which is easy
> enough to arrange, e.g. by dependency on dynamic data, e.g.
> checking a global initialized from a main() argument.

>    S::S()
>    {
>        // Some initialization here, then:
>        if( strcmp( ::dynData, ::someUuid ) == 0 ){ throw "never"; }
>    }

> Yet, the Meyers/Alexandrescu article states categorically that

>    <quote>
>    DCLP will work only if steps 1 and 2  are completed before step 3 is
>    performed, but there is no way to express this constraint in C or
>    C++.
>    </quote>

Which really only applies to multithreaded code, sort of.  The
point is that the analysis the compiler does to check whether
the as-if rule applies only concerns the single thread, and---if
the compiler is Posix compliant as well as C++ compliant---the
threading primitives specified in the Posix standard.

The other point, of course, is that if the compiler is to
guarantee the order writes will be seen by other threads, it
must issue additional instructions: membar or fence or whatever.
Which slow execution down considerably---to my knowledge, none
do (even when volatile is used, which means that volatile
doesn't give the expected guarantees in a multi-threaded
environment either).

Anyway, I don't think their arguments are relevant here.

> "there is no way to express this constraint"  --  i.e. not even the
> S::S() constructor shown above is safe from willy-nilly assignment of
> result pointer before the constructor body's execution has finished.

> And the acknowledgments for the article include quite a few well-known
> people as reviewers, presumably catching a fundamental mistake like
> that, if it was a mistake: "Pre-publication drafts of this article were
> reviewed by Doug Lea, Kevlin Henney, Doug Schmidt, Chuck Allison, Petru
> Marginean, Hendrik Schober, David Brownell, Arch Robison, Bruce Leasure,
> and James Kanze. Their comments, insights, and explanations greatly
> improved the presentation of the paper and led us to our current
> understanding of DCLP, multithreading, instruction ordering, and
> compiler optimizations. After publication, we incorporated comments by
> Fedor Pikus, Al Stevens, Herb Sutter, and John Hicken."

It depends:-).  The article concerned a particular idiom for use
in multithreaded code.  Scott asked me to review it because of
the multithreaded issues (particularly the problems related to
processor reordering of reads and writes), and that's what I
concentrated on.  I imagine that this is true for most of the
other reviewers as well.

In particular, there was no real consideration (on my part, at
least) as to what might happen if the constructor threw an
exception.  The rewrite discussed above concerned the
application of the "as-if" rule by a perfectly clairvoyant
compiler, and applies regardless of the answer to this thread.

> So, I'm at a loss, since I now find your argument, that the
> assignment to result pointer has to happen after the
> constructor body's execution if the constructor can throw,
> quite convincing: I started writing a rebuttal but had to
> delete and write this instead. ;-)

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


Author: Yechezkel Mett <ymett.on.usenet@gmail.com>
Date: Wed, 10 Oct 2007 10:03:33 CST

On Oct 8, 9:26 pm, "Alf P. Steinbach" <al...@start.no> wrote:
> * Greg Herlihy:
> > The program reaches plenty of sequence points between the evaluation
> > of the new-expression and the completion of the assignment operation.
> > As the Standard itself points out, a new expression makes a function
> > call (actually, several of them). And a C++ program arrives at a
> > sequence point whenever a called  function is entered - and arrives at
> > another upon exit. So, based on the Standard's definition of a
> > sequence point:
>
> > "At certain specified points in the execution sequence called sequence
> > points, all side effects of previous evaluations shall be complete and
> > no side effects of subsequent evaluations shall have taken place."
> > [intro.execution/7]
>
> > we can conclude that - at the point when the new-expression makes its
> > function call - the assignment to p must a) either be over or b) must
> > have not yet begun. Well, since the value assigned to "p" is dependent
> > on the value returned by the new-expression, the only possibility is
> > that the assignment to "p" must fall in the evaluations "not-yet-
> > started" category. Therefore we are assured that when the function
> > call is made, p will still have its last-assigned value (the null
> > pointer constant).
>
> Here the dependency "on the value returned by the new-expression" is, as
> I understand it, really a dependency on the possible
> not-returning-a-value by a throwing constructor, and otherwise the
> dependency is only on the allocator function call (permitting the
> reordering, your option (a) for that function call).

I'm not sure what you mean by "not-returning-a-value by a throwing
constructor"; constructors don't return values. I'll assume you mean
simply not returning.

The crux of the problem seems to be whether there really is any
dependency between returning the value and calling the constructor.
James Kanze claims (elsewhere in this thread) that calling the
constructor is a side effect by the standard's definition and that
there is no such dependency. His argument is not entirely
unconvincing.

>
> Assuming the above argument holds, then, the Meyers/Alexandrescu
> assumption[1] also holds, that the reordering shown below is permitted
> when the compiler can prove that the constructor doesn't throw.
>
>    Singleton* Singleton::instance()
>    {
>        if (pInstance == 0)
>        {
>            Lock lock;
>            if (pInstance == 0)
>            {
>                pInstance =                      // Step 3
>                operator new(sizeof(Singleton)); // Step 1
>                new (pInstance) Singleton;       // Step 2
>            }
>        }
>        return pInstance;
>    }
>
>    <quote>
>    there are conditions under which this transformation is legitimate.
>    Perhaps the simplest such condition is when a compiler can prove that
>    the Singleton constructor cannot throw (e.g., via post-inlining flow
>    analysis), but that is not the only condition. Some constructors that
>    throw can also have their instructions reordered such that this
>    problem arises.
>    </quote>
>
> I'm now quoting in full what the article actually says because earlier
> in the thread I erred by paraphrasing the above quote, saying that it
> stated that the rewrite can "only" occur when the constructor is
> provably non-throwing, which is less permissive than the actual text.
>
> So that seems to leave an interesting possibility of safe double-checked
> locking pattern using a constructor that can't be proven by the compiler
> to not throw  --  which is easy enough to arrange, e.g. by dependency on
> dynamic data, e.g. checking a global initialized from a main() argument.
>
>    S::S()
>    {
>        // Some initialization here, then:
>        if( strcmp( ::dynData, ::someUuid ) == 0 ){ throw "never"; }
>    }
>
> Yet, the Meyers/Alexandrescu article states categorically that
>
>    <quote>
>    DCLP will work only if steps 1 and 2  are completed before step 3 is
>    performed, but there is no way to express this constraint in C or
>    C++.
>    </quote>
>
> "there is no way to express this constraint"  --  i.e. not even the
> S::S() constructor shown above is safe from willy-nilly assignment of
> result pointer before the constructor body's execution has finished.

There is no way to express this constraint _explicitly_. You can
attempt to force the compiler to avoid the optimisation -- but as they
say later in the article:
"In essence, you've just fired the opening salvo in a war of
optimization. Your compiler wants to optimize. You don't want it to,
at least not here. But this is not a battle you want to get into. Your
foe is wiley and sophisticated, imbued with strategems dreamed up over
decades by people who do nothing but think about this kind of thing
all day long, day after day, year after year. Unless you write
optimizing compilers yourself, they are way ahead of you."

Even if you're right that the value must not be written until after
the constructor call, the compiler could write the value and restore
the original value in the event of a throw. Even if you inspect the
value from the constructor, written in a different translation unit,
the compiler could inline the constructor at link time and give the
correct value. Eventually you might be able to go far enough to thwart
the compiler; but how far do you have to go to be sure? (And that's
ignoring hardware reorderings.)
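
To make the "write the value and restore it on a throw" sequence concrete,
here is a minimal sketch. It is hand-written C++ standing in for what
generated code might do, not a claim about any particular compiler. The
names Widget, g_widget, assign_as_written and assign_with_rollback are
hypothetical and appear nowhere in the article; g_widget plays the role of
pInstance.

```cpp
#include <cassert>
#include <new>
#include <stdexcept>

// Hypothetical type for illustration; its constructor may throw.
struct Widget {
    explicit Widget(bool fail) {
        if (fail) throw std::runtime_error("constructor failed");
    }
};

Widget* g_widget = 0;  // stands in for pInstance

// The statement as the programmer writes it.
void assign_as_written(bool fail) {
    g_widget = new Widget(fail);
}

// A hand-written model of a "tentative store plus rollback" sequence.
// Within a single thread the caller cannot distinguish it from
// assign_as_written(): after either call g_widget holds the same kind
// of value whether or not the constructor threw.
void assign_with_rollback(bool fail) {
    Widget* const saved = g_widget;            // remember the old value
    void* raw = operator new(sizeof(Widget));  // step 1: allocate
    g_widget = static_cast<Widget*>(raw);      // step 3 done early
    try {
        new (raw) Widget(fail);                // step 2: construct in place
    } catch (...) {
        g_widget = saved;                      // undo the early store
        operator delete(raw);                  // release the raw storage
        throw;                                 // propagate to the caller
    }
}
```

A second thread that reads g_widget between the early store and the end
of the constructor would see a non-null pointer to an unconstructed
object, which is exactly the window that matters for DCLP.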

Yechezkel Mett


Author: AlbertoBarbati@libero.it (Alberto Ganesh Barbati)
Date: Wed, 10 Oct 2007 16:00:40 GMT Raw View

Hyman Rosen ha scritto:
>
> It should be abundantly clear by now that order of evaluation,
> including when side effects happen, must be specified exactly.
>

What about: "Object initialization is sequenced before any use of the
value of the new-expression in the enclosing expression."?

Ganesh


Author: alfps@start.no ("Alf P. Steinbach")
Date: Wed, 10 Oct 2007 16:14:38 GMT Raw View

* James Kanze:
>
> I'd be very cautious about bringing the Meyers/Alexandrescu
> discussion in here.  They were interested solely with threading
> issues.

Yes.  What's interesting here is not the threading issues, but the
freedom the compiler has to reorder this particular expression, which in
the article is a premise for the threading issues discussion.  This
premise is not said to rely on any threading considerations.

Summary: some folks think the standard allows the reordering, some folks
think it doesn't, and the latest draft adds language that seemingly
forbids the reordering, but has already been contested in this thread.

Summary of the summary: hm. :-)

Cheers,

- Alf

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?


Author: alfps@start.no ("Alf P. Steinbach")
Date: Wed, 10 Oct 2007 17:13:12 GMT Raw View

* Yechezkel Mett:
> On Oct 8, 9:26 pm, "Alf P. Steinbach" <al...@start.no> wrote:
>>
>> Yet, the Meyers/Alexandrescu article states categorically that
>>
>>    <quote>
>>    DCLP will work only if steps 1 and 2  are completed before step 3 is
>>    performed, but there is no way to express this constraint in C or
>>    C++.
>>    </quote>
>>
>> "there is no way to express this constraint"  --  i.e. not even the
>> S::S() constructor shown above is safe from willy-nilly assignment of
>> result pointer before the constructor body's execution has finished.
>
> There is no way to express this constraint _explicitly_.

No, that's not said, and the article goes on to consider some
(non-working) implicit ways and then concludes that section, "you need
to be able to specify a constraint on instruction ordering, and your
language gives you no way to do it"  --  implicitly or explicitly.

[snip]
> Even if you're right that the value must not be written until after
> the constructor call,

Hm, that wasn't my conclusion.  The problem I noted in this thread's
root article was that seemingly the current standard does /not/ require
that sequence, formally (practice is something else).  Now I find Greg's
argument quite convincing, as I wrote in the article you're replying to.
  But as I also wrote there, I'm at a loss, in part because the article
we're discussing has the opposite as a fundamental premise.

> the compiler could write the value and restore
> the original value in the event of a throw. Even if you inspect the
> value from the constructor, written in a different translation unit,
> the compiler could inline the constructor at link time and give the
> correct value. Eventually you might be able to go far enough to thwart
> the compiler; but how far do you have to go to be sure? (And that's
> ignoring hardware reorderings.)

This is exactly what Greg's argument covers, at least as I read it.  The
argument doesn't rely on any notions of cover-up, correction or as-if.
Only that (1) at each sequence point, all previous evaluations are
required to be completed and subsequent evaluations are required to not
have produced any side effects yet, and (2) the assignment action (which
might be to do nothing) is dependent on the constructor call if the
constructor can throw.  As I understand it this argument does not permit
an initial tentative assignment followed later by corrective action: at
the sequence points inside the new expression, the assignment action
must either already be completed, fully, or not yet have begun.

If tentative assignment + later correction is permitted, then the
argument seems to not hold.

Cheers,

- Alf

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?


Author: Jiri Palecek <jpalecek@web.de>
Date: Wed, 10 Oct 2007 13:19:18 CST Raw View

James Kanze wrote:

> On Oct 9, 4:55 pm, jdenn...@acm.org (James Dennett) wrote:
>> James Kanze wrote:
>
>     [...]
>> >>> The construction of the object is a side effect.
>
>> >> Can you justify that claim?
>
>> > What else can it be?
>
>> Part of the evaluation of the expression, used to determine
>> the result of the expression (as indeed it does).
>
> That's not the usual definition.  An expression has a value and
> side effects.  The value is what it returns (a pointer, in the
> case of a new expression); the side effects are any changes in
> the global program state (writes to memory, etc.).  Thus, in an
> expression like "i = 42", the value is 42 (converted to the type
> of i, if necessary); the write to i itself is a side effect.

It is debatable whether a constructor call is a side effect. If you look at
the definition, the only case that could possibly apply is ... modifying an
object ... but constructors don't _modify_ objects.

>> No; it is the address of a newly created object, according
>> to the standard.  If a constructor throws, there is no such
>> object, and the new expression does not have a value.
>
> That's an argument I can follow.  It would be nice if the
> standard actually said so, of course.  But I can't find it.

5.3.4/1 at the end

[snip]
>> > The real question, of course, is whether calling the constructor
>> > is a side effect.  To be frank, I don't really see how it can be
>> > considered anything else, given the usual meaning of side
>> > effect.  Could you elaborate why it isn't a "side effect".
>
>> I believe I've tried to do so: it is *impossible* to determine
>> the result of a new expression while ignorant of the body of
>> a constructor which is used by that new expression.
>
> The result of a new expression is a pointer.  You don't need the
> constructor to get a valid pointer.  (You can't do much with the
> pointer until the constructor has run, but it is a valid
> pointer.)

No, the result of a new-expression is a pointer to an object. If the
constructor throws, the pointer will either be invalid (because the memory
is automatically freed, except in some cases) or, less likely, point to junk
memory rather than an object. Can you find some clause in the standard that
allows pointers to junk memory or invalid non-null pointers as a result of a
new-expression? If not, you can never see such a thing as a result of an
assignment like p=new foo; from a C++ program. However, the actual location
_might_ have the invalid value written to it, as long as you can't see it.
And this is the problem of the original paper, because multithreaded
programs can see much more than the C++ semantics assumes.

Regards
    Jiri Palecek


Author: Yechezkel Mett <ymett.on.usenet@gmail.com>
Date: Wed, 10 Oct 2007 13:19:44 CST Raw View

On Oct 10, 7:13 pm, al...@start.no ("Alf P. Steinbach") wrote:
> * Yechezkel Mett:
>
> > On Oct 8, 9:26 pm, "Alf P. Steinbach" <al...@start.no> wrote:
>
> >> Yet, the Meyers/Alexandrescu article states categorically that
>
> >>    <quote>
> >>    DCLP will work only if steps 1 and 2  are completed before step 3 is
> >>    performed, but there is no way to express this constraint in C or
> >>    C++.
> >>    </quote>
>
> >> "there is no way to express this constraint"  --  i.e. not even the
> >> S::S() constructor shown above is safe from willy-nilly assignment of
> >> result pointer before the constructor body's execution has finished.
>
> > There is no way to express this constraint _explicitly_.
>
> No, that's not said, and the article goes on to consider some
> (non-working) implicit ways and then concludes that section, "you need
> to be able to specify a constraint on instruction ordering, and your
> language gives you no way to do it"  --  implicitly or explicitly.

My reading is: there is no way to tell the compiler you want to
constrain the instruction ordering; therefore you have to rely on that
following as a consequence of what you can tell the compiler, which
doesn't work because of the as-if rule.

> > Even if you're right that the value must not be written until after
> > the constructor call,
>
> Hm, that wasn't my conclusion.  The problem I noted in this thread's
> root article was that seemingly the current standard does /not/ require
> that sequence, formally (practice is something else).  Now I find Greg's
> argument quite convincing, as I wrote in the article you're replying to.
>   But as I also wrote there, I'm at a loss, in part because the article
> we're discussing has the opposite as a fundamental premise.
>
> > the compiler could write the value and restore
> > the original value in the event of a throw. Even if you inspect the
> > value from the constructor, written in a different translation unit,
> > the compiler could inline the constructor at link time and give the
> > correct value. Eventually you might be able to go far enough to thwart
> > the compiler; but how far do you have to go to be sure? (And that's
> > ignoring hardware reorderings.)
>
> This is exactly what Greg's argument covers, at least as I read it.  The
> argument doesn't rely on any notions of cover-up, correction or as-if.
> Only that (1) at each sequence point, all previous evaluations are
> required to be completed and subsequent evaluations are required to not
> have produced any side effects yet, and (2) the assignment action (which
> might be to do nothing) is dependent on the constructor call if the
> constructor can throw.  As I understand it this argument does not permit
> an initial tentative assignment followed later by corrective action: at
> the sequence points inside the new expression, the assignment action
> must either already be completed, fully, or not yet have begun.

I don't actually follow his argument (I don't see where he explains
why the assignment is dependent on the constructor call) but that's
beside the point; the Meyers/Alexandrescu article's point is based on
as-if. Whatever the standard requires is only required _in a single
thread_ and only _as-if_. Given the sequence I outlined above, within
a single thread it would be impossible to tell that the program was
deviating from Greg Herlihy's understanding of the standard's
requirements, therefore as-if is satisfied. From another thread the
reordering could be seen, breaking DCL.

> If tentative assignment + later correction is permitted, then the
> argument seems to not hold.

Anything is permitted, so long as you can't tell.

Yechezkel Mett


Author: jdennett@acm.org (James Dennett)
Date: Thu, 11 Oct 2007 14:49:37 GMT Raw View

James Kanze wrote:
> On Oct 9, 4:55 pm, jdenn...@acm.org (James Dennett) wrote:
>> James Kanze wrote:
>
>     [...]
>>>>> The construction of the object is a side effect.
>
>>>> Can you justify that claim?
>
>>> What else can it be?
>
>> Part of the evaluation of the expression, used to determine
>> the result of the expression (as indeed it does).
>
> That's not the usual definition.  An expression has a value and
> side effects.  The value is what it returns (a pointer, in the
> case of a new expression); the side effects are any changes in
> the global program state (writes to memory, etc.).  Thus, in an
> expression like "i = 42", the value is 42 (converted to the type
> of i, if necessary); the write to i itself is a side effect.

Not everything fits into this classification scheme.  Calls
to functions are neither the value of the expression nor are
they side-effects (changes to global state, etc.), and yet
they are required to happen.  They may, in turn, contain code
which has side-effects, and those side-effects must obey the
language's sequencing rules (in particular, they must appear
complete before the function returns).

Consider

int f() { throw 7; }

int a;
a = f();

In this case, the assignment doesn't happen; the expression
f() has no value.  And yet executing the code in f() is not
a side-effect.  This is much like

p = new Type_With_Throwing_Constructor;

except that the function call is implicit in the latter case.
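
For anyone who wants to try the comparison, here is a minimal sketch of
both cases as complete functions. The helper names try_assign and
pointer_left_null and the sentinel value -1 are assumptions made for
illustration. The sketch only shows what a single-threaded program can
observe after the exception is caught; it says nothing about which
intermediate stores an implementation may perform.

```cpp
#include <cassert>

int f() { throw 7; }  // never returns normally

// Attempts "a = f();" and reports the value of a afterwards.
int try_assign() {
    int a = -1;       // sentinel chosen for this sketch
    try {
        a = f();      // f() throws, so there is no value to assign
    } catch (int) {
        // swallow the exception so that a can be inspected
    }
    return a;
}

struct Type_With_Throwing_Constructor {
    Type_With_Throwing_Constructor() { throw 7; }
};

// Attempts "p = new Type_With_Throwing_Constructor;" and reports
// whether p still holds its initial null value afterwards.
bool pointer_left_null() {
    Type_With_Throwing_Constructor* p = 0;
    try {
        p = new Type_With_Throwing_Constructor;  // constructor throws
    } catch (int) {
        // the storage has already been released by the new-expression
    }
    return p == 0;
}
```

On the implementations I would expect anyone to use, try_assign() yields
the sentinel and pointer_left_null() yields true.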

Can you cite anything from the standard to suggest that function
calls are defined to be "side effects"?

>>> Unless it's trivial (which doesn't really
>>> concern us here), it writes to memory, etc.  Those are side
>>> effects; the "value" of an expression has no side effects.
>
>>>> Alf's example illustrates that calling the constructor is
>>>> needed in order to know whether the expression has a value.
>>>> The value can't be assigned from if it does not exist.
>
>>> The "value" of a new expression is the pointer returned from the
>>> allocator function.
>
>> No; it is the address of a newly created object, according
>> to the standard.  If a constructor throws, there is no such
>> object, and the new expression does not have a value.
>
> That's an argument I can follow.  It would be nice if the
> standard actually said so, of course.  But I can't find it.

5.3.4/1 says "If the entity is a non-array object, the
new-expression returns [sic] a pointer to the object
created."  That seems fairly explicit to me.  If no object
is created, there is not a result for the new-expression.

Or do you claim that an object is created even though
the constructor throws?

(The Standard is rather inconsistent in its use of the
term object; in some places it uses a C-like definition
as a region of storage, and in other places it reflects
the C++ object lifecycle's inclusion of constructors and
destructors.)

>>> The compiler needs to know this in order to
>>> call the constructor.
>
>>> It is in every way like the expression ++i.
>
>> I've talked about the differences.  Claiming that they don't
>> exist doesn't make it so.
>
> There are certainly differences, but not from the standard point
> of view.

Here we disagree (and I've pointed out the differences
which do exist from the standard point of view, and
tried to explain why they are indeed differences).  I
think we may well be talking past each other, which is
a shame as I have great respect for your arguments.

> You've talked about the differences we all know exist,
> but you've not presented anything in the standard which
> recognizes them.

Most of what I said was based directly on the standard; I
apologize for not citing specific text.

>>> The modification of
>>> i is a side effect; the value of the expression is the value
>>> which will be written, and is available before the side effect
>>> takes place (and must be available, for the side effect to take
>>> place).
>
>>>> The
>>>> call to the constructor is *not* a side-effect of evaluating
>>>> the expression; it's an inherent part of determining the value
>>>> of that expression.
>
>>> How can that be?  A constructor doesn't return a value.
>
>> It doesn't need to return a value to affect the result of
>> evaluating an expression.  There are plenty of ways in which
>> even void-returning function can affect the value of an
>> expression.
>
> Of the expression in which they are called?  Not without
> additional sequence points.

Maybe not, but the point to which I was responding was the
apparent claim that the reason why a constructor cannot affect
the value of an expression is that it does not return a value.
That was false.  (And, indeed, constructors do introduce
sequence points.  We agree on that, I believe; we just don't
agree yet on whether C++2003 orders those sequence points
before the assignment.)

>>>     [...]
>>> You don't need the "as if" rule.  The standard explicitly states
>>> that "side effects" can take place in any order, not necessarily
>>> the order in which the sub-expressions which cause them are
>>> evaluated.  And that applies to the abstract machine; the "as
>>> if" rule is not necessary.
>
>> Such an extended interpretation of the freedom to rearrange
>> code would be most problematic; I can see why people are concerned,
>> if they think implementors would really do such things.
>
> I actually think that it is problematic.  In more cases than
> just this.  But the committee doesn't seem to share my concerns.
>
> (I'd like to see all freedom to reorder removed from the
> abstract machine.)

I'm ambivalent on that subject.  On the one hand, it reduces
unpredictability and simplifies reasoning about programs.  On
the other hand, the clearest/simplest code usually sidesteps
the issue anyway (excepting some exception-related issues),
and changing this might break backwards compatibility on some
systems where non-portable code (or binary interfaces) depend
on existing order of evaluation.

>>> The real question, of course, is whether calling the constructor
>>> is a side effect.  To be frank, I don't really see how it can be
>>> considered anything else, given the usual meaning of side
>>> effect.  Could you elaborate why it isn't a "side effect".
>
>> I believe I've tried to do so: it is *impossible* to determine
>> the result of a new expression while ignorant of the body of
>> a constructor which is used by that new expression.
>
> The result of a new expression is a pointer.  You don't need the
> constructor to get a valid pointer.  (You can't do much with the
> pointer until the constructor has run, but it is a valid
> pointer.)

But it's not a pointer to 5.3.4/1's "object created" if
the constructor throws, unless you take the C-like view
of what object creation is.  (I wouldn't be opposed to a
DR regarding the standard's split personality on this
subject.)

-- James


Author: James Kanze <james.kanze@gmail.com>
Date: Fri, 12 Oct 2007 10:01:44 CST Raw View

On Oct 10, 9:19 pm, Jiri Palecek <jpale...@web.de> wrote:
> James Kanze wrote:
> > On Oct 9, 4:55 pm, jdenn...@acm.org (James Dennett) wrote:
> >> James Kanze wrote:

> >     [...]
> >> >>> The construction of the object is a side effect.

> >> >> Can you justify that claim?

> >> > What else can it be?

> >> Part of the evaluation of the expression, used to determine
> >> the result of the expression (as indeed it does).

> > That's not the usual definition.  An expression has a value and
> > side effects.  The value is what it returns (a pointer, in the
> > case of a new expression); the side effects are any changes in
> > the global program state (writes to memory, etc.).  Thus, in an
> > expression like "i = 42", the value is 42 (converted to the type
> > of i, if necessary); the write to i itself is a side effect.

> It is debatable whether constructor call is a side effect.

It certainly has side effects.  The question is whether it
affects the value---otherwise, it is a pure side effect.

> If you look at the definition, the only case that could
> possibly apply is ... modifying an object ... but constructors
> don't _modify_ objects.

They modify the underlying bytes of a (not yet fully constructed)
object.

> >> No; it is the address of a newly created object, according
> >> to the standard.  If a constructor throws, there is no such
> >> object, and the new expression does not have a value.

> > That's an argument I can follow.  It would be nice if the
> > standard actually said so, of course.  But I can't find it.

> 5.3.4/1 at the end

You mean: "If the entity is a non-array object, the
new-expression returns a pointer to the object created."  That
could be interpreted to mean what you say, yes.  I think that
you're reading too much into it, but it is an argument.  I'd
still like at least a statement from the committee that this was
the intended reading.


--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34



Author: James Kanze <james.kanze@gmail.com>
Date: Fri, 12 Oct 2007 10:03:35 CST Raw View

On Oct 11, 4:49 pm, jdenn...@acm.org (James Dennett) wrote:
> James Kanze wrote:
> > On Oct 9, 4:55 pm, jdenn...@acm.org (James Dennett) wrote:
> >> James Kanze wrote:

> >     [...]
> >>>>> The construction of the object is a side effect.

> >>>> Can you justify that claim?

> >>> What else can it be?

> >> Part of the evaluation of the expression, used to determine
> >> the result of the expression (as indeed it does).

> > That's not the usual definition.  An expression has a value and
> > side effects.  The value is what it returns (a pointer, in the
> > case of a new expression); the side effects are any changes in
> > the global program state (writes to memory, etc.).  Thus, in an
> > expression like "i = 42", the value is 42 (converted to the type
> > of i, if necessary); the write to i itself is a side effect.

> Not everything fits into this classification scheme.  Calls
> to functions are neither the value of the expression nor are
> they side-effects (changes to global state, etc.), and yet
> they are required to happen.

If the function has a return value, then that is the value of
the expression, and you have to call the function to get it.  If
the function is void, then it has a side effect, and must be
called before the next sequence point.

> They may, in turn, contain code which has side-effects, and
> those side-effects must obey the language's sequencing rules
> (in particular, they must appear complete before the function
> returns).

> Consider

> int f() { throw 7; }

> int a;
> a = f();

> In this case, the assignment doesn't happen; the expression
> f() has no value.

The expression f() has a value of type int.  The compiler is
required to call f() in order to obtain this value, since it is
used by the assignment.  In this case, the assignment cannot
take place until f() returns, since the value is not available
until then.

What actually happens in f() is irrelevant to the analysis of
"a = f();".

> And yet executing the code in f() is not
> a side-effect.  This is much like

> p = new Type_With_Throwing_Constructor;

> except that the function call is implicit in the latter case.

No it's not, since in the above case, the compiler knows the
"value" of the new expression before calling the constructor.
It's more like the i = ++ j example, where the compiler knows
the value to assign to i before writing it to j.

> Can you cite anything from the standard to suggest that function
> calls are defined to be "side effects"?

They aren't.  Their results are values, which are used in the
expression.

> >>> Unless it's trivial (which doesn't really
> >>> concern us here), it writes to memory, etc.  Those are side
> >>> effects; the "value" of an expression has no side effects.

> >>>> Alf's example illustrates that calling the constructor is
> >>>> needed in order to know whether the expression has a value.
> >>>> The value can't be assigned from if it does not exist.

> >>> The "value" of a new expression is the pointer returned from the
> >>> allocator function.

> >> No; it is the address of a newly created object, according
> >> to the standard.  If a constructor throws, there is no such
> >> object, and the new expression does not have a value.

> > That's an argument I can follow.  It would be nice if the
> > standard actually said so, of course.  But I can't find it.

> 5.3.4/1 says "If the entity is a non-array object, the
> new-expression returns [sic] a pointer to the object
> created."  That seems fairly explicit to me.  If no object
> is created, there is not a result for the new-expression.

> Or do you claim that an object is created even though
> the constructor throws?

> (The Standard is rather inconsistent in its use of the
> term object; in some places it uses a C-like definition
> as a region of storage, and in other places it reflects
> the C++ object lifecycle's inclusion of constructors and
> destructors.)

That's what's bothering me a bit.  I rather think you're reading
more into that sentence than was intended.  But as written, it
does go a long way in supporting your view.  (Now if only it had
said "returns a pointer to the fully constructed object[...]",
there'd be no ambiguity possible.)

> >>> The compiler needs to know this in order to
> >>> call the constructor.

> >>> It is in every way like the expression ++i.

> >> I've talked about the differences.  Claiming that they don't
> >> exist doesn't make it so.

> > There are certainly differences, but not from the standard point
> > of view.

> Here we disagree (and I've pointed out the differences
> which do exist from the standard point of view, and
> tried to explain why they are indeed differences).  I
> think we may well be talking past each other, which is
> a shame as I have great respect for your arguments.

I'll admit that the only difference I can see is with regards to
observable behavior.  In the end, I'm very convinced that the
standards (both C and C++) allow side effects to be reordered in
the abstract machine, and not just as a result of the "as if"
rule.  And that calling the constructor, per se, is a side
effect.

    [...]
> >> It doesn't need to return a value to affect the result of
> >> evaluating an expression.  There are plenty of ways in which
> >> even void-returning function can affect the value of an
> >> expression.

> > Of the expression in which they are called?  Not without
> > additional sequence points.

> Maybe not, but the point to which I was responding was the
> apparent claim that the reason why a constructor cannot affect
> the value of an expression is that it does not return a value.
> That was false.  (And, indeed, constructors do introduce
> sequence points.  We agree on that, I believe; we just don't
> agree yet on whether C++2003 orders those sequence points
> before the assignment.)

The call to the constructor is definitely a sequence point.  As
is the return from the constructor.

Also, the effect of exceptions is perhaps a red herring.  You
can get the same effect without exceptions.  Just make the
pointer global, initialized with a null pointer, and have the
constructor look at it.  Is the constructor guaranteed to see a
null pointer?
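
A minimal sketch of that test, for anyone who wants to try it on a
particular compiler.  The names g_ptr, Probe and constructor_saw_null
are hypothetical, and whatever one implementation does settles nothing
about what the C++03 text requires:

```cpp
#include <cassert>

struct Probe;
Probe* g_ptr = 0;                 // hypothetical global, starts out null

struct Probe {
    bool saw_null;
    // The constructor inspects the global that is about to be assigned.
    Probe() : saw_null(g_ptr == 0) {}
};

// Runs `g_ptr = new Probe` and reports what the constructor observed.
bool constructor_saw_null() {
    delete g_ptr;                 // reset, so the function can be called repeatedly
    g_ptr = 0;
    g_ptr = new Probe;            // the expression under discussion
    return g_ptr->saw_null;
}
```

Calling constructor_saw_null() from main and printing the result is
enough to run the experiment.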

What I will say is that I think that the standard should
guarantee this.  And that it is "expected" enough that any
implementation would be foolish to violate it, regardless of how
we interpret the standard.  And that the final two sentences in
§5.3.4/1, which you cite above, are very close to convincing
me---the value of the new expression is a pointer to the object,
and, as was probably meant, to the fully constructed object.

> >>>     [...]
> >>> You don't need the "as if" rule.  The standard explicitly states
> >>> that "side effects" can take place in any order, not necessarily
> >>> the order in which the sub-expressions which cause them are
> >>> evaluated.  And that applies to the abstract machine; the "as
> >>> if" rule is not necessary.

> >> Such an extended interpretation of the freedom to rearrange
> >> code would be most problematic; I can see why people are concerned,
> >> if they think implementors would really do such things.

> > I actually think that it is problematic.  In more cases than
> > just this.  But the committee doesn't seem to share my concerns.

> > (I'd like to see all freedom to reorder removed from the
> > abstract machine.)

> I'm ambivalent on that subject.  On the one hand, it reduces
> unpredictability and simplifies reasoning about programs.

Not just reasoning.  It means that tests actually mean
something, at least with regards to the values with which you
tested.  Undefined behavior, for any reason, is an anathema to
testing.

> On the other hand, the clearest/simplest code usually
> sidesteps the issue anyway (excepting some exception-related
> issues), and changing this might break backwards compatibility
> on some systems where non-portable code (or binary interfaces)
> depend on existing order of evaluation.

The question is: are there currently implementations
guaranteeing any specific order today?  If so, then we have to
take them into account.  (Of course, the "guarantee" might be
implicit.  And also of course, even if we do specify something
else, there's nothing to prevent such compilers from offering a
flag which generates the old order.)

> >>> The real question, of course, is whether calling the constructor
> >>> is a side effect.  To be frank, I don't really see how it can be
> >>> considered anything else, given the usual meaning of side
> >>> effect.  Could you elaborate why it isn't a "side effect".

> >> I believe I've tried to do so: it is *impossible* to determine
> >> the result of a new expression while ignorant of the body of
> >> a constructor which is used by that new expression.

> > The result of a new expression is a pointer.  You don't need the
> > constructor to get a valid pointer.  (You can't do much with the
> > pointer until the constructor has run, but it is a valid
> > pointer.)

> But it's not a pointer to 5.3.4/1's "object created" if
> the constructor throws, unless you take the C-like view
> of what object creation is.  (I wouldn't be opposed to a
> DR regarding the standard's split personality on this
> subject.)

I think that it may be worth it.  Although perhaps adding the
words "fully constructed" before "object" would be a small
enough change, and considered consistent with the original
intent, to be taken as an editorial change.  In which case, it's
up to Pete.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: James Kanze <james.kanze@gmail.com>
Date: Fri, 12 Oct 2007 10:20:00 CST Raw View

On Oct 10, 6:14 pm, al...@start.no ("Alf P. Steinbach") wrote:
> * James Kanze:

> > I'd be very cautious about bringing the Meyers/Alexandrescu
> > discussion in here.  They were interested solely with threading
> > issues.

> Yes.  What's interesting here is not the threading issues, but the
> freedom the compiler has to reorder this particular expression, which in
> the article is a premise for the threading issues discussion.  This
> premise is not said to rely on any threading considerations.

Except that in their example, the compiler was clearly allowed
to rearrange the order because of the "as if" rule.  They
weren't concerned about a possible exception, and didn't
consider that aspect.  And if the constructor can't throw, I
don't think that there's any disagreement that reordering is
legal, since the "as if" rule allows it, if nothing else does.
A compliant (single threaded) program can't tell.

> Summary: some folks think the standard allows the reordering,
> some folks think it doesn't, and the latest draft adds
> language that seemingly forbids the reordering, but has
> already been contested in this thread.

> Summary of the summary: hm. :-)

That we disagree as to what the standard actually says (or is
trying to say), but we agree (I think) as to what it should say.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


---

Author: James Kanze <james.kanze@gmail.com>
Date: Fri, 12 Oct 2007 10:20:37 CST Raw View

On Oct 10, 7:13 pm, al...@start.no ("Alf P. Steinbach") wrote:
> * Yechezkel Mett:

> > On Oct 8, 9:26 pm, "Alf P. Steinbach" <al...@start.no> wrote:

    [...]
> > the compiler could write the value and restore
> > the original value in the event of a throw. Even if you inspect the
> > value from the constructor, written in a different translation unit,
> > the compiler could inline the constructor at link time and give the
> > correct value. Eventually you might be able to go far enough to thwart
> > the compiler; but how far do you have to go to be sure? (And that's
> > ignoring hardware reorderings.)

> This is exactly what Greg's argument covers, at least as I read it.  The
> argument doesn't rely on any notions of cover-up, correction or as-if.
> Only that (1) at each sequence point, all previous evaluations are
> required to be completed and subsequent evaluations are required to not
> have produced any side effects yet, and (2) the assignment action (which
> might be to do nothing) is dependent on the constructor call if the
> constructor can throw.

That's where his argument breaks down.  The assignment action is
dependent on the *value* of the new expression, but not its side
effects.  If we were assigning a user defined type, where the
assignment was a function, then there would be a sequence point
at the assignment, which would require that all of the side
effects of the previous expressions have occurred.  But in the
case of pointer assignment, there is no such sequence point;
according to the current standard, at least, there is no
ordering between the assignment statement and any side effects
in its operands.
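
To make the contrast concrete, here is a small sketch of the
user-defined case.  Holder and Thrower are hypothetical names, not
anything from the thread.  Because operator= is a function, its
argument has to be fully evaluated before the call, so a throwing
constructor means the function is never entered:

```cpp
#include <cassert>

struct Thrower {
    Thrower() { throw 123; }      // construction always fails
};

// Hypothetical wrapper whose assignment is a member function call.
struct Holder {
    Thrower* raw;
    Holder() : raw(0) {}
    Holder& operator=(Thrower* p) { raw = p; return *this; }
};

// Reports whether the wrapped pointer is still null after the failed
// assignment.
bool holder_stays_null() {
    Holder h;
    try {
        h = new Thrower;          // argument evaluation throws; operator= is not called
    } catch (int) {
        // swallow the exception, as in the original example
    }
    return h.raw == 0;
}
```

With a plain pointer on the left-hand side there is no such function
call, which is the case the rest of this post is about.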

The only real question that I can see is whether the constructor
is a side effect, or whether it is a necessary part of obtaining
the value.  I find it hard to consider it as anything other than
a side effect (much like writing the updated value in i++), but
I can understand some disagreement on this point.

> As I understand it this argument does not permit
> an initial tentative assignment followed later by corrective action: at
> the sequence points inside the new expression, the assignment action
> must either already be completed, fully, or not yet have begun.

I don't think that there's anything in the standard which
forbids tentative actions, which are undone later, but this
could easily be because no one in the committee considered the
possibility.
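
As a sketch only, and written at source level although the
transformation being discussed would happen inside the compiler, a
tentative store followed by a restore might look like this.  S is the
struct from the original example; tentative_store_is_undone and saved
are hypothetical names:

```cpp
#include <cassert>
#include <new>

struct S { S() { throw 123; }  int foo() { return 666; } };

// Stores the address early, then puts the old value back if the
// constructor throws.  Returns whether p ended up with its old value.
bool tentative_store_is_undone() {
    S* p = 0;
    S* const saved = p;                 // value to restore on failure
    try {
        void* raw = operator new(sizeof(S));
        p = static_cast<S*>(raw);       // tentative store, before construction
        try {
            new (raw) S();              // construct in place; throws here
        } catch (...) {
            p = saved;                  // undo the tentative store
            operator delete(raw);       // release the storage
            throw;                      // let the exception continue
        }
    } catch (...) {
        // swallowed, as in the original example
    }
    return p == saved;
}
```

Code that only looks at p after the handler cannot tell this apart
from an implementation that never stored the address at all.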

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

---

Author: jdennett@acm.org (James Dennett)
Date: Sat, 13 Oct 2007 16:29:43 GMT Raw View

James Kanze wrote:
> On Oct 10, 9:19 pm, Jiri Palecek <jpale...@web.de> wrote:
>> James Kanze wrote:
>>> On Oct 9, 4:55 pm, jdenn...@acm.org (James Dennett) wrote:
>>>> James Kanze wrote:
>
>>>     [...]
>>>>>>> The construction of the object is a side effect.
>
>>>>>> Can you justify that claim?
>
>>>>> What else can it be?
>
>>>> Part of the evaluation of the expression, used to determine
>>>> the result of the expression (as indeed it does).
>
>>> That's not the usual definition.  An expression has a value and
>>> side effects.  The value is what it returns (a pointer, in the
>>> case of a new expression); the side effects are any changes in
>>> the global program state (writes to memory, etc.).  Thus, in an
>>> expression like "i = 42", the value is 42 (converted to the type
>>> of i, if necessary); the write to i itself is a side effect.
>
>> It is debatable whether constructor call is a side effect.
>
> It certainly has side effects.  The question is whether it
> affects the value---otherwise, it is a pure side effect.

Again, I disagree with your classification of all aspects
of expression evaluation as either (a) side-effects or
(b) things that affect the value.  Calls to functions do
not fall into either category, and I don't think 14882
supports your classification scheme.

It would make little sense to me to classify the execution
of a function as a side-effect: that the function is called
has no observable consequence aside from imposing sequence
points around side-effects of expressions evaluated by that
function.  It's also at odds with the English meaning of the
term "side-effect" as I understand it.  Evaluation of a
function call expression *means* calling the function, not
guessing the result.  But maybe we should state that
explicitly to avoid some of these discussions.

>> If you look at the definition, the only case that could
>> possibly apply is ... modifying an object ... but constructors
>> don't _modify_ objects.
>
> They modify the underlying bytes of a (not yet fully constructed)
> object.

Indeed: that's side-effects within a function, which must
be completed between the sequence point at the start of
that function's execution and the sequence point at the
end of that function's execution.

The function call must be evaluated (which means calling
the function, with its associated sequence points) as part
of evaluation of the expression.

-- James

---

Author: jdennett@acm.org (James Dennett)
Date: Sat, 13 Oct 2007 16:30:16 GMT Raw View

James Kanze wrote:
> On Oct 11, 4:49 pm, jdenn...@acm.org (James Dennett) wrote:
>> James Kanze wrote:
>>> On Oct 9, 4:55 pm, jdenn...@acm.org (James Dennett) wrote:
>>>> James Kanze wrote:
>
>>>     [...]
>>>>>>> The construction of the object is a side effect.
>
>>>>>> Can you justify that claim?
>
>>>>> What else can it be?
>
>>>> Part of the evaluation of the expression, used to determine
>>>> the result of the expression (as indeed it does).
>
>>> That's not the usual definition.  An expression has a value and
>>> side effects.  The value is what it returns (a pointer, in the
>>> case of a new expression); the side effects are any changes in
>>> the global program state (writes to memory, etc.).  Thus, in an
>>> expression like "i = 42", the value is 42 (converted to the type
>>> of i, if necessary); the write to i itself is a side effect.
>
>> Not everything fits into this classification scheme.  Calls
>> to functions are neither the value of the expression nor are
>> they side-effects (changes to global state, etc.), and yet
>> they are required to happen.
>
> If the function has a return value, then that is the value of
> the expression, and you have to call the function to get it.  If
> the function is void, then it has a side effect, and must be
> called before the next sequence point.

I can't think of a way to call a void function without
adding a sequence point.  ("You win this time, Mr Kanze.")

>> They may, in turn, contain code which has side-effects, and
>> those side-effects must obey the language's sequencing rules
>> (in particular, they must appear complete before the function
>> returns).
>
>> Consider
>
>> int f() { throw 7; }
>
>> int a;
>> a = f();
>
>> In this case, the assignment doesn't happen; the expression
>> f() has no value.
>
> The expression f() has a value of type int.

No, it does not.  It has type "int", but no value.

> The compiler is
> required to call f() in order to obtain this value, since it is
> used by the assignment.  In this case, the assignment cannot
> take place until f() returns, since the value is not available
> until then.

OK.

> What actually happens in f() is irrelevant to the analysis of
> "a = f();".

No, it's not.  It is intended to illustrate a broader point.
The relevance is that it shows that abstractly it is necessary
to evaluate a function called on the right hand side of an
assignment expression before performing the assignment.
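
A compilable version of that example, as a sketch.  The wrapper
function a_after_failed_assignment and the initial value 1 are
additions for illustration; f is as quoted above:

```cpp
#include <cassert>

int f() { throw 7; }              // has type int, but never returns a value

// Attempts `a = f();` and returns a as seen after the handler has run.
int a_after_failed_assignment() {
    int a = 1;                    // arbitrary initial value
    try {
        a = f();                  // f() exits by exception; there is no value to store
    } catch (int) {
        // swallow the exception
    }
    return a;
}
```

Here the call has to complete before there is anything to assign, so a
keeps its initial value.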

>> And yet executing the code in f() is not
>> a side-effect.  This is much like
>
>> p = new Type_With_Throwing_Constructor;
>
>> except that the function call is implicit in the latter case.
>
> No it's not, since in the above case, the compiler knows the
> "value" of the new expression before calling the constructor.

Again, that's wrong.  The expression does not *have* a value
until the function call has happened, and may indeed not have
one at all.

> It's more like the i = ++j example, where the compiler knows
> the value to assign to i before writing it to j.

I disagree, for the reasons stated.  The fact that the value
*is* affected by evaluation of the function is key.  Analogies
to code with no function evaluation are inapplicable.

>> Can you cite anything from the standard to suggest that function
>> calls are defined to be "side effects"?
>
> They aren't.  Their results are values, which are used in the
> expression.

OK.  Does this mean you'll stop referring to the function
calls as side-effects?

>>>>> Unless it's trivial (which doesn't really
>>>>> concern us here), it writes to memory, etc.  Those are side
>>>>> effects; the "value" of an expression has no side effects.
>
>>>>>> Alf's example illustrates that calling the constructor is
>>>>>> needed in order to know whether the expression has a value.
>>>>>> The value can't be assigned from if it does not exist.
>
>>>>> The "value" of a new expression is the pointer returned from the
>>>>> allocator function.
>
>>>> No; it is the address of a newly created object, according
>>>> to the standard.  If a constructor throws, there is no such
>>>> object, and the new expression does not have a value.
>
>>> That's an argument I can follow.  It would be nice if the
>>> standard actually said so, of course.  But I can't find it.
>
>> 5.3.4/1 says "If the entity is a non-array object, the
>> new-expression returns [sic] a pointer to the object
>> created."  That seems fairly explicit to me.  If no object
>> is created, there is not a result for the new-expression.
>
>> Or do you claim that an object is created even though
>> the constructor throws?
>
>> (The Standard is rather inconsistent in its use of the
>> term object; in some places it uses a C-like definition
>> as a region of storage, and in other places it reflects
>> the C++ object lifecycle's inclusion of constructors and
>> destructors.)
>
> That's what's bothering me a bit.  I rather think you're reading
> more into that sentence than was intended.  But as written, it
> does go a long way in supporting your view.  (Now if only it had
> said "returns a pointer to the fully constructed object[...]",
> there'd be no ambiguity possible.)

I like that as a possible clarification, in that it is a
small change and does remove the ambiguity.

>>>>> The compiler needs to know this in order to
>>>>> call the constructor.
>
>>>>> It is in every way like the expression ++i.
>
>>>> I've talked about the differences.  Claiming that they don't
>>>> exist doesn't make it so.
>
>>> There are certainly differences, but not from the standard point
>>> of view.
>
>> Here we disagree (and I've pointed out the differences
>> which do exist from the standard point of view, and
>> tried to explain why they are indeed differences).  I
>> think we may well be talking past each other, which is
>> a shame as I have great respect for your arguments.
>
> I'll admit that the only difference I can see is with regards to
> observable behavior.  In the end, I'm very convinced that the
> standards (both C and C++) allow side effects to be reordered in
> the abstract machine, and not just as a result of the "as if"
> rule.  And that calling the constructor, per se, is a side
> effect.

And I'm convinced that the constructor call is not intended
to be viewed as a side-effect.  I can be persuaded by normative
references.

(In the presence of threads, things are very different.  I need
to study the working paper carefully to see what the current
status is.)

>     [...]
>>>> It doesn't need to return a value to affect the result of
>>>> evaluating an expression.  There are plenty of ways in which
>>>> even void-returning function can affect the value of an
>>>> expression.
>
>>> Of the expression in which they are called?  Not without
>>> additional sequence points.
>
>> Maybe not, but the point to which I was responding was the
>> apparent claim that the reason why a constructor cannot affect
>> the value of an expression is that it does not return a value.
>> That was false.  (And, indeed, constructors do introduce
>> sequence points.  We agree on that, I believe; we just don't
>> agree yet on whether C++2003 orders those sequence points
>> before the assignment.)
>
> The call to the constructor is definitely a sequence point.  As
> is the return from the constructor.
>
> Also, the effect of exceptions is perhaps a red herring.  You
> can get the same effect without exceptions.  Just make the
> pointer global, initialized with a null pointer, and have the
> constructor look at it.  Is the constructor guaranteed to see a
> null pointer?

I'd say yes, but the example with the exception provides a
clearer demonstration of *why* that is the case.

> What I will say is that I think that the standard should
> guarantee this.  And that it is "expected" enough that any
> implementation would be foolish to violate it, regardless of how
> we interpret the standard.  And that the final two sentences in
> §5.3.4/1, which you cite above, are very close to convincing
> me---the value of the new expression is a pointer to the object,
> and, as was probably meant, to the fully constructed object.
>
>>>>>     [...]
>>>>> You don't need the "as if" rule.  The standard explicitly states
>>>>> that "side effects" can take place in any order, not necessarily
>>>>> the order in which the sub-expressions which cause them are
>>>>> evaluated.  And that applies to the abstract machine; the "as
>>>>> if" rule is not necessary.
>
>>>> Such an extended interpretation of the freedom to rearrange
>>>> code would be most problematic; I can see why people are concerned,
>>>> if they think implementors would really do such things.
>
>>> I actually think that it is problematic.  In more cases than
>>> just this.  But the committee doesn't seem to share my concerns.
>
>>> (I'd like to see all freedom to reorder removed from the
>>> abstract machine.)
>
>> I'm ambivalent on that subject.  On the one hand, it reduces
>> unpredictability and simplifies reasoning about programs.
>
> Not just reasoning.  It means that tests actually mean
> something, at least with regards to the values with which you
> tested.  Undefined behavior, for any reason, is an anathema to
> testing.

For comp.std.c++ we should maybe say that the uncertainty
in this case falls into two areas: some is more nondeterminism
("unspecified behavior"), but the unspecified rules can and do
also lead to code which has entirely undefined behavior.  Both
are problematic for testing.

>> On the other hand, the clearest/simplest code usually
>> sidesteps the issue anyway (excepting some exception-related
>> issues), and changing this might break backwards compatibility
>> on some systems where non-portable code (or binary interfaces)
>> depend on existing order of evaluation.
>
> The question is: are there currently implementations
> guaranteeing any specific order today?  If so, then we have to
> take them into account.  (Of course, the "guarantee" might be
> implicit.  And also of course, even if we do specify something
> else, there's nothing to prevent such compilers from offering a
> flag which generates the old order.)

I doubt that any of them formally document their behavior
with respect to evaluation order, so we'd only be breaking
their users' assumptions (which are subject to being broken
anyway).  It's just something we should do deliberately, if
we do it.  (Politically I'd be surprised if this happened,
though I think it would make C++ a better language to my
personal taste.)

>>>>> The real question, of course, is whether calling the constructor
>>>>> is a side effect.  To be frank, I don't really see how it can be
>>>>> considered anything else, given the usual meaning of side
>>>>> effect.  Could you elaborate why it isn't a "side effect".
>
>>>> I believe I've tried to do so: it is *impossible* to determine
>>>> the result of a new expression while ignorant of the body of
>>>> a constructor which is used by that new expression.
>
>>> The result of a new expression is a pointer.  You don't need the
>>> constructor to get a valid pointer.  (You can't do much with the
>>> pointer until the constructor has run, but it is a valid
>>> pointer.)
>
>> But it's not a pointer to 5.3.4/1's "object created" if
>> the constructor throws, unless you take the C-like view
>> of what object creation is.  (I wouldn't be opposed to a
>> DR regarding the standard's split personality on this
>> subject.)
>
> I think that it may be worth it.  Although perhaps adding the
> words "fully constructed" before "object" would be a small
> enough change, and considered consistent with the original
> intent, to be taken as an editorial change.  In which case, it's
> up to Pete.

I'll see if I can't get some more opinions on this.  This isn't
the first time it's bothered me!

-- James

---

Author: "Alf P. Steinbach" <alfps@start.no>
Date: Sat, 6 Oct 2007 19:07:24 CST Raw View

From discussions in [comp.lang.c++] and [comp.lang.c++.moderated], as
well as articles on the net about concurrency in C++, I'm reasonably
sure that given

   #include <iostream>
   #include <ostream>

   struct S { S(){ throw 123; }  int foo(){ return 666; } };

   int main()
   {
       S*  p = 0;

       try
       {
           p = new S();
       }
       catch( ... )
       {}

       if( p ) { std::cout << p->foo() << std::endl; }
   }

there is no guarantee that this code will not end up in a call to
p->foo() with an invalid pointer p, i.e., that might well happen.

Surely that couldn't have been the committee's intention?

Why isn't assignment treated as a function call?


Cheers,

- Alf

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

---

Author: jdennett@acm.org (James Dennett)
Date: Sun, 7 Oct 2007 00:21:28 GMT Raw View

Alf P. Steinbach wrote:
> From discussions in [comp.lang.c++] and [comp.lang.c++.moderated], as
> well as articles on the net about concurrency in C++, I'm reasonably
> sure that given
>
>   #include <iostream>
>   #include <ostream>
>
>   struct S { S(){ throw 123; }  int foo(){ return 666; } };
>
>   int main()
>   {
>       S*  p = 0;
>
>       try
>       {
>           p = new S();
>       }
>       catch( ... )
>       {}
>
>       if( p ) { std::cout << p->foo() << std::endl; }
>   }
>
> there is no guarantee that this code will not end up in a call to
> p->foo() with an invalid pointer p, i.e., that might well happen.
>
> Surely that couldn't have been the committee's intention?

I wouldn't imagine so.

> Why isn't assignment treated as a function call?

It doesn't need to be.  The assignment cannot occur until the
new value is known, which means that the "new" operator
has returned its result, which means that the object has been
constructed.  If the constructor throws, there's no value
from "new" above, and the assignment cannot occur; p will
remain null.

-- James

---

Author: alfps@start.no ("Alf P. Steinbach")
Date: Sun, 7 Oct 2007 02:05:15 GMT Raw View

* James Dennett:
> Alf P. Steinbach wrote:
>> From discussions in [comp.lang.c++] and [comp.lang.c++.moderated], as
>> well as articles on the net about concurrency in C++, I'm reasonably
>> sure that given
>>
>>   #include <iostream>
>>   #include <ostream>
>>
>>   struct S { S(){ throw 123; }  int foo(){ return 666; } };
>>
>>   int main()
>>   {
>>       S*  p = 0;
>>
>>       try
>>       {
>>           p = new S();
>>       }
>>       catch( ... )
>>       {}
>>
>>       if( p ) { std::cout << p->foo() << std::endl; }
>>   }
>>
>> there is no guarantee that this code will not end up in a call to
>> p->foo() with an invalid pointer p, i.e., that might well happen.
>>
>> Surely that couldn't have been the committee's intention?
>
> I wouldn't imagine so.

Thank you James, that's what I'm thinking too.


>> Why isn't assignment treated as a function call?
>
> It doesn't need to be.  The assignment cannot occur until the
> new value is known, which means that the "new" operator
> has returned its result, which means that the object has been
> constructed.  If the constructor throws, there's no value
> from "new" above, and the assignment cannot occur; p will
> remain null.

This, however, while I would like it to be true, while it is what one
intuitively expects, I can find no such guarantee in the standard.  It
seems the compiler is free to rewrite

   p = new S();

as

   p = operator new( sizeof( S ) );
   new( p ) S();

provided S doesn't define operator new (in which case that one would
have to be used for the allocation, but that's just details).

Scott Meyers and Andrei Alexandrescu have assumed[1] that the above
rewrite can only occur when the compiler can prove that S() doesn't
throw; however, they give no formal justification for this assumption.
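
For comparison, here is a sketch of both forms written out by hand.
The function names are hypothetical, and the static_cast is an
addition: operator new returns void*, so the two-statement form as
typed above would not compile for an S* without it.

```cpp
#include <cassert>
#include <new>

struct S { S() { throw 123; }  int foo() { return 666; } };

// `p = new S();` exactly as in the original example.
bool single_expression_leaves_null() {
    S* p = 0;
    try { p = new S(); } catch (...) {}
    return p == 0;
}

// The two-statement form, spelled out in source code.
bool two_statements_leave_null() {
    S* p = 0;
    bool was_null = true;
    try {
        p = static_cast<S*>(operator new(sizeof(S)));  // store happens first
        new (p) S();                                   // then construction throws
    } catch (...) {
        was_null = (p == 0);      // inspect p before the storage is released
        operator delete(p);       // placement new does not free the storage for us
    }
    return was_null;
}
```

In the hand-written form the store to p is a separate full statement,
so p is non-null when the handler runs; that much is not in dispute.
Whether the single expression is allowed to behave the same way is the
question.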


Cheers,

- Alf


Notes:
[1] <url: http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf>

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

---

Author: jdennett@acm.org (James Dennett)
Date: Sun, 7 Oct 2007 07:24:00 GMT Raw View

Alf P. Steinbach wrote:
> * James Dennett:
>> Alf P. Steinbach wrote:
>>> From discussions in [comp.lang.c++] and [comp.lang.c++.moderated], as
>>> well as articles on the net about concurrency in C++, I'm reasonably
>>> sure that given
>>>
>>>   #include <iostream>
>>>   #include <ostream>
>>>
>>>   struct S { S(){ throw 123; }  int foo(){ return 666; } };
>>>
>>>   int main()
>>>   {
>>>       S*  p = 0;
>>>
>>>       try
>>>       {
>>>           p = new S();
>>>       }
>>>       catch( ... )
>>>       {}
>>>
>>>       if( p ) { std::cout << p->foo() << std::endl; }
>>>   }
>>>
>>> there is no guarantee that this code will not end up in a call to
>>> p->foo() with an invalid pointer p, i.e., that might well happen.
>>>
>>> Surely that couldn't have been the committee's intention?
>>
>> I wouldn't imagine so.
>
> Thank you James, that's what I'm thinking too.
>
>
>>> Why isn't assignment treated as a function call?
>>
>> It doesn't need to be.  The assignment cannot occur until the
>> new value is known, which means that the "new" operator
>> has returned its result, which means that the object has been
>> constructed.  If the constructor throws, there's no value
>> from "new" above, and the assignment cannot occur; p will
>> remain null.
>
> This, however, while I would like it to be true, while it is what one
> intuitively expects, I can find no such guarantee in the standard.  It
> seems the compiler is free to rewrite
>
>   p = new S();
>
> as
>
>   p = operator new( sizeof( S ) );
>   new( p ) S();

What would grant the compiler freedom to deviate in such an
observable way from the semantics of the abstract machine (in
which, I hope it is clear, the rhs of an assignment is evaluated
before its result -- if there is one -- is assumed to be known).

> provided S doesn't define operator new (in which case that one would
> have to be used for the allocation, but that's just details).
>
> Scott Meyers and Andrei Alexandrescu have assumed[1] that the above
> rewrite can only occur when the compiler can prove that S() doesn't
> throw; however, they give no formal justification for this assumption.

I think the regular notions of expression evaluation cover
it; rewriting is allowed to some extent by the "as-if" rule,
but nothing grants license to make such observable changes
to the notion of evaluating an assignment f(a)=g(b).  The
assignment can't take place until the result of evaluating
g is known -- and if g throws, then it *has* no result.
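A minimal sketch of the uncontroversial function-call case, for
comparison.  The names g, assign_from_g and demo are hypothetical and
exist only for this illustration; the sketch shows a store that cannot
happen because the right-hand side never produces a value, and it says
nothing by itself about built-in new-expressions.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical g: throws instead of producing a value for negative input.
int g(int b)
{
    if (b < 0) throw std::runtime_error("g has no result");
    return 2 * b;
}

// Starts with x == 7, attempts x = g(b), and returns x afterwards.
int assign_from_g(int b)
{
    int x = 7;
    try
    {
        x = g(b);   // the store to x needs the value returned by g
    }
    catch (const std::runtime_error&)
    {
        // g threw, so there was never a value to store
    }
    return x;
}

void demo()
{
    assert(assign_from_g(4) == 8);    // g returned, x was assigned
    assert(assign_from_g(-1) == 7);   // g threw, x kept its old value
}
```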

-- James

---

Author: alfps@start.no ("Alf P. Steinbach")
Date: Sun, 7 Oct 2007 17:53:31 GMT

* James Dennett:
> Alf P. Steinbach wrote:
>> I can find no such guarantee in the standard.  It
>> seems the compiler is free to rewrite
>>
>>   p = new S();
>>
>> as
>>
>>   p = operator new( sizeof( S ) );
>>   new( p ) S();
>
> What would grant the compiler freedom to deviate in such an
> observable way from the semantics of the abstract machine (in
> which, I hope it is clear, the rhs of an assignment is evaluated
> before its result -- if there is one -- is assumed to be known).

That rule is not only not clear, it seems to be non-existent.

Nobody has so far been able to come up with chapter and verse.

That includes me, James Kanze, and a lot of other knowledgeable people.

Added to that, in the only Defect Report (slightly) relevant to this
issue, DR 222 [1], Andrew Koenig argued that

   "One way to deal with this issue [which is not exactly the above]
   would be to include built-in operators in the rule that puts a
   sequence point between evaluating a function's arguments and
   evaluating the function itself. However, that might be overkill:"

I.e., it might be "overkill" to make all built-in operators act formally
like function calls.

Whether Andrew thought it would also be "overkill" for just assignment
isn't clear, but since DR 222 is mainly about assignment we can conclude
that in his opinion at the time, the standard did not specify complete
evaluation of arguments before assignment effect.

Btw., I totally agree with you as to how this /should/ ideally be! :-)
I think! <g>

Cheers,

- Alf

Notes:
[1] <url: http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#222>

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

---

Author: James Kanze <james.kanze@gmail.com>
Date: Sun, 7 Oct 2007 12:57:53 CST

On Oct 7, 2:21 am, jdenn...@acm.org (James Dennett) wrote:
> Alf P. Steinbach wrote:
> > From discussions in [comp.lang.c++] and [comp.lang.c++.moderated], as
> > well as articles on the net about concurrency in C++, I'm reasonably
> > sure that given

> >   #include <iostream>
> >   #include <ostream>

> >   struct S { S(){ throw 123; }  int foo(){ return 666; } };

> >   int main()
> >   {
> >       S*  p = 0;

> >       try
> >       {
> >           p = new S();
> >       }
> >       catch( ... )
> >       {}

> >       if( p ) { std::cout << p->foo() << std::endl; }
> >   }

> > there is no guarantee that this code will not end up in a call to
> > p->foo() with an invalid pointer p, i.e., that might well happen.

> > Surely that couldn't have been the committee's intention?

> I wouldn't imagine so.

> > Why isn't assignment treated as a function call?

> It doesn't need to be.  The assignment cannot occur until the
> new value is known, which means that the "new" operator
> has returned its result, which means that the object has been
> constructed.

The construction of the object is a side effect.  The compiler
can use the new value as soon as it knows it.  All that the
standard requires is that side effects occur before the next
sequence point.

> If the constructor throws, there's no value from "new" above,
> and the assignment cannot occur; p will remain null.

How is this any different from the compiler generating the
actual assignment in ++i after it uses the value?

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


---

Author: jkherciueh@gmx.net
Date: Sun, 7 Oct 2007 18:00:01 GMT

James Dennett wrote:

> Alf P. Steinbach wrote:
>> * James Dennett:
>>> Alf P. Steinbach wrote:
[snip]
>> [...] It seems the compiler is free to rewrite
>>
>>   p = new S();
>>
>> as
>>
>>   p = operator new( sizeof( S ) );
>>   new( p ) S();
>
> What would grant the compiler freedom to deviate in such an
> observable way from the semantics of the abstract machine (in
> which, I hope it is clear, the rhs of an assignment is evaluated
> before its result -- if there is one -- is assumed to be known).

The argument runs as follows:  What we have is a composite expression

  p = new S()

and the right-hand side is a new-expression. The "assignment", i.e., the
change of the lvalue  p  is a side-effect of the assignment expression. The
construction of the object is a side-effect of the new expression. The
order of the two side-effects in the evaluation of the composite expression
is not specified because there is no sequence point that separates them.
That is what grants the compiler the freedom to rewrite the expression as

   p = operator new( sizeof( S ) );
   new( p ) S();
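To make the consequence concrete, here is a hand-written sketch of that
two-step form.  It is an illustration only: the function name
two_step_leaves_pointer_set is hypothetical, and a static_cast is added
because the rewrite as quoted would not compile as source code (operator
new returns void*).  Written out this way, the pointer is stored before
the constructor runs, so it is non-null after the exception.

```cpp
#include <cassert>
#include <new>

struct S { S() { throw 123; }  int foo() { return 666; } };

// Hand-written version of the two-step form under discussion.
// Returns true if p was left non-null after the constructor threw.
bool two_step_leaves_pointer_set()
{
    S* p = 0;
    try
    {
        p = static_cast<S*>(operator new(sizeof(S)));  // store happens first
        new (p) S();                                   // constructor throws here
    }
    catch (int)
    {
        // nothing to do: no S object was ever constructed
    }
    bool was_set = (p != 0);
    operator delete(p);   // release the raw storage; p never pointed at an S
    return was_set;
}

void demo()
{
    assert(two_step_leaves_pointer_set());
}
```

Calling p->foo() at the point where was_set is computed would be a call
on storage that never held an S, which is the outcome the original
question is worried about.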

>> provided S doesn't define operator new (in which case that one would
>> have to be used for the allocation, but that's just details).
>>
>> Scott Meyers and Andrei Alexandrescu have assumed[1] that the above
>> rewrite can only occur when the compiler can prove that S() doesn't
>> throw; however, they give no formal justification for this assumption.
>
> I think the regular notions of expression evaluation cover
> it; rewriting is allowed to some extent by the "as-if" rule,
> but nothing grants license to make such observable changes
> to the notion of evaluating an assignment f(a)=g(b).  The
> assignment can't take place until the result of evaluating
> g is known -- and if g throws, then it *has* no result.

This reasoning assumes that the result of an expression is not known (or
does not exist) until the side-effects of evaluating the expression have
taken place. I think that guarantee is not made anywhere in the standard.


Best

Kai-Uwe Bux

---

Author: AlbertoBarbati@libero.it (Alberto Ganesh Barbati)
Date: Sun, 7 Oct 2007 18:01:13 GMT

Alf P. Steinbach ha scritto:
> From discussions in [comp.lang.c++] and [comp.lang.c++.moderated], as
> well as articles on the net about concurrency in C++, I'm reasonably
> sure that given
>
>   #include <iostream>
>   #include <ostream>
>
>   struct S { S(){ throw 123; }  int foo(){ return 666; } };
>
>   int main()
>   {
>       S*  p = 0;
>
>       try
>       {
>           p = new S();
>       }
>       catch( ... )
>       {}
>
>       if( p ) { std::cout << p->foo() << std::endl; }
>   }
>
> there is no guarantee that this code will not end up in a call to
> p->foo() with an invalid pointer p, i.e., that might well happen.

The latest draft, N2369, effectively replaced the controversial
concept of "sequence points" with the new concept of "sequenced before"
(see 1.9/14 for details). Paragraph 5.17/1 has therefore been rewritten
and, if I interpret it correctly, it rules out this possibility: (with
emphasis added)

"In all cases, the assignment is *sequenced after* the value computation
of the right and left operands, and before the value computation of the
assignment expression."

As the assignment is sequenced after the computation of the right
operand, it should not occur when such computation terminates
prematurely because of an exception.
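
A small sketch of how one might check this reading on a given
implementation, reusing the struct S from the original post.  The
function names pointer_stays_null and demo are hypothetical.  A passing
check only reports what one compiler does; it does not settle what the
older sequence-point wording requires.

```cpp
#include <cassert>

struct S { S() { throw 123; }  int foo() { return 666; } };

// Returns true if p is still null after the new-expression throws.
bool pointer_stays_null()
{
    S* p = 0;
    try
    {
        p = new S();   // constructor throws before any value exists
    }
    catch (int)
    {
        // swallow the exception, as the original program does
    }
    return p == 0;
}

void demo()
{
    assert(pointer_stays_null());
}
```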

>
> Surely that couldn't have been the committee's intention?
>
> Why isn't assignment treated as a function call?
>

With the new wording of 5.17/1, I don't see any need for that.

Just my opinion,

Ganesh

---

Author: James Kanze <james.kanze@gmail.com>
Date: Sun, 7 Oct 2007 13:07:02 CST

On Oct 7, 9:24 am, jdenn...@acm.org (James Dennett) wrote:
> Alf P. Steinbach wrote:

    [...]
> > This, however, while I would like it to be true, while it is what one
> > intuitively expects, I can find no such guarantee in the standard.  It
> > seems the compiler is free to rewrite

> >   p = new S();

> > as

> >   p = operator new( sizeof( S ) );
> >   new( p ) S();

> What would grant the compiler freedom to deviate in such an
> observable way from the semantics of the abstract machine (in
> which, I hope it is clear, the rhs of an assignment is evaluated
> before its result -- if there is one -- is assumed to be known).

The lack of any sequence points, and the fact that calling the
constructor is a "side effect"; the "value" of the expression
may be known before it is called.

> > provided S doesn't define operator new (in which case that one would
> > have to be used for the allocation, but that's just details).

> > Scott Meyers and Andrei Alexandrescu have assumed[1] that the above
> > rewrite can only occur when the compiler can prove that S() doesn't
> > throw; however, they give no formal justification for this assumption.

> I think the regular notions of expression evaluation cover
> it; rewriting is allowed to some extent by the "as-if" rule,
> but nothing grants license to make such observable changes
> to the notion of evaluating an assignment f(a)=g(b).  The
> assignment can't take place until the result of evaluating
> g is known -- and if g throws, then it *has* no result.

In something like:

    j = ++ i ;

there is (intentionally) definitely no requirement that the
change in i's value (a side effect) occur before the assignment
to j.  And the C standard was carefully worded to make any code
which could detect the difference undefined behavior.  In the
case of operator new, in C++, we have a lot more complex side
effects---the constructor is called---but basically, up to a
point, the same rule applied: there was no way a conforming
program could tell, so it didn't matter.  Exceptions change
that.  Suppose, in the expression above, the modification to the
value of i *could* trigger an exception.  Would you say that we
are guaranteed that j is unmodified if this exception occurred?
I don't think so.

The real problem, of course, is that exceptions really require
specifying the order of evaluation much more strictly.  Which I
very definitely favor, but which is a road I don't think the
committee wants to take.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

---

Author: Daniel Krügler <daniel.kruegler@googlemail.com>
Date: Sun, 7 Oct 2007 13:01:29 CST

On 7 Okt., 04:05, al...@start.no ("Alf P. Steinbach") wrote:
> Scott Meyers and Andrei Alexandrescu have assumed[1] that the above
> rewrite can only occur when the compiler can prove that S() doesn't
> throw; however, they give no formal justification for this assumption.

I cannot give you a proof to your specific question, but
my reading of H.J. Boehm's paper

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2338.html

implies to me that the compiler would not be allowed to
perform a code reordering which could end in a data race.
Since I'm not an expert in that domain, I would like
to encourage others to comment on this statement of
mine.

Greetings from Bremen,

Daniel

---

Author: James Dennett <jdennett@acm.org>
Date: Sun, 7 Oct 2007 13:36:47 CST

Alf P. Steinbach wrote:
> * James Dennett:
>> Alf P. Steinbach wrote:
>>> I can find no such guarantee in the standard.  It
>>> seems the compiler is free to rewrite
>>>
>>>   p = new S();
>>>
>>> as
>>>
>>>   p = operator new( sizeof( S ) );
>>>   new( p ) S();
>>
>> What would grant the compiler freedom to deviate in such an
>> observable way from the semantics of the abstract machine (in
>> which, I hope it is clear, the rhs of an assignment is evaluated
>> before its result -- if there is one -- is assumed to be known).
>
> That rule is not only not clear, it seems to be non-existent.

It seems to follow simple logic.  The value of an expression
is determined by evaluating that expression.  In this case,
there are sequence points as part of the evaluation of the
new expression (in particular, at the start and end of the
call to operator new, and at the start and end of the call
to a constructor).

So the issue is whether the assignment side-effect can be
moved to before those sequence points.

We agree that such a move is observable.  I claim that it
violates the notion of evaluation of an expression (a notion
so fundamental that it's not specified by 14882, as the
relevant aspects of it are common to an entire field).

Abstractly the evaluation of the RHS is performed in order
to determine the value to assign to the LHS.  The assignment
clearly, abstractly, cannot be performed until the value to
assign is known; that much is clear from the definition in
the standard of what assignment does.  We have the following
actions:
* Call operator new
* Call the constructor
* Assign the result of the expression to p
Side-effects within operator new and the constructor are
constrained by sequence points to taking effect after those
(special) functions are entered and before they return.
The assignment is (in terms of the abstract machine) constrained
by the definition of assignment, which requires that the value
of the expression be determined.
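
The three actions can be traced with a rough sketch like the following.
The type T, the global trace and the function run_trace are hypothetical
names made up for this illustration.  The class-specific operator new
and operator delete only log and forward to the global ones; the final
entry records what the implementation at hand did with p, not what the
standard text guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <vector>

std::vector<std::string> trace;   // hypothetical event log

struct T
{
    T() { trace.push_back("constructor"); throw 123; }

    // Class-specific allocation function: logs, then forwards.
    static void* operator new(std::size_t n)
    {
        trace.push_back("operator new");
        return ::operator new(n);
    }

    // Called by the new-expression when the constructor throws.
    static void operator delete(void* p)
    {
        trace.push_back("operator delete");
        ::operator delete(p);
    }
};

// Evaluates p = new T() and returns the recorded sequence of events.
std::vector<std::string> run_trace()
{
    trace.clear();
    T* p = 0;
    try
    {
        p = new T();
    }
    catch (int)
    {
    }
    trace.push_back(p ? "p set" : "p null");
    return trace;
}

void demo()
{
    std::vector<std::string> t = run_trace();
    assert(t.size() == 4);
    assert(t[0] == "operator new");
    assert(t[1] == "constructor");
    assert(t[2] == "operator delete");
}
```

Note that the storage is released again when the constructor throws, so
a p assigned early would point at memory that has already been freed.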

Your example provides an existence proof that the value of the
expression cannot be determined, even in principle, without
executing the constructor.  Nothing allows the compiler to
perform an assignment of something that is *not* the value of
the expression (and in the case of your example, the expression
has no value, as its evaluation terminates with an exception).

Clearly we could consider a different example, where a pointer
or reference to p was passed as an argument to a non-throwing
constructor.  The question then is marginally less clear-cut,
as the expression does have a value which will (at some time)
be assigned to p, and we could meaningfully discuss when that
assignment can happen.

The previous example still provides a clear demonstration of
the fact that the abstract machine requires evaluation of the
RHS expression to complete before the assignment can take
place.  The only way in which execution could happen differently
would be under freedoms to rearrange actions that occur with no
intervening sequence point.  In our case, however, there are
still at least four intervening sequence points in the abstract
machine, at the entry and exit of the operator new and the ctor,
and there is no text which permits moving the store across those
sequence points.  (This is a case where it's hard to point to
chapter and verse: simply, there's no reason at all why this
would be permitted, and it violates the very basis on which
the standard is written, which is that abstract execution
semantics are provided together with specific leeway under
the "as if" rule.)

However, clearly it's worthwhile to tighten up the standardese
to put an end to these debates.

> Nobody has so far been able to come up with chapter and verse.
>
> That includes me, James Kanze, and a lot of other knowledgeable people.

So I see.

> Added to that, in the only Defect Report (slightly) relevant to this
> issue, DR 222 [1], Andrew Koenig argued that
>
>   "One way to deal with this issue [which is not exactly the above]
>   would be to include built-in operators in the rule that puts a
>   sequence point between evaluating a function's arguments and
>   evaluating the function itself. However, that might be overkill:"
>
> I.e., it might be "overkill" to make all built-in operators act formally
> like function calls.

Indeed, and the latest drafts use a different notion of
sequencing to avoid these discussions (and many more that
crop up when we start to consider multi-threading, as that
makes much more behavior potentially observable).

However, note that there is a difference also between the
"side-effects" of expression evaluation and the very function
calls which make up the expression evaluation.  i++ has the
side-effect of incrementing i; calling a function is not a
mere side-effect, as it affects the value of the expression.

In other words, evaluation of expressions in C++ has two
aspects: (1) determining the value of that expression, and
(2) side-effects (which may modify state, or otherwise have
consequences).  The side-effects can be reordered so long
as the semantics of the abstract machine (i.e., the specification
of what things mean in C++) is not violated.  Reordering an
assignment to before the value to assign is evaluated is a
violation of these semantics.

> Whether Andrew thought it would also be "overkill" for just assignment
> isn't clear, but since DR 222 is mainly about assignment we can conclude
> that in his opinion at the time, the standard did not specify complete
> evaluation of arguments before assignment effect.

I wouldn't want to put any words in Mr Koenig's mouth,
particularly not on the basis of DR222 (which covers a
fairly different subject in which there were no sequence
points).

> Btw., I totally agree with you as to how this /should/ ideally be! :-) I
> think! <g>
>
> Cheers,
>
> - Alf

Looking over a current draft would be useful then, to check
that the changes more clearly express what we want.

-- James

---

Author: jdennett@acm.org (James Dennett)
Date: Sun, 7 Oct 2007 19:04:16 GMT

James Kanze wrote:
> On Oct 7, 9:24 am, jdenn...@acm.org (James Dennett) wrote:
>> Alf P. Steinbach wrote:
>
>     [...]
>>> This, however, while I would like it to be true, while it is what one
>>> intuitively expects, I can find no such guarantee in the standard.  It
>>> seems the compiler is free to rewrite
>
>>>   p = new S();
>
>>> as
>
>>>   p = operator new( sizeof( S ) );
>>>   new( p ) S();
>
>> What would grant the compiler freedom to deviate in such an
>> observable way from the semantics of the abstract machine (in
>> which, I hope it is clear, the rhs of an assignment is evaluated
>> before its result -- if there is one -- is assumed to be known).
>
> The lack of any sequence points, and the fact that calling the
> constructor is a "side effect"; the "value" of the expression
> may be known before it is called.

Here's where I disagree.  The value of the new-expression cannot
be known without calling the constructor; it is no mere side
effect.  In the given example, the new-expression does not have
a value, as its evaluation results in an exception.

>>> provided S doesn't define operator new (in which case that one would
>>> have to be used for the allocation, but that's just details).
>
>>> Scott Meyers and Andrei Alexandrescu have assumed[1] that the above
>>> rewrite can only occur when the compiler can prove that S() doesn't
>>> throw; however, they give no formal justification for this assumption.
>
>> I think the regular notions of expression evaluation cover
>> it; rewriting is allowed to some extent by the "as-if" rule,
>> but nothing grants license to make such observable changes
>> to the notion of evaluating an assignment f(a)=g(b).  The
>> assignment can't take place until the result of evaluating
>> g is known -- and if g throws, then it *has* no result.
>
> In something like:
>
>     j = ++ i ;
>
> there is (intentionally) definitely no requirement that the
> change in i's value (a side effect) occur before the assignment
> to j.  And the C standard was carefully worded to make any code
> which could detect the difference undefined behavior.  In the
> case of operator new, in C++, we have a lot more complex side
> effects---the constructor is called---but basically, up to a
> point, the same rule applied: there was no way a conforming
> program could tell, so it didn't matter.  Exceptions change
> that.  Suppose, in the expression above, the modification to the
> value of i *could* trigger an exception.  Would you say that we
> are guaranteed that j is unmodified if this exception occurred?
> I don't think so.

If ++i is a function call (so that it can throw an exception
without undefined behavior) then yes, it is guaranteed by the
existing text of 14882:2003 combined with standard terminology
in the field.  There's not a value to assign if the RHS throws
an exception.
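
A minimal sketch of that function-call case, assuming a hypothetical
Counter type whose prefix increment refuses to go past 3 (the type, the
limit and the function j_after_increment are invented for illustration).
Because ++i is a call to Counter::operator++ here, an exception leaves
the function before any value exists to store in j.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical counter whose increment can fail.
struct Counter
{
    int value;
    explicit Counter(int v) : value(v) {}

    Counter& operator++()
    {
        if (value >= 3) throw std::overflow_error("limit reached");
        ++value;
        return *this;
    }
};

// Starts with j == -1, attempts j = (++i).value, and returns j.
int j_after_increment(int start)
{
    Counter i(start);
    int j = -1;
    try
    {
        j = (++i).value;   // ++i is a function call for this type
    }
    catch (const std::overflow_error&)
    {
        // the increment threw; nothing was assigned to j
    }
    return j;
}

void demo()
{
    assert(j_after_increment(1) == 2);    // increment succeeded
    assert(j_after_increment(3) == -1);   // increment threw, j untouched
}
```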

> The real problem, of course, is that exceptions really require
> specifying the order of evaluation much more strictly.

I don't think that's an issue, but it would be interesting to
know why you think it is.

> Which I
> very definitely favor, but which is a road I don't think the
> committee wants to take.

The changes from "sequence point" based to a more causal model
for terminology in the standard might address these concerns to
some extent.  Your opinion on the new form of wording would be
valuable.

-- James

---

Author: jdennett@acm.org (James Dennett)
Date: Sun, 7 Oct 2007 19:10:49 GMT

James Kanze wrote:
> On Oct 7, 2:21 am, jdenn...@acm.org (James Dennett) wrote:
>> Alf P. Steinbach wrote:
>>> From discussions in [comp.lang.c++] and [comp.lang.c++.moderated], as
>>> well as articles on the net about concurrency in C++, I'm reasonably
>>> sure that given
>
>>>   #include <iostream>
>>>   #include <ostream>
>
>>>   struct S { S(){ throw 123; }  int foo(){ return 666; } };
>
>>>   int main()
>>>   {
>>>       S*  p = 0;
>
>>>       try
>>>       {
>>>           p = new S();
>>>       }
>>>       catch( ... )
>>>       {}
>
>>>       if( p ) { std::cout << p->foo() << std::endl; }
>>>   }
>
>>> there is no guarantee that this code will not end up in a call to
>>> p->foo() with an invalid pointer p, i.e., that might well happen.
>
>>> Surely that couldn't have been the committee's intention?
>
>> I wouldn't imagine so.
>
>>> Why isn't assignment treated as a function call?
>
>> It doesn't need to be.  The assignment cannot occur until the
>> new value is known, which means that the "new" operator
>> has returned its result, which means that the object has been
>> constructed.
>
> The construction of the object is a side effect.

Can you justify that claim?  Alf's example illustrates that
calling the constructor is needed in order to know whether
the expression has a value.  The value can't be assigned from
if it does not exist.  The call to the constructor is *not*
a side-effect of evaluating the expression; it's an inherent
part of determining the value of that expression.

>  The compiler
> can use the new value as soon as it knows it.  All that the
> standard requires is that side effects occur before the next
> sequence point.

Right, but not relevant because what we're discussing is not
a side-effect (the only relevant side-effect is the assignment
of the result -- if there is one -- to the pointer p).

>> If the constructor throws, there's no value from "new" above,
>> and the assignment cannot occur; p will remain null.
>
> How is this any different from the compiler generating the
> actual assignment in ++i after it uses the value?

The quirk in that case is that there are additional rules
citing additional cases as undefined (if there would be
"too many" reads/writes without intervening sequence points).
There's simply no way to write code with defined behavior
that can observe the timing of the increment in ++i without
adding a sequence point such as by writing

void observe(int, int*) { ... }
.
observe(++i,&i);

and as soon as we do that, the sequence point changes things
so that the side-effect must occur before the call to the
function.  Naturally the inability to observe the result means
that the "as if" rule allows re-ordering.  That does not apply
in the original example in this thread.
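
One way to fill in the observe idea as a complete sketch.  The body of
observe, the flag increment_was_visible and the function check are
assumptions made for this illustration; the elided body above is not
known.  The function call forces the increment to be complete before
observe runs, so the value seen through the pointer matches the value
passed as the first argument.

```cpp
#include <cassert>

bool increment_was_visible = false;   // hypothetical result flag

// Hypothetical body: compares the passed value with what is stored in i.
void observe(int value, int* where)
{
    increment_was_visible = (*where == value);
}

// Returns true if the increment had taken effect when observe ran.
bool check()
{
    int i = 41;
    observe(++i, &i);   // arguments are fully evaluated before the call
    return increment_was_visible;
}

void demo()
{
    assert(check());
}
```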

-- James

---

Author: James Kanze <james.kanze@gmail.com>
Date: Mon, 8 Oct 2007 08:46:25 CST

On Oct 7, 8:01 pm, AlbertoBarb...@libero.it (Alberto Ganesh Barbati)
wrote:
> Alf P. Steinbach ha scritto:
> > From discussions in [comp.lang.c++] and
> > [comp.lang.c++.moderated], as well as articles on the net
> > about concurrency in C++, I'm reasonably sure that given

> >   #include <iostream>
> >   #include <ostream>

> >   struct S { S(){ throw 123; }  int foo(){ return 666; } };

> >   int main()
> >   {
> >       S*  p = 0;

> >       try
> >       {
> >           p = new S();
> >       }
> >       catch( ... )
> >       {}

> >       if( p ) { std::cout << p->foo() << std::endl; }
> >   }

> > there is no guarantee that this code will not end up in a call to
> > p->foo() with an invalid pointer p, i.e., that might well happen.

> The latest draft, N2369, effectively replaced the controversial
> concept of "sequence points" with the new concept of "sequenced before"
> (see 1.9/14 for details). Paragraph 5.17/1 has therefore been rewritten
> and, if I interpret it correctly, it rules out this possibility: (with
> emphasis added)

> "In all cases, the assignment is *sequenced after* the value computation
> of the right and left operands, and before the value computation of the
> assignment expression."

> As the assignment is sequenced after the computation of the right
> operand, it should not occur when such computation terminates
> prematurely because of an exception.

I'm not sure that that changes anything.  The problem is that
calling the constructor isn't part of the "value computation" of
a new expression, it is a side effect.  And unless side effects
are "sequenced before", we're back where we started.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


---

Author: James Kanze <james.kanze@gmail.com>
Date: Mon, 8 Oct 2007 08:42:44 CST

On Oct 7, 9:36 pm, James Dennett <jdenn...@acm.org> wrote:
> Alf P. Steinbach wrote:
> > * James Dennett:
> >> Alf P. Steinbach wrote:
> >>> I can find no such guarantee in the standard.  It
> >>> seems the compiler is free to rewrite

> >>>   p = new S();

> >>> as

> >>>   p = operator new( sizeof( S ) );
> >>>   new( p ) S();

> >> What would grant the compiler freedom to deviate in such an
> >> observable way from the semantics of the abstract machine (in
> >> which, I hope it is clear, the rhs of an assignment is evaluated
> >> before its result -- if there is one -- is assumed to be known).

> > That rule is not only not clear, it seems to be non-existent.

> It seems to follow simple logic.  The value of an expression
> is determined by evaluating that expression.

The same simple logic says that in "j = ++i;", the ++i must
precede the assignment.  The standard quite clearly says that
this is *not* the case; that an expression has both a value and
side effects, and that the two are more or less independent.

> In this case, there are sequence points as part of the
> evaluation of the new expression (in particular, at the start
> and end of the call to operator new, and at the start and end
> of the call to a constructor).

The call to the allocator function definitely introduces a
sequence point.  The call to the allocator function must also
precede the assignment, since the compiler can have no way of
knowing what the value of the expression is until this call has
occurred.  The "call" to the "constructor" of a built-in type
(e.g. "new int(42)") does NOT introduce a sequence point, and at
any rate, the standard does not clearly say that the call to the
constructor is part of the "value" of the expression; it would
seem logically to be a side effect, so it's sequence points
aren't relevant.

> So the issue is whether the assignment side-effect can be
> moved to before those sequence points.

> We agree that such a move is observable.  I claim that it
> violates the notion of evaluation of an expression (a notion
> so fundamental that it's not specified by 14882, as the
> relevant aspects of it are common to an entire field).

And how does this not apply to the case of "j = ++ i;"?  Whether
the modification of i occurs before or after the assignment to j
is potentially observable, e.g. through an asynchronous signal.

An expression can result in a value, and can cause side effects.
As far as I can tell, all that is required (of the abstract
machine) is that those side effects occur before the next
sequence point.  That is certainly the traditional
interpretation.

The question is, of course, whether the call to the constructor
is a side effect, or whether it is a necessary part of
evaluating the value.  The most intuitive interpretation would
be that it is a side effect.

> Abstractly the evaluation of the RHS is performed in order
> to determine the value to assign to the LHS.  The assignment
> clearly, abstractly, cannot be performed until the value to
> assign is known; that much is clear from the definition in
> the standard of what assignment does.  We have the following
> actions:
> * Call operator new
> * Call the constructor
> * Assign the result of the expression to p
> Side-effects within operator new and the constructor are
> constrained by sequence points to taking effect after those
> (special) functions are entered and before they return.
> The assignment is (in terms of the abstract machine) constrained
> by the definition of assignment, which requires that the value
> of the expression be determined.

Again, I refer to the expression "j = ++ i".  If i and j are
initially 0, the above analysis would mean that an asynchronous
signal could see: i==0 && j==0, i==1 && j==0 or i==1 && j==1,
but never i==0 && j==1.  This is contrary to the traditional
interpretation of what is allowed in C.
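
For concreteness, the i/j scenario can be set up roughly as below.  This
is only a sketch: the names sig_i, sig_j, record_state and
snapshot_after_statement are made up for illustration.  Note that
std::raise delivers the signal synchronously, after the full expression,
so the handler here can only ever observe the final state.  A genuinely
asynchronous signal arriving in the middle of "j = ++i" is what the
argument is about, and that cannot be forced portably.

```cpp
#include <cassert>
#include <csignal>

// Hypothetical stand-ins for the i and j of the discussion.
volatile std::sig_atomic_t sig_i = 0, sig_j = 0;
// What the handler saw when it ran.
volatile std::sig_atomic_t seen_i = -1, seen_j = -1;

// The handler only touches volatile sig_atomic_t objects.
extern "C" void record_state(int) {
    seen_i = sig_i;
    seen_j = sig_j;
}

// Returns true if the handler saw i==1 && j==1, the only state
// possible once the full expression has completed.
bool snapshot_after_statement() {
    sig_i = 0;
    sig_j = 0;
    std::signal(SIGINT, record_state);
    sig_j = ++sig_i;             // the expression under discussion
    std::raise(SIGINT);          // synchronous: runs after the statement
    std::signal(SIGINT, SIG_DFL);
    assert(seen_i != -1);        // the handler did run
    return seen_i == 1 && seen_j == 1;
}
```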

Of course, all of these requirements are being radically
changed, in order to take threading into account.  So the
actual guarantees will change (although hopefully with no effect
on currently legal single threaded C++ programs).

    [...]
> Indeed, and the latest drafts use a different notion of
> sequencing to avoid these discussions (and many more that
> crop up when we start to consider multi-threading, as that
> makes much more behavior potentially observable).

> However, note that there is a difference also between the
> "side-effects" of expression evaluation and the very function
> calls which make up the expression evaluation.  i++ has the
> side-effect of incrementing i; calling a function is not a
> mere side-effect, as it affects the value of the expression.

Only if the return value is used in the expression.  And a
constructor of a built-in type isn't a function call, either.

> In other words, evaluation of expressions in C++ has two
> aspects: (1) determining the value of that expression, and
> (2) side-effects (which may modify state, or have otherwise
> consequences).  The side-effects can be reordered so long
> as the semantics of the abstract machine (i.e., the specification
> of what things mean in C++) is not violated.  Reordering an
> assignment to before the value to assign is evaluated is a
> violation of these semantics.

I think the above is somewhat of a misstatement.  5/4 says
quite clearly: "Except where noted, the order of evaluation of
operands of individual operators and subexpressions of
individual expressions, AND THE ORDER IN WHICH SIDE-EFFECTS TAKE
PLACE, is unspecified."  This refers to the abstract machine:
side effects are not required to take place in the same order
the corresponding sub-expressions are evaluated, except where
noted.  What Alf and I can't find is where this is noted for the
side effect of calling a constructor (or executing the
initialization of a built-in type) in a new expression.

    [...]
> Looking over a current draft would be useful then, to check
> that the changes more clearly express what we want.

I'll do so.  It's possible that the issue has been addressed;
it's (vaguely) related to the problem of double checked locking.
(Except that if it is addressed in that context, the tendency
would be to say that it isn't guaranteed.)

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

---

Author: James Kanze <james.kanze@gmail.com>
Date: Mon, 8 Oct 2007 09:37:42 CST Raw View

On Oct 7, 9:04 pm, jdenn...@acm.org (James Dennett) wrote:
> James Kanze wrote:

    [...]
> > The lack of any sequence points, and the fact that calling the
> > constructor is a "side effect"; the "value" of the expression
> > may be known before it is called.

> Here's where I disagree.  The value of the new-expression cannot
> be known without calling the constructor; it is no mere side
> effect.

Why do you claim that?  If I write something like "new int(42)",
are you saying that the fact that 42 is written to the int is
not a side effect, but part of the "value" of the expression?

    [...]
> > In something like:

> >     j = ++ i ;

> > there is (intentionally) definitely no requirement that the
> > change in i's value (a side effect) occur before the assignment
> > to j.  And the C standard was carefully worded to make any code
> > which could detect the difference undefined behavior.  In the
> > case of operator new, in C++, we have a lot more complex side
> > effects---the constructor is called---but basically, up to a
> > point, the same rule applied: there was no way a conforming
> > program could tell, so it didn't matter.  Exceptions change
> > that.  Suppose, in the expression above, the modification to the
> > value of i *could* trigger an exception.  Would you say that we
> > are guaranteed that j is unmodified if this exception occurred?
> > I don't think so.

> If ++i is a function call (so that it can throw an exception
> without undefined behavior) then yes, it is guaranteed by the
> existing text of 14882:2003 combined with standard terminology
> in the field.  There's not a value to assign if the RHS throws
> an exception.

The problem is, of course, constructing any other case where the
effect can be seen by a conformant program, without invoking
undefined behavior.  The closest I can come is what can be seen
by an asynchronous signal which occurs in "j = ++i" (supposing
that j and i have the type sig_atomic_t volatile, to ensure
defined behavior according to the standard).  In that case, my
interpretation of the standard is that, assuming i and j
initialized to 0, it is fully conformant for the signal handler
to see i==0 and j==1.

> > The real problem, of course, is that exceptions really require
> > specifying the order of evaluation much more strictly.

> I don't think that's an issue, but it would be interesting to
> know why you think it is.

They allow interrupting an expression at a point where it is
partially executed.  To get a "snapshot", so to speak, within an
expression.

> > Which I
> > very definitely favor, but which is a road I don't think the
> > committee wants to take.

> The changes from "sequence point" based to a more causal model
> for terminology in the standard might address these concerns
> to some extent.  Your opinion on the new form of wording would
> be valuable.

I will definitely look at it, although I don't have much time at
present.  If nothing else (but I hope to be able to react
before), I'll participate in the formulation of the French
comments on the CD---and threading, and everything tied to it,
are one of my major concerns.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

---

Author: James Kanze <james.kanze@gmail.com>
Date: Mon, 8 Oct 2007 09:37:40 CST Raw View

On Oct 7, 9:10 pm, jdenn...@acm.org (James Dennett) wrote:
> James Kanze wrote:
> > On Oct 7, 2:21 am, jdenn...@acm.org (James Dennett) wrote:
> >> Alf P. Steinbach wrote:
> >>> From discussions in [comp.lang.c++] and [comp.lang.c++.moderated], as
> >>> well as articles on the net about concurrency in C++, I'm reasonably
> >>> sure that given

> >>>   #include <iostream>
> >>>   #include <ostream>

> >>>   struct S { S(){ throw 123; }  int foo(){ return 666; } };

> >>>   int main()
> >>>   {
> >>>       S*  p = 0;

> >>>       try
> >>>       {
> >>>           p = new S();
> >>>       }
> >>>       catch( ... )
> >>>       {}

> >>>       if( p ) { std::cout << p->foo() << std::endl; }
> >>>   }

> >>> there is no guarantee that this code will not end up in a call to
> >>> p->foo() with an invalid pointer p, i.e., that might well happen.

> >>> Surely that couldn't have been the committee's intention?

> >> I wouldn't imagine so.

> >>> Why isn't assignment treated as a function call?

> >> It doesn't need to be.  The assignment cannot occur until the
> >> new value is known, which means that the "new" operator
> >> has returned its result, which means that the object has been
> >> constructed.

> > The construction of the object is a side effect.

> Can you justify that claim?

What else can it be?  Unless it's trivial (which doesn't really
concern us here), it writes to memory, etc.  Those are side
effects; the "value" of an expression has no side effects.

> Alf's example illustrates that calling the constructor is
> needed in order to know whether the expression has a value.
> The value can't be assigned from if it does not exist.

The "value" of a new expression is the pointer returned from the
allocator function.  The compiler needs to know this in order to
call the constructor.

It is in every way like the expression ++i.  The modification of
i is a side effect; the value of the expression is the value
which will be written, and is available before the side effect
takes place (and must be available, for the side effect to take
place).
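
The steps being argued over can be written out by hand.  The sketch
below is not what any particular compiler emits; expanded_new,
assign_from_new, Thrower and Plain are hypothetical names.  It shows
that the address is available before the constructor runs, and that as
long as construction is kept inside the function that produces the
value, a throwing constructor leaves the target of the assignment
untouched.

```cpp
#include <cassert>
#include <new>

struct Thrower { Thrower() { throw 123; } };   // constructor always throws
struct Plain   { int v; Plain() : v(7) {} };   // constructor never throws

// Hand-written equivalent of "new T()": allocate, construct, and
// release the storage again if the constructor throws.
template <class T>
T* expanded_new() {
    void* raw = ::operator new(sizeof(T));  // step 1: the address is known here
    assert(raw != nullptr);                 // the throwing form never returns null
    try {
        return ::new (raw) T();             // step 2: run the constructor
    } catch (...) {
        ::operator delete(raw);             // undo step 1, as a new-expression does
        throw;
    }
}

// Assigns the result to p only if expanded_new returns normally.
template <class T>
T* assign_from_new() {
    T* p = nullptr;
    try {
        p = expanded_new<T>();              // step 3: the assignment
    } catch (...) {
    }
    return p;
}
```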

> The
> call to the constructor is *not* a side-effect of evaluating
> the expression; it's an inherent part of determining the value
> of that expression.

How can that be?  A constructor doesn't return a value.

    [...]
> >> If the constructor throws, there's no value from "new" above,
> >> and the assignment cannot occur; p will remain null.

> > How is this any different from the compiler generating the
> > actual assignment in ++i after it uses the value?

> The quirk in that case is that there are additional rules
> citing additional cases as undefined (if there would be
> "too many" reads/writes without intervening sequence points).
> There's simply no way to write code with defined behavior
> that can observe the timing of the increment in ++i without
> adding a sequence point such as by writing

It affects the values you might see in the handler of an
asynchronous signal.  (And I know, there are a lot of weasel
words there, limiting what you can legitimately do.)

> void observe(int, int*) { ... }
> .
> observe(++i,&i);

> and as soon as we do that, the sequence point changes things
> so that the side-effect must occur before the call to the
> function.  Naturally the inability to observe the result means
> that the "as if" rule allows re-ordering.  That does not apply
> in the original example in this thread.
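
The quoted observe() example can be filled in as follows.  The function
bodies are assumptions, since the original elides them, and
increment_visible_in_callee is a made-up wrapper.  Because all side
effects of argument evaluation are complete when the function is
entered, the callee sees the incremented value through the pointer.

```cpp
#include <cassert>

// Returns true if the increment is already visible through p
// at the point the function body starts executing.
bool observe(int value, int* p) {
    assert(p != nullptr);
    return *p == value;
}

bool increment_visible_in_callee() {
    int i = 0;
    // ++i modifies i; &i only takes its address, so the two
    // arguments do not conflict with each other.
    return observe(++i, &i);
}
```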

You don't need the "as if" rule.  The standard explicitly states
that "side effects" can take place in any order, not necessarily
the order in which the sub-expressions which cause them are
evaluated.  And that applies to the abstract machine; the "as
if" rule is not necessary.

The real question, of course, is whether calling the constructor
is a side effect.  To be frank, I don't really see how it can be
considered anything else, given the usual meaning of side
effect.  Could you elaborate on why it isn't a "side effect"?

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


---

Author: Greg Herlihy <greghe@pacbell.net>
Date: Mon, 8 Oct 2007 09:35:56 CST Raw View

On Oct 7, 11:00 am, jkherci...@gmx.net wrote:
> James Dennett wrote:
> > Alf P. Steinbach wrote:
> >> * James Dennett:
> >>> Alf P. Steinbach wrote:
> [snip]
> >> [...] It seems the compiler is free to rewrite
>
> >>   p = new S();
>
> >> as
>
> >>   p = operator new( sizeof( S ) );
> >>   new( p ) S();
>
> > What would grant the compiler freedom to deviate in such an
> > observable way from the semantics of the abstract machine (in
> > which, I hope it is clear, the rhs of an assignment is evaluated
> > before its result -- if there is one -- is assumed to be known).
>
> The argument runs as follows:  What we have is a composite expression
>
>   p = new S()
>
> and the right hand side is a new expression. The "assignment", i.e., the
> change of the lvalue  p  is a side-effect of the assignment expression. The
> construction of the object is a side-effect of the new expression. The
> order of the two side-effects in the evaluation of the composite expression
> is not specified because there is no sequence point that separates them.

The program reaches plenty of sequence points between the evaluation
of the new-expression and the completion of the assignment operation.
As the Standard itself points out, a new expression makes a function
call (actually, several of them). And a C++ program arrives at a
sequence point whenever a called  function is entered - and arrives at
another upon exit. So, based on the Standard's definition of a
sequence point:

"At certain specified points in the execution sequence called sequence
points, all side effects of previous evaluations shall be complete and
no side effects of subsequent evaluations shall have taken place."
[intro.execution/7]

we can conclude that - at the point when the new-expression makes its
function call - the assignment to p must either a) be over or b) not
yet have begun. Well, since the value assigned to "p" is dependent
on the value returned by the new-expression, the only possibility is
that the assignment to "p" must fall in the evaluations "not-yet-
started" category. Therefore we are assured that when the function
call is made, p will still have its last-assigned value (the null
pointer constant).

The C++ Standard provides an additional guarantee that no other part
of the expression (which made the function call) will be evaluated
while the function call is ongoing. So effectively, the evaluation of
the entire expression is suspended for  the duration of the function
call. Meaning that p will have its original value - not only upon
function entry and exit - but at every point in between. So the
compiler could not even begin to evaluate the assignment until after the
function call returns a pointer to an allocated and initialized S object.
But in this example, the function never does return such a pointer -
but instead throws an exception. In short, the evaluation of the
assignment expression is over before it has even begun.

> That is what grants the compiler the freedom to rewrite the expression as
>
>    p = operator new( sizeof( S ) );
>    new( p ) S();
>
> >> provided S doesn't define operator new (in which case that one would
> >> have to be used for the allocation, but that's just details).

Essentially, the argument being made is that because "p" may have an
intermediate value between sequence points - the compiler can rewrite
the assignment so that "p" has an intermediate value -at- a sequence
point. Specifically, the revised assignment operation has added a new
sequence point - a sequence point at which p has a value that it never
had at any sequence point in the original program's execution.

After all, at no sequence point in the original program did "p" ever
point to an allocated - but not-yet-initialized - S object. Instead
(assuming no exception was thrown) p went directly from pointing to
nothing at one sequence point to pointing to an allocated and
initialized S object at the very next sequence point. But by
separating the allocation of the S object from its initialization, the
revised assignment operation introduces a sequence point between the
two original sequence points. At this inserted sequence point, p can
be observed to point to an allocated - but not-yet-initialized - S
object. An observation not possible in the assignment as it was
originally written.
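
The observable difference can be made explicit by writing both forms
out.  This is a sketch only; the two function names are hypothetical,
and the null result for the original form reflects the reading argued
for here (and what implementations do in practice), not a clause quoted
from the 2003 text.

```cpp
#include <cassert>
#include <new>

struct S { S() { throw 123; }  int foo() { return 666; } };

// The hand-rewritten form: p is assigned before construction, so it
// stays non-null after the constructor throws.
bool rewritten_form_leaves_dangling_pointer() {
    S* p = nullptr;
    try {
        p = static_cast<S*>(::operator new(sizeof(S)));
        ::new (p) S();                  // throws; p already holds the address
    } catch (...) {
    }
    bool dangling = (p != nullptr);     // points at raw, unconstructed storage
    ::operator delete(p);               // release the storage we allocated
    return dangling;
}

// The original form: the assignment never happens if S() throws.
bool original_form_leaves_null() {
    S* p = nullptr;
    try {
        p = new S();
    } catch (...) {
    }
    assert(p == nullptr || p->foo() == 666);  // foo() is only safe on a real object
    return p == nullptr;
}
```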

> >> Scott Meyers and Andrei Alexandrescu have assumed[1] that the above
> >> rewrite can only occur when the compiler can prove that S() doesn't
> >> throw; however, they give no formal justification for this assumption.
>
> > I think the regular notions of expression evaluation cover
> > it; rewriting is allowed to some extent by the "as-if" rule,
> > but nothing grants license to make such observable changes
> > to the notion of evaluating an assignment f(a)=g(b).  The
> > assignment can't take place until the result of evaluating
> > g is known -- and if g throws, then it *has* no result.
>
> This reasoning assumes that the result of an expression is not known (or
> does not exist) until the side-effects of evaluating the expression have
> taken place. I think that guarantee is not made anywhere in the standard.

The "as-if" rule prevents a C++ compiler from adding sequence points
with program states that are not to be found anywhere in the original
program. Granted, the Standard could contain language expressly
forbidding C++ compilers from altering the semantics of perfectly well-
defined C++ programs, especially when those changes create
opportunities for undefined behavior that simply did not exist in the
program the way it had been written. But from my point of view, that
kind of a reassurance would somehow seem more disquieting than
helpful.

Greg

---

Author: ymett <ymett.on.usenet@gmail.com>
Date: Mon, 8 Oct 2007 09:48:57 CST Raw View

On Oct 7, 9:10 pm, jdenn...@acm.org (James Dennett) wrote:
> James Kanze wrote:
> > On Oct 7, 2:21 am, jdenn...@acm.org (James Dennett) wrote:
> >> Alf P. Steinbach wrote:
> >>> From discussions in [comp.lang.c++] and [comp.lang.c++.moderated], as
> >>> well as articles on the net about concurrency in C++, I'm reasonably
> >>> sure that given
>
> >>>   #include <iostream>
> >>>   #include <ostream>
>
> >>>   struct S { S(){ throw 123; }  int foo(){ return 666; } };
>
> >>>   int main()
> >>>   {
> >>>       S*  p = 0;
>
> >>>       try
> >>>       {
> >>>           p = new S();
> >>>       }
> >>>       catch( ... )
> >>>       {}
>
> >>>       if( p ) { std::cout << p->foo() << std::endl; }
> >>>   }
.
> >> The assignment cannot occur until the
> >> new value is known, which means that the "new" operator
> >> has returned its result, which means that the object has been
> >> constructed.
>
> > The construction of the object is a side effect.
>
.
> The call to the constructor is *not*
> a side-effect of evaluating the expression; it's an inherent
> part of determining the value of that expression.

I'll quote the draft standard here (I don't know what the current
standard says).

1.9p13
"Evaluation of an expression (or a sub-expression) in general includes
both value computations (including determining the identity of an
object for lvalue evaluation and fetching a value previously assigned
to an object for rvalue evaluation) and initiation of side effects."

5.17p1
"In all cases, the assignment is sequenced after the value computation
of the right and left operands,"

So the question is, is the creation of the object a side-effect (as
James Kanze claims) or part of the value computation (as James Dennett
claims).

5.3.4p1
"If the entity is a non-array object, the new-expression returns a
pointer to the object created."

To have a pointer to an object you must have an object. Therefore
creating the object must be part of the value computation.

As further food for thought, consider

*(new C);

Is that illegal? If the construction is only a side-effect it should
be -- you can't dereference an object which is not yet constructed.

How about

*(new int(4)) = 5;

If construction is a side-effect, this will fall foul of 1.9p16:
"If a side effect on a scalar object is unsequenced relative to ... a
different side effect on the same scalar object ... the behavior is
undefined."
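
The second expression can be tried directly.  The wrapper below is a
hypothetical sketch (construct_then_assign is a made-up name); it keeps
hold of the pointer only so that the object can be inspected and freed.
On the reading that construction is part of the value computation, the
initialization with 4 is complete before the assignment of 5.

```cpp
#include <cassert>

// Evaluates *(new int(4)) = 5 and returns the value left in the object.
int construct_then_assign() {
    // The assignment expression is an lvalue, so its address is the
    // address of the newly created int.
    int* q = &(*(new int(4)) = 5);
    assert(q != nullptr);
    int result = *q;
    delete q;
    return result;
}
```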

Yechezkel Mett

---

Author: James Kanze <james.kanze@gmail.com>
Date: Mon, 8 Oct 2007 11:31:55 CST Raw View

On Oct 7, 8:00 pm, jkherci...@gmx.net wrote:

    [...]
> The argument runs as follows:  What we have is a composite expression

>   p = new S()
>
> and the right hand side is a new expression. The "assignment", i.e., the
> change of the lvalue  p  is a side-effect of the assignment expression. The
> construction of the object is a side-effect of the new expression. The
> order of the two side-effects in the evaluation of the composite expression
> is not specified because there is no sequence point that separates them.
> That is what grants the compiler the freedom to rewrite the expression as

>    p = operator new( sizeof( S ) );
>    new( p ) S();

Exactly.  You've managed to say what I've been trying to say
more exactly, and in a lot fewer words.  The only real counter
argument I can think of is to say that the "value" of a new
expression is a pointer to a fully constructed object; that the
value doesn't exist until the constructor terminates.  IMHO,
that's really stretching it.

    [...]
> This reasoning assumes that the result of an expression is not
> known (or does not exist) until the side-effects of
> evaluating the expression have taken place. I think that
> guarantee is not made anywhere in the standard.

Historically, the interpretation has always been that this is
*not* the case, although in most of the actual cases I can think
of, the standard goes out of its way to ensure that any code
which can actually tell involves undefined behavior.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

---

Author: pete@versatilecoding.com (Pete Becker)
Date: Mon, 8 Oct 2007 16:48:25 GMT Raw View

On 2007-10-07 23:48:57 -1000, ymett <ymett.on.usenet@gmail.com> said:

>
> I'll quote the draft standard here (I don't know what the current
> standard says).
>

You have to be very careful in this area. The draft standard has major
changes in this area from what's in the current standard. Those changes
are designed to eliminate questions like the one under discussion.

--
  Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of "The
Standard C++ Library Extensions: a Tutorial and Reference"
(www.petebecker.com/tr1book)


---

Author: "Alf P. Steinbach" <alfps@start.no>
Date: Mon, 8 Oct 2007 13:26:47 CST Raw View

* Greg Herlihy:
>
> The program reaches plenty of sequence points between the evaluation
> of the new-expression and the completion of the assignment operation.
> As the Standard itself points out, a new expression makes a function
> call (actually, several of them). And a C++ program arrives at a
> sequence point whenever a called  function is entered - and arrives at
> another upon exit. So, based on the Standard's definition of a
> sequence point:
>
> "At certain specified points in the execution sequence called sequence
> points, all side effects of previous evaluations shall be complete and
> no side effects of subsequent evaluations shall have taken place."
> [intro.execution/7]
>
> we can conclude that - at the point when the new-expression makes its
> function call - the assignment to p must either a) be over or b) not
> yet have begun. Well, since the value assigned to "p" is dependent
> on the value returned by the new-expression, the only possibility is
> that the assignment to "p" must fall in the evaluations "not-yet-
> started" category. Therefore we are assured that when the function
> call is made, p will still have its last-assigned value (the null
> pointer constant).

Here the dependency "on the value returned by the new-expression" is, as
I understand it, really a dependency on the possible
not-returning-a-value by a throwing constructor, and otherwise the
dependency is only on the allocator function call (permitting the
reordering, your option (a) for that function call).

Assuming the above argument holds, then, the Meyers/Alexandrescu
assumption[1] also holds, that the reordering shown below is permitted
when the compiler can prove that the constructor doesn't throw.

   Singleton* Singleton::instance()
   {
       if (pInstance == 0)
       {
           Lock lock;
           if (pInstance == 0)
           {
               pInstance =                      // Step 3
               operator new(sizeof(Singleton)); // Step 1
               new (pInstance) Singleton;       // Step 2
           }
       }
       return pInstance;
   }

   <quote>
   there are conditions under which this transformation is legitimate.
   Perhaps the simplest such condition is when a compiler can prove that
   the Singleton constructor cannot throw (e.g., via post-inlining flow
   analysis), but that is not the only condition. Some constructors that
   throw can also have their instructions reordered such that this
   problem arises.
   </quote>

I'm now quoting in full what the article actually says because earlier
in the thread I erred by paraphrasing the above quote, saying that it
stated that the rewrite can "only" occur when the constructor is
provably non-throwing, which is less permissive than the actual text.

So that seems to leave an interesting possibility of safe double-checked
locking pattern using a constructor that can't be proven by the compiler
to not throw  --  which is easy enough to arrange, e.g. by dependency on
dynamic data, e.g. checking a global initialized from a main() argument.

   S::S()
   {
       // Some initialization here, then:
       if( strcmp( ::dynData, ::someUuid ) == 0 ){ throw "never"; }
   }

Yet, the Meyers/Alexandrescu article states categorically that

   <quote>
   DCLP will work only if steps 1 and 2  are completed before step 3 is
   performed, but there is no way to express this constraint in C or
   C++.
   </quote>

"there is no way to express this constraint"  --  i.e. not even the
S::S() constructor shown above is safe from willy-nilly assignment of
result pointer before the constructor body's execution has finished.
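
For reference, later revisions of the language (C++11 onwards) added
atomics with explicit memory ordering, which do give a way to state that
steps 1 and 2 must be visible before step 3.  A minimal sketch, assuming
a C++11 compiler; the member names mtx and value() are illustrative and
not taken from the article, and the instance is deliberately never
destroyed.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

class Singleton {
public:
    static Singleton* instance();
    int value() const { return 42; }
private:
    Singleton() = default;
    static std::atomic<Singleton*> pInstance;
    static std::mutex mtx;
};

std::atomic<Singleton*> Singleton::pInstance{nullptr};
std::mutex Singleton::mtx;

Singleton* Singleton::instance() {
    // Acquire load: a non-null result implies the constructed object
    // published by the release store below is visible to this thread.
    Singleton* p = pInstance.load(std::memory_order_acquire);
    if (p == nullptr) {
        std::lock_guard<std::mutex> lock(mtx);
        // Second check under the lock; the mutex orders this load.
        p = pInstance.load(std::memory_order_relaxed);
        if (p == nullptr) {
            p = new Singleton;                               // steps 1 and 2
            assert(p != nullptr);
            pInstance.store(p, std::memory_order_release);   // step 3
        }
    }
    return p;
}
```

Here the publication of the pointer is a separate, explicitly ordered
store on a local result, so the question of when the assignment in
"pInstance = new Singleton" takes effect does not arise.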

And the acknowledgments for the article include quite a few well-known
people as reviewers, presumably catching a fundamental mistake like
that, if it was a mistake: "Pre-publication drafts of this article were
reviewed by Doug Lea, Kevlin Henney, Doug Schmidt, Chuck Allison, Petru
Marginean, Hendrik Schober, David Brownell, Arch Robison, Bruce Leasure,
and James Kanze. Their comments, insights, and explanations greatly
improved the presentation of the paper and led us to our current
understanding of DCLP, multithreading, instruction ordering, and
compiler optimizations. After publication, we incorporated comments by
Fedor Pikus, Al Stevens, Herb Sutter, and John Hicken."

So, I'm at a loss, since I now find your argument, that the assignment
to result pointer has to happen after the constructor body's execution
if the constructor can throw, quite convincing: I started writing a
rebuttal but had to delete and write this instead. ;-)

Hm.


Cheers,

- Alf

Notes:
[1] <url: http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf>.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

---

Author: ark@acm.org ("Andrew Koenig")
Date: Mon, 8 Oct 2007 23:00:47 GMT Raw View

""Alf P. Steinbach"" <alfps@start.no> wrote in message
news:13gh5pe200j4m3c@corp.supernews.com...

> Added to that, in the only Defect Report (slightly) relevant to this
> issue, DR 222 [1], Andrew Koenig argued that
>
>   "One way to deal with this issue [which is not exactly the above]
>   would be to include built-in operators in the rule that puts a
>   sequence point between evaluating a function's arguments and
>   evaluating the function itself. However, that might be overkill:"
>
> I.e., it might be "overkill" to make all built-in operators act formally
> like function calls.
>
> Whether Andrew thought it would also be "overkill" for just assignment
> isn't clear, but since DR 222 is mainly about assignment we can conclude
> that in his opinion at the time, the standard did not specify complete
> evaluation of arguments before assignment effect.

I think I clarified that opinion in the sequel to what you quoted.  I gave
the following example:

    x[i++] = y;

and said that I didn't see any reason to require that i must be incremented
before the assignment to x[i] takes place.

For that matter, in

    x = y[i++];

I see no reason to require that i be incremented before the assignment
either.

However, in each of these examples, the side effect of incrementing i does
not affect the values that participate in the assignment.  In contrast, if
we write

    vector<int>* p = 0;
    p = new vector<int>(42);

I think it is reasonable to assume that

    new vector<int>(42)

must be evaluated before its result is assigned to p.  In other words,
allocating the memory that the vector will control is not a side effect--it
is an essential step in determining the value that will be assigned, and
that value is not meaningful until after that memory has been allocated.

Therefore, I think that the compiler should not be permitted to rewrite this
code in a way that gives a nonzero value to p if an exception is thrown.
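A minimal sketch of the behaviour argued for above, using a hypothetical type Thrower whose constructor always throws (the name and the helper function are illustrative only, not from the thread). It shows what a program can observe; whether the wording of the standard strictly guarantees it is exactly what is under discussion here.

```cpp
#include <stdexcept>

// Hypothetical type for illustration: its constructor always throws.
struct Thrower
{
    Thrower() { throw std::runtime_error("constructor failed"); }
};

// Attempts p = new Thrower() and returns p as seen after the exception.
Thrower* assign_from_throwing_new()
{
    Thrower* p = 0;
    try
    {
        p = new Thrower();   // the new-expression never yields a value
    }
    catch (const std::runtime_error&)
    {
        // The allocated storage is released before the exception gets here.
    }
    return p;                // expected to be null under the reading above
}
```

If the assignment could be hoisted above the constructor call, the returned pointer would be non-null and would point at released storage.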

---

Author: AlbertoBarbati@libero.it (Alberto Ganesh Barbati)
Date: Mon, 8 Oct 2007 23:09:32 GMT Raw View

James Kanze ha scritto:
> On Oct 7, 8:01 pm, AlbertoBarb...@libero.it (Alberto Ganesh Barbati)
> wrote:
>
>> As the assignment is sequenced after the computation of the right
>> operand, it should not occur when such computation terminates
>> prematurely because of an exception.
>
> I'm not sure that that changes anything.  The problem is that
> calling the constructor isn't part of the "value computation" of
> a new expression, it is a side effect.  And unless side effects
> are "sequenced before", we're back where we started.
>

A side effect may not be part of the "value computation", but the
"initiation" of the side-effect *is* part of it, see 1.9/13.

Moreover 1.9/16 specifies that the execution of a function is always
sequenced (either before, after or indeterminately) with other
evaluation in the calling function. This means that once the execution
of a function is initiated, it must be completed before any other
evaluation in the calling function. I interpret this as a guarantee that
the entire constructor body shall be executed before the assignment. Do
you see anything wrong with this interpretation?
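The non-interleaving property cited from 1.9/16 can be sketched as follows. The functions f and g and the global trace string are hypothetical, introduced only for illustration. The order of the two calls is unspecified, but once one of them has started, it runs to completion before the other begins.

```cpp
#include <string>

std::string trace;   // hypothetical log, used only for this illustration

int f() { trace += "[f"; trace += "]"; return 1; }
int g() { trace += "[g"; trace += "]"; return 2; }

// Evaluates f() + g(); the two calls may happen in either order,
// but their bodies are never interleaved.
int sum_calls()
{
    trace.clear();
    return f() + g();
}
```

After a call to sum_calls(), trace holds either "[f][g]" or "[g][f]", never something like "[f[g]]". Whether the same reasoning reaches as far as the assignment in p = new S() is the open question.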

Ganesh

---

Author: jdennett@acm.org (James Dennett)
Date: Tue, 9 Oct 2007 14:53:26 GMT Raw View

James Kanze wrote:
> On Oct 7, 9:04 pm, jdenn...@acm.org (James Dennett) wrote:
>> James Kanze wrote:
>
>     [...]
>>> The lack of any sequence points, and the fact that calling the
>>> constructor is a "side effect"; the "value" of the expression
>>> may be known before it is called.
>
>> Here's where I disagree.  The value of the new-expression cannot
>> be known without calling the constructor; it is no mere side
>> effect.
>
> Why do you claim that?  If I write something like "new int(42)",
> are you saying that the fact that 42 is written to the int is
> not a side effect, but part of the "value" of the expression?

No.  That's a different case from the one I was discussing.
There's no way (in single-threaded code) to know whether
the int is initialized before its address is returned.

-- James

---

Author: jdennett@acm.org (James Dennett)
Date: Tue, 9 Oct 2007 14:53:12 GMT Raw View

James Kanze wrote:
> On Oct 7, 9:36 pm, James Dennett <jdenn...@acm.org> wrote:
>> Alf P. Steinbach wrote:
>>> * James Dennett:
>>>> Alf P. Steinbach wrote:
>>>>> I can find no such guarantee in the standard.  It
>>>>> seems the compiler is free to rewrite
>
>>>>>   p = new S();
>
>>>>> as
>
>>>>>   p = operator new( sizeof( S ) );
>>>>>   new( p ) S();
>
>>>> What would grant the compiler freedom to deviate in such an
>>>> observable way from the semantics of the abstract machine (in
>>>> which, I hope it is clear, the rhs of an assignment is evaluated
>>>> before its result -- if there is one -- is assumed to be known).
>
>>> That rule is not only not clear, it seems to be non-existent.
>
>> It seems to follow simple logic.  The value of an expression
>> is determined by evaluating that expression.
>
> The same simple logic says that in "j = ++i;", the ++i must
> precede the assignment.  The standard quite clearly says that
> this is *not* the case; that an expression has both a value and
> side effects, and that the two are more or less independent.

The standard explicitly says that inspecting i to see if it
has been incremented (without an intervening sequence point)
is undefined.  It is *that* which grants it latitude to
deviate from the order which *is* specified.

>> In this case, there are sequence points as part of the
>> evaluation of the new expression (in particular, at the start
>> and end of the call to operator new, and at the start and end
>> of the call to a constructor).
>
> The call to the allocator function definitely introduces a
> sequence point.  The call to the allocator function must also
> precede the assignment, since the compiler can have no way of
> knowing what the value of the expression is until this call has
> occurred.  The "call" to the "constructor" of a built-in type
> (e.g. "new int(42)") does NOT introduce a sequence point, and at
> any rate, the standard does not clearly say that the call to the
> constructor is part of the "value" of the expression; it would
> seem logically to be a side effect, so its sequence points
> aren't relevant.

That's a separate case.  The specified semantics still require
that what is returned by new int(42) is a pointer to the newly
created object (an int with value 42), but there is no way for
conforming (single-threaded) code to see whether the assignment
of the result in p = new int(42) occurs before or after the
initialization of the int.  (That changes when C++ adds support
for multi-threading, if p is accessible from other threads.)

The example using a constructor still demonstrates that -- in
principle -- initialization is an inherent part of evaluating
a new-expression, not a mere side-effect.

>> So the issue is whether the assignment side-effect can be
>> moved to before those sequence points.
>
>> We agree that such a move is observable.  I claim that it
>> violates the notion of evaluation of an expression (a notion
>> so fundamental that it's not specified by 14882, as the
>> relevant aspects of it are common to an entire field).
>
> And how does this not apply to the case of "j = ++ i;"

Because of explicit latitude granted by the standard to
implementations in this case, making it impossible for
conforming code to tell.  Abstractly the increment happens
first; however, the standard mandates that it's undetectable
if the implementation chooses to defer the actual increment.

> Whether
> the modification of i occurs before or after the assignment to j
> is potentially observable, e.g. through an asynchronous signal.

Standard C++ doesn't support asynchronous signals.

> An expression can result in a value, and can cause side effects.
> As far as I can tell, all that is required (of the abstract
> machine) is that those side effects occur before the next
> sequence point.  That is certainly the traditional
> interpretation.

That's a C++-specific definition of "side-effect".  Which begs
the question: is construction a mere "side effect"?

> The question is, of course, whether the call to the constructor
> is a side effect, or whether it is a necessary part of
> evaluating the value.  The most intuitive interpretation would
> be that it is a side effect.

I find that interpretation deeply counterintuitive, and indeed
a violation of reason: the value does not even exist if the
constructor throws.  Knowing the value is impossible without
knowing whether the constructor throws: therefore, execution
of the constructor is essential to evaluating the expression.

>> Abstractly the evaluation of the RHS is performed in order
>> to determine the value to assign to the LHS.  The assignment
>> clearly, abstractly, cannot be performed until the value to
>> assign is known; that much is clear from the definition in
>> the standard of what assignment does.  We have the following
>> actions:
>> * Call operator new
>> * Call the constructor
>> * Assign the result of the expression to p
>> Side-effects within operator new and the constructor are
>> constrained by sequence points to taking effect after those
>> (special) functions are entered and before they return.
>> The assignment is (in terms of the abstract machine) constrained
>> by the definition of assignment, which requires that the value
>> of the expression be determined.
>
> Again, I refer to the expression "j = ++ i".  If i and j are
> initially 0, the above analysis would mean that an asynchronous
> signal could see: i==0 && j==0, i==1 && j==0 or i==1 && j==1,
> but never i==0 && j==1.  This is contrary to the traditional
> interpretation of what is allowed in C.

But not to anything written in the standard, I think.

> Of course, all of these requirements are being radically
> changed, in order to take threading into account.  So the
> actual guarantees will change (although hopefully with no effect
> on currently legal single threaded C++ programs).
>
>     [...]
>> Indeed, and the latest drafts use a different notion of
>> sequencing to avoid these discussions (and many more that
>> crop up when we start to consider multi-threading, as that
>> makes much more behavior potentially observable).
>
>> However, note that there is a difference also between the
>> "side-effects" of expression evaluation and the very function
>> calls which make up the expression evaluation.  i++ has the
>> side-effect of incrementing i; calling a function is not a
>> mere side-effect, as it affects the value of the expression.
>
> Only if the return value is used in the expression.

A special case, not affecting the abstract notion of how
expressions are evaluated.

> And a
> constructor of a built-in type isn't a function call, either.

Indeed (it's not even "really" a constructor).

>> In other words, evaluation of expressions in C++ has two
>> aspects: (1) determining the value of that expression, and
>> (2) side-effects (which may modify state, or otherwise have
>> consequences).  The side-effects can be reordered so long
>> as the semantics of the abstract machine (i.e., the specification
>> of what things mean in C++) is not violated.  Reordering an
>> assignment to before the value to assign is evaluated is a
>> violation of these semantics.
>
> I think the above is somewhat of a misstatement.

We disagree, of course.

>  §5/4 says
> quite clearly: "Except where noted, the order of evaluation of
> operands of individual operators and subexpressions of
> individual expressions, AND THE ORDER IN WHICH SIDE-EFFECTS TAKE
> PLACE, is unspecified."

Indeed, and that's the backdrop for what I've been trying to
explain.  The "Except where noted" is key: the definition of
assignment is, to me, quite explicit in noting the order of
operations.  Evidently it wasn't clear enough though.

> This refers to the abstract machine:
> side effects are not required to take place in the same order
> the corresponding sub-expressions are evaluated, except where
> noted.  What Alf and I can't find is where this is noted for the
> side effect of calling a constructor (or executing the
> initialization of a built-in type) in a new expression.

Probably an oversight -- one of many things that seemed so
obvious as to not need explicit text, but that experience
has shown to benefit from more clarity.

>     [...]
>> Looking over a current draft would be useful then, to check
>> that the changes more clearly express what we want.
>=20
> I'll do so.  It's possible that the issue has been addressed;
> it's (vaguely) related to the problem of double checked locking.
> (Except that if it is addressed in that context, the tendency
> would be to say that it isn't guaranteed.)

In the presence of threads, much more becomes observable,
and the issue is hugely more complicated.  Hopefully the work
done on the C++0x memory model will be sufficient for the MT
case, and hence clearly sufficient for the single threaded
situation.

-- James

---

Author: jdennett@acm.org (James Dennett)
Date: Tue, 9 Oct 2007 14:55:01 GMT Raw View

James Kanze wrote:
> On Oct 7, 9:10 pm, jdenn...@acm.org (James Dennett) wrote:
>> James Kanze wrote:
>>> On Oct 7, 2:21 am, jdenn...@acm.org (James Dennett) wrote:
>>>> Alf P. Steinbach wrote:
>>>>> From discussions in [comp.lang.c++] and [comp.lang.c++.moderated], as
>>>>> well as articles on the net about concurrency in C++, I'm reasonably
>>>>> sure that given
>
>>>>>   #include <iostream>
>>>>>   #include <ostream>
>
>>>>>   struct S { S(){ throw 123; }  int foo(){ return 666; } };
>
>>>>>   int main()
>>>>>   {
>>>>>       S*  p = 0;
>
>>>>>       try
>>>>>       {
>>>>>           p = new S();
>>>>>       }
>>>>>       catch( ... )
>>>>>       {}
>
>>>>>       if( p ) { std::cout << p->foo() << std::endl; }
>>>>>   }
>
>>>>> there is no guarantee that this code will not end up in a call to
>>>>> p->foo() with an invalid pointer p, i.e., that might well happen.
>
>>>>> Surely that couldn't have been the committee's intention?
>
>>>> I wouldn't imagine so.
>
>>>>> Why isn't assignment treated as a function call?
>
>>>> It doesn't need to be.  The assignment cannot occur until the
>>>> new value is known, which means that the "new" operator
>>>> has returned its result, which means that the object has been
>>>> constructed.
>
>>> The construction of the object is a side effect.
>
>> Can you justify that claim?
>
> What else can it be?

Part of the evaluation of the expression, used to determine
the result of the expression (as indeed it does).

> Unless it's trivial (which doesn't really
> concern us here), it writes to memory, etc.  Those are side
> effects; the "value" of an expression has no side effects.
>
>> Alf's example illustrates that calling the constructor is
>> needed in order to know whether the expression has a value.
>> The value can't be assigned from if it does not exist.
>
> The "value" of a new expression is the pointer returned from the
> allocator function.

No; it is the address of a newly created object, according
to the standard.  If a constructor throws, there is no such
object, and the new expression does not have a value.

> The compiler needs to know this in order to
> call the constructor.
>
> It is in every way like the expression ++i.

I've talked about the differences.  Claiming that they don't
exist doesn't make it so.

> The modification of
> i is a side effect; the value of the expression is the value
> which will be written, and is available before the side effect
> takes place (and must be available, for the side effect to take
> place).
>
>> The
>> call to the constructor is *not* a side-effect of evaluating
>> the expression; it's an inherent part of determining the value
>> of that expression.
>
> How can that be?  A constructor doesn't return a value.

It doesn't need to return a value to affect the result of
evaluating an expression.  There are plenty of ways in which
even a void-returning function can affect the value of an
expression.
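Two such ways, as a minimal sketch. The names state, set_state, fail and the two value_via_* helpers are hypothetical and exist only for this illustration. In the first, a void function changes state that the rest of the expression reads; in the second, a void function throws, so the enclosing expression never produces a value at all.

```cpp
#include <stdexcept>

int state = 0;   // hypothetical global, for illustration only

void set_state(int v) { state = v; }                 // returns nothing
void fail() { throw std::runtime_error("no value"); } // returns nothing

// The comma operator sequences the void call before state is read,
// so the value of the expression depends on what set_state did.
int value_via_state() { return (set_state(7), state); }

// The void call throws, so the expression has no value to return.
int value_via_throw() { return (fail(), 1); }
```

The second case is the closer analogue of a throwing constructor inside a new-expression: nothing is "returned" by the callee, yet it decides whether the expression has a result.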

>     [...]
>>>> If the constructor throws, there's no value from "new" above,
>>>> and the assignment cannot occur; p will remain null.
>
>>> How is this any different from the compiler generating the
>>> actual assignment in ++i after it uses the value?
>
>> The quirk in that case is that there are additional rules
>> citing additional cases as undefined (if there would be
>> "too many" reads/writes without intervening sequence points).
>> There's simply no way to write code with defined behavior
>> that can observe the timing of the increment in ++i without
>> adding a sequence point such as by writing
>
> It affects the values you might see in the handler of an
> asynchronous signal.  (And I know, there are a lot of weasel
> words there, limiting what you can legitimately do.)
>
>> void observe(int, int*) { ... }
>> .
>> observe(++i,&i);
>
>> and as soon as we do that, the sequence point changes things
>> so that the side-effect must occur before the call to the
>> function.  Naturally the inability to observe the result means
>> that the "as if" rule allows re-ordering.  That does not apply
>> in the original example in this thread.
>
> You don't need the "as if" rule.  The standard explicitly states
> that "side effects" can take place in any order, not necessarily
> the order in which the sub-expressions which cause them are
> evaluated.  And that applies to the abstract machine; the "as
> if" rule is not necessary.

Such an extended interpretation of the freedom to rearrange
code would be most problematic; I can see why people are concerned,
if they think implementors would really do such things.

> The real question, of course, is whether calling the constructor
> is a side effect.  To be frank, I don't really see how it can be
> considered anything else, given the usual meaning of side
> effect.  Could you elaborate why it isn't a "side effect"?

I believe I've tried to do so: it is *impossible* to determine
the result of a new expression while ignorant of the body of
a constructor which is used by that new expression.

-- James

---

Author: Yechezkel Mett <ymett.on.usenet@gmail.com>
Date: Tue, 9 Oct 2007 09:57:28 CST Raw View

On Oct 9, 1:09 am, AlbertoBarb...@libero.it (Alberto Ganesh Barbati)
wrote:
> James Kanze ha scritto:
>
> > On Oct 7, 8:01 pm, AlbertoBarb...@libero.it (Alberto Ganesh Barbati)
> > wrote:
>
> >> As the assignment is sequenced after the computation of the right
> >> operand, it should not occur when such computation terminates
> >> prematurely because of an exception.
>
> > I'm not sure that that changes anything.  The problem is that
> > calling the constructor isn't part of the "value computation" of
> > a new expression, it is a side effect.  And unless side effects
> > are "sequenced before", we're back where we started.
>
> A side effect may not be part of the "value computation", but the
> "initiation" of the side-effect *is* part of it, see 1.9/13.

1.9/13 says

"Evaluation of an expression (or a sub-expression) in general includes
both value computations (including determining the identity of an
object for lvalue evaluation and fetching a value previously assigned
to an object for rvalue evaluation) and initiation of side effects."

which quite clearly says that initiation of side effects is _not_ part
of the value computation (although it is part of the evaluation).
Also, we could have an interesting discussion on the meaning of
"initiate" -- I think it could mean "to cause to start", as opposed to
"to start".

Yechezkel Mett

---

Author: James Kanze <james.kanze@gmail.com>
Date: Tue, 9 Oct 2007 09:56:50 CST Raw View

On Oct 9, 1:00 am, a...@acm.org ("Andrew Koenig") wrote:
> "Alf P. Steinbach" <al...@start.no> wrote in
> message news:13gh5pe200j4m3c@corp.supernews.com...

> > Added to that, in the only Defect Report (slightly) relevant to this
> > issue, DR 222 [1], Andrew Koenig argued that

> >   "One way to deal with this issue [which is not exactly the above]
> >   would be to include built-in operators in the rule that puts a
> >   sequence point between evaluating a function's arguments and
> >   evaluating the function itself. However, that might be overkill:"

> > I.e., it might be "overkill" to make all built-in operators act formally
> > like function calls.

> > Whether Andrew thought it would also be "overkill" for just
> > assignment isn't clear, but since DR 222 is mainly about
> > assignment we can conclude that in his opinion at the time,
> > the standard did not specify complete evaluation of
> > arguments before assignment effect.

> I think I clarified that opinion in the sequel to what you
> quoted.  I gave the following example:

>     x[i++] = y;

> and said that I didn't see any reason to require that i must
> be incremented before the assignment to x[i] takes place.

If x, y and i are basic types, the "as if" rule will allow any
ordering the compiler feels like.  If they aren't, the operators
are overloaded, and the sequence points of the function calls
ensure that the incrementation actually does take place before
the assignment.

In more complicated cases, e.g. something like i++ + j++, it
does make a difference if i and j have user defined types, and
the ++ operator can raise an exception.  If j++ raises an
exception, has i been incremented, or not?  The possibility of
exceptions changes the situation dramatically with regard to C.
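A minimal sketch of that situation, assuming a hypothetical Counter type whose post-increment can be told to throw (neither the type nor the helper function comes from the thread). Because the operands of + may be evaluated in either order, the state of i after j++ throws depends on the implementation.

```cpp
#include <stdexcept>

// Hypothetical counter whose post-increment can be told to throw.
struct Counter
{
    int  value;
    bool throws;

    Counter(int v, bool t) : value(v), throws(t) {}

    Counter operator++(int)
    {
        if (throws) throw std::runtime_error("increment failed");
        Counter old(*this);
        ++value;
        return old;
    }
};

int operator+(const Counter& a, const Counter& b) { return a.value + b.value; }

// Returns i.value after i++ + j++ has been abandoned because j++ threw.
int i_after_failed_sum()
{
    Counter i(0, false);
    Counter j(0, true);
    try
    {
        int sum = i++ + j++;   // operand evaluation order is unspecified
        (void)sum;
    }
    catch (const std::runtime_error&)
    {
    }
    return i.value;            // 0 if j++ ran first, 1 if i++ ran first
}
```

Both results are conforming, which is the point: the program cannot know which state it is left in once the exception propagates.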

> For that matter, in

>     x = y[i++];

> I see no reason to require that i be incremented before the assignment
> either.

Same reasoning as above.

> However, in each of these examples, the side effect of
> incrementing i does not affect the values that participate in
> the assignment.  In contrast, if we write

>     vector<int>* p = 0;
>     p = new vector<int>(42);

> I think it is reasonable to assume that

>     new vector<int>(42)

> must be evaluated before its result is assigned to p.

I think it reasonable, too.  I don't see any wording in the
standard which indisputably says this, however.  Calling the
constructor is a side effect of the expression, and the standard
says that side effects may be reordered.

> In other words, allocating the memory that the vector will
> control is not a side effect--it is an essential step in
> determining the value that will be assigned, and that value is
> not meaningful until after that memory has been allocated.

There is a cause and effect involved.  Both allocating the
memory and calling the constructor are side effects.  Until the
memory has been allocated, however, it is impossible to
determine the value of the expression, so simple causation
ensures that this side effect must take place before any
operations which use the value.  Calling the constructor,
however, is not necessary to determine the value, so the
standard (as currently worded) places no restraints other than
the next sequence point on when it occurs.

> Therefore, I think that the compiler should not be permitted
> to rewrite this code in a way that gives a nonzero value to p
> if an exception is thrown.

I certainly agree.  I think words should be added to the
standard to require this.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

---

Author: James Kanze <james.kanze@gmail.com>
Date: Tue, 9 Oct 2007 11:58:48 CST Raw View

On Oct 9, 1:09 am, AlbertoBarb...@libero.it (Alberto Ganesh Barbati)
wrote:
> James Kanze ha scritto:

> > On Oct 7, 8:01 pm, AlbertoBarb...@libero.it (Alberto Ganesh Barbati)
> > wrote:

> >> As the assignment is sequenced after the computation of the right
> >> operand, it should not occur when such computation terminates
> >> prematurely because of an exception.

> > I'm not sure that that changes anything.  The problem is that
> > calling the constructor isn't part of the "value computation" of
> > a new expression, it is a side effect.  And unless side effects
> > are "sequenced before", we're back where we started.

> A side effect may not be part of the "value computation", but the
> "initiation" of the side-effect *is* part of it, see 1.9/13.

> Moreover 1.9/16 specifies that the execution of a function is
> always sequenced (either before, after or indeterminately)
> with other evaluation in the calling function. This means,
> that once the execution of a function is initiated, it must be
> completed before any other evaluation in the calling function.
> I interpret this as a guarantee that the entire constructor
> body shall be executed before the assignment. Do you see
> anything wrong with this interpretation?

Just that it seems to be reading more into the statement than
was intended.

What I'd really like to see is a full ordering of expressions,
with any reordering (like that Andy wants to allow) a
consequence of the "as if" rule.  This means that if any part of
the expression raises an exception, you have fully defined and
specified state.  But that seems more than what the committee is
willing to guarantee.  Failing that, some explicit statement to
the effect that all of the side effects in a new expression must
occur before its value is used would make things a lot clearer.
Because of course, you don't need exceptions to cause problems
otherwise:

    typedef int (*PFI)() ;

    int f() { return 42; }

    int i = (*new PFI( f ))() ;

for example.  (Note that there is no "constructor call" to
possibly introduce sequence points.  Not that that's relevant,
since the sequence points introduced by the constructor call
only introduce a partial ordering, which doesn't affect the
assignment.)

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

---

Author: hyrosen@mail.com (Hyman Rosen)
Date: Tue, 9 Oct 2007 17:49:54 GMT Raw View

James Kanze wrote:
> What I'd really like to see is a full ordering of expressions

It has been said that the definition of insanity is to keep
repeating the same actions while hoping for different outcomes.

Experience has proven that allowing limited flexibility in order
of evaluation is bad. It is bad for programmers, because they may
rely on a particular order without knowing that they are doing so.
It is bad for compiler writers, because they may be forced into
backward compatibility traps with no way of knowing which old
accidental behavior must be preserved. It is bad for standards
authors, because they have a mental model of what they are trying
to achieve but always fail to properly express it in writing,
allowing ever more loopholes to exist and corner cases to fail
(as we see from the current discussion).

It should be abundantly clear by now that order of evaluation,
including when side effects happen, must be specified exactly.

---

Author: jpalecek@web.de (Jiri Palecek)
Date: Tue, 9 Oct 2007 19:41:50 GMT Raw View

Alf P. Steinbach wrote:

> * Greg Herlihy:
>>
>> The program reaches plenty of sequence points between the evaluation
>> of the new-expression and the completion of the assignment operation.
>> As the Standard itself points out, a new expression makes a function
>> call (actually, several of them). And a C++ program arrives at a
>> sequence point whenever a called  function is entered - and arrives at
>> another upon exit. So, based on the Standard's definition of a
>> sequence point:
>>
>> "At certain specified points in the execution sequence called sequence
>> points, all side effects of previous evaluations shall be complete and
>> no side effects of subsequent evaluations shall have taken place."
>> [intro.execution/7]
>>
>> we can conclude that - at the point when the new-expression makes its
>> function call - the assignment to p must a) either be over or b) must
>> have not yet begun. Well, since the value assigned to "p" is dependent
>> on the value returned by the new-expression, the only possibility is
>> that the assignment to "p" must fall in the evaluations "not-yet-
>> started" category. Therefore we are assured that when the function
>> call is made, p will still have its last-assigned value (the null
>> pointer constant).
>
> Here the dependency "on the value returned by the new-expression" is, as
> I understand it, really a dependency on the possible
> not-returning-a-value by a throwing constructor, and otherwise the
> dependency is only on the allocator function call (permitting the
> reordering, your option (a) for that function call).
>
> Assuming the above argument holds, then, the Meyers/Alexandrescu
> assumption[1] also holds, that the reordering shown below is permitted
> when the compiler can prove that the constructor doesn't throw.
>
>    Singleton* Singleton::instance()
>    {
>        if (pInstance == 0)
>        {
>            Lock lock;
>            if (pInstance == 0)
>            {
>                pInstance =                      // Step 3
>                operator new(sizeof(Singleton)); // Step 1
>                new (pInstance) Singleton;       // Step 2
>            }
>        }
>        return pInstance;
>    }
>
>    <quote>
>    there are conditions under which this transformation is legitimate.
>    Perhaps the simplest such condition is when a compiler can prove that
>    the Singleton constructor cannot throw (e.g., via post-inlining flow
>    analysis), but that is not the only condition. Some constructors that
>    throw can also have their instructions reordered such that this
>    problem arises.
>    </quote>
>
> I'm now quoting in full what the article actually says because earlier
> in the thread I erred by paraphrasing the above quote, saying that it
> stated that the rewrite can "only" occur when the constructor is
> provably non-throwing, which is less permissive than the actual text.
>
> So that seems to leave an interesting possibility of safe double-checked
> locking pattern using a constructor that can't be proven by the compiler
> to not throw  --  which is easy enough to arrange, e.g. by dependency on
> dynamic data, e.g. checking a global initialized from a main() argument.
>
>    S::S()
>    {
>        // Some initialization here, then:
>        if( strcmp( ::dynData, ::someUuid ) == 0 ){ throw "never"; }
>    }
>
> Yet, the Meyers/Alexandrescu article states categorically that
>
>    <quote>
>    DCLP will work only if steps 1 and 2  are completed before step 3 is
>    performed, but there is no way to express this constraint in C or
>    C++.
>    </quote>

I think one problem with your solution using a constructor that might
throw is that a compiler could still rewrite it like this:

   Singleton* Singleton::instance()
   {
       if (pInstance == 0)
       {
           Lock lock;
           if (pInstance == 0)
           {
             // we know pInstance is 0 here
               pInstance =                      // Step 3
               operator new(sizeof(Singleton)); // Step 1
             try {
               new (pInstance) Singleton;       // Step 2
             } catch(...) {
               operator delete(pInstance);
               pInstance=0;
               throw;
             }
           }
       }
       return pInstance;
   }

This transformation is legal (at least I think so) as long as neither
Singleton::Singleton(), even when it exits via an exception, nor the
allocation and deallocation functions call Singleton::instance(). You
could only see the pointer to the not-yet-constructed object from an
async signal handler (but that's OK, because the pointer is not a
sig_atomic_t, so it can hold anything as far as a signal handler is
concerned) or from a different thread, and the standard doesn't say
anything about threads.

Even if you somehow forced your compiler not to rewrite the code in any
malicious way, you'd still need a memory barrier after the constructor
but before the assignment takes place. Otherwise other threads might see
the pointer to the object while the stores that initialised it are still
lagging somewhere in the hardware. You need something like

Singleton* Singleton::instance() {
  if (pInstance == 0) {          // 1st test
    Lock lock;
    if (pInstance == 0) {        // 2nd test
      Singleton* tmp=new Singleton;
      /* memory barrier (GCC inline asm syntax, x86 only) */
      __asm__ __volatile__ ("mfence" ::: "memory");
      pInstance = tmp;
    }
  }
  return pInstance;
}

This way pInstance, if not null, always points to a fully constructed
object (and because of the barrier, everybody on the bus sees it), is never
assigned twice (because of the lock), and unless construction fails, every
call to instance() returns the one and only valid object. However, this
relies on nonstandard features, and since the standard doesn't say
anything about threads, we're stuck here.
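
For comparison, here is a rough sketch of how the same ordering could be
written with the atomics that were standardised later, in C++11 (so this
postdates the thread and is not something the standard discussed here
offers). The class body, the value() accessor and the mutex_ member are
made up for illustration; only instance() and pInstance come from the code
above.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the Singleton class discussed in the thread.
class Singleton {
public:
    static Singleton* instance();
    int value() const { return value_; }   // illustrative accessor

private:
    Singleton() : value_(42) {}
    int value_;
    static std::atomic<Singleton*> pInstance;
    static std::mutex mutex_;               // plays the role of "Lock lock;"
};

std::atomic<Singleton*> Singleton::pInstance(nullptr);
std::mutex Singleton::mutex_;

Singleton* Singleton::instance() {
    // 1st test: the acquire load pairs with the release store below, so a
    // non-null result implies the constructor's writes are visible here.
    Singleton* tmp = pInstance.load(std::memory_order_acquire);
    if (tmp == nullptr) {
        std::lock_guard<std::mutex> lock(mutex_);
        // 2nd test: done under the lock, so a relaxed load is sufficient.
        tmp = pInstance.load(std::memory_order_relaxed);
        if (tmp == nullptr) {
            tmp = new Singleton;                              // steps 1 and 2
            pInstance.store(tmp, std::memory_order_release);  // step 3
        }
    }
    return tmp;
}
```

The release store takes the place of the explicit mfence: it may not be
reordered before the construction, either by the compiler or by the
hardware. The object is deliberately never destroyed, as in the original.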

> "there is no way to express this constraint"  --  i.e. not even the
> S::S() constructor shown above is safe from willy-nilly assignment of
> result pointer before the constructor body's execution has finished.
>
[snip]
>
> So, I'm at a loss, since I now find your argument, that the assignment
> to result pointer has to happen after the constructor body's execution
> if the constructor can throw, quite convincing: I started writing a
> rebuttal but had to delete and write this instead. ;-)

I don't think that's true. The compiler only has to ensure that execution
proceeds as if the assignment happened after the construction. This means
you could not see the unconstructed object in your original example (the
one where you called p->foo()), partly because the result of the
new-expression is a pointer to a constructed object: if construction
fails (the constructor throws), the storage is deallocated, so there is no
pointer to it that could be assigned to p. However, if the compiler
reorders the assignment in a way you cannot observe from inside the
program, you cannot prevent that.
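
To make the observable part concrete, here is a small single-threaded
sketch. The type S, its counter "live" and the helper function are
hypothetical names invented for this example. It only shows what the
program itself can observe after a throwing constructor; it says nothing
about what another thread or the generated code might do.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical type whose constructor always throws.
struct S {
    S() { throw 1; }

    // Class-specific allocation functions that count live allocations.
    static int live;
    static void* operator new(std::size_t n) {
        ++live;
        return ::operator new(n);
    }
    static void operator delete(void* q) {
        --live;
        ::operator delete(q);
    }
};
int S::live = 0;

// Returns true if, after the failed new-expression, p still holds its
// last-assigned value and the storage has been deallocated again.
bool pointer_unchanged_after_throw() {
    S* p = 0;
    try {
        p = new S();   // constructor throws: no value to assign to p
    } catch (int) {
        // The matching S::operator delete has already been called here.
    }
    return p == 0 && S::live == 0;
}
```

Whatever order the generated code uses internally, a conforming
implementation has to make pointer_unchanged_after_throw() return true.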

  Jiri Palecek
