Topic: Sequencing dynamic initialization safely


Author: francis@robinton.demon.co.uk (Francis Glassborow)
Date: Wed, 17 May 2006 18:51:51 GMT
In article <1147779416.822263.241570@y43g2000cwc.googlegroups.com>,
Manfred von Willich <manfred@techniroot.co.za> writes
>Francis Glassborow wrote:
>> The need for a new keyword is not a killer but because the cost is high
>> the benefits need to be as well.
>
>I'd appreciate more clarity on what you are referring to as the cost.
> I presume you are referring to the cost of modifying compilers (and
>linkers).  There is no execution overhead in what I have in mind.


No, the cost of modifying compilers etc. is usually minuscule (and
probably much less than the cost of modifying compilers to handle yet
another bizarre overload of 'static'). The cost is in all the existing
code that gets broken by the new keyword. There is no such thing as a
pure extension (i.e. one that has no impact on existing code) if it uses
a new keyword.

--
Francis Glassborow      ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]





Author: "Manfred von Willich" <manfred@techniroot.co.za>
Date: Thu, 18 May 2006 00:54:10 CST
Ganesh, I missed your post when I answered Francis, for which I
apologise.

> Anyway, remember that you must not convince only the programmers but
> compiler vendors also! They are the ones that would need to implement
> your proposed language change. In their respect also, the cost of
> implementing the feature needs to be worth the gain.

I take this to be what Francis was referring to when he mentioned cost.
And given that many programmers are probably not aware of the problem,
compiler vendors are not necessarily going to feel compelled to go to
the effort.

> For example, if you add a keyword "safe"
> then this valid C++ program will stop compiling, regardless of the use
> of "safe" that you have in mind

Yes, this collision of previously valid identifiers with new keywords
for some reason escaped me, about which I am duly embarrassed.  You
would have to use "ugly" keywords to be safe (not that I have a problem
with this), or do something like follow Tomás's suggestion (also a
reasonable way forward in my opinion, though I know it will be shouted
down).

> but... as you can see from the other (few)
> posts, this proposal hasn't stirred very much interest

Yes, I'm afraid this is a telling point (despite my leaking roof
metaphor).  If I can't gain support in what has been looked at many
times, then I am bound to get nowhere.

I'd say that I must concede that it is too late for me to introduce any
repair to an existing problem - I guess I was all fired up, and was not
aware of all the discussions you have been part of.  If I may be so
bold as to say so, the dynamic initialization mechanism was introduced
over-hastily, locking in a problem, but that is history now.  It is
evident that with the classes of C++, such a mechanism is essential,
but when it was introduced I imagine the designers thought that either
(a) they could fix it later, or (b) the programmer could manage it,
neither of which is really true.

I can mention a number of other pet dislikes in C++ that follow the
same mould that will be unfixable for the same reason.  For example,
the so-called safe implicit upcast is not so because of the interaction
with pointer arithmetic and non-virtual destructors.  An opportunity
for not introducing several (retrospectively obvious) problems in the
transition from C to C++ was messed up.  This is the point where the
romantic but pragmatic language purists must be taken a little more
seriously.  I think that the overall result is that C++ may stagnate.

So, my trying to fix this (or any of the similar issues) in the
standard is not going anywhere.  Given the ground that I have conceded
(e.g. that a programmer must elect to use the mechanism), and the
difficulty with keywords, I may take this in a different direction -
working within the existing language, possibly using inheritance or
templates (though not macros - I abhor the preprocessor).  This is
going to be decidedly tricky due to the rules around implicit
conversions and templates.

In conclusion, I'm not going to push this thread any more, though I
will always welcome discussion.  Thanks to all for the feedback - I
have certainly learned a lot about how things work and how they stand.

Manfred







Author: Francis Glassborow <francis@robinton.demon.co.uk>
Date: Thu, 18 May 2006 11:29:49 CST
In article <1147847370.409244.192080@i40g2000cwc.googlegroups.com>,
Manfred von Willich <manfred@techniroot.co.za> writes
>I'd say that I must concede that it is too late for me to introduce any
>repair to an existing problem - I guess I was all fired up, and was not
>aware of all the discussions you have been part of.  If I may be so
>bold as to say so, the dynamic initialization mechanism was introduced
>over-hastily, locking in a problem, but that is history now.  It is
>evident that with the classes of C++, such a mechanism is essential,
>but when it was introduced I imagine the designers thought that either
>(a) they could fix it later, or (b) the programmer could manage it,
>neither of which is really true.

But two things being actively worked on for C++0x do
work towards fixing or at least mitigating the problem.

The work on generalising const expressions will result in far more
opportunities for compile time initialisation. Acceptance of the
proposal for modules will go a long way in reducing the order of dynamic
initialisation problems.
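
As a rough illustration of the first point, assuming the form that generalised constant expressions eventually took in C++11 (constexpr): an object whose initializer is a constant expression is initialized before any dynamic initialization runs, so no other translation unit can observe it half-built. The type and variable names below are hypothetical.

```cpp
#include <cassert>

namespace constexpr_sketch {   // hypothetical names, for illustration only

struct Limits {
    int low;
    int high;
    constexpr Limits(int l, int h) : low(l), high(h) {}
    constexpr int span() const { return high - low; }
};

// Constant-initialized: no start-up code runs for these objects, so the
// order in which translation units are initialized cannot affect them.
constexpr Limits kLimits(10, 50);
constexpr int kSpan = kLimits.span();

// Checked by the compiler, which shows the value exists at compile time.
static_assert(kSpan == 40, "span is computed at compile time");

}  // namespace constexpr_sketch
```

This only helps where the initializer can be a constant expression; something like a call to time() still needs dynamic initialization.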


--
Francis Glassborow      ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects






Author: AlbertoBarbati@libero.it (Alberto Ganesh Barbati)
Date: Thu, 11 May 2006 22:42:25 GMT
Manfred von Willich ha scritto:
> Ganesh wrote:
>> About as much complexity as the "const" qualifier??? That's VERY much,
>> then. Think about const-correctness... it's a revolution that hasn't
>> been fully digested yet by a lot of programmers. I was expecting
>> something really simpler...
> Well, it gives some indication.  On the surface, const-qualification is
> simple for the programmer (... except that certain counter-intuitive
> examples keep popping up), and moderately simple for the compiler (...
> except that the rules turned out to be more complex than most compiler
> writers initially suspected).  My suggestion is also simple on the
> surface, and I hope the subterranean complexity does not grow as much.

I have read your proposal and, although I have to admit that not
everything is clear to me, it looks very complex to implement. Of
course, as I am not a compiler developer I can be wrong, but that's the
feeling I get.

> I am going to start back-to-front, by giving two different sets of
> defaults (to provide a context for thinking about how it will seem to
> the programmer).  Naturally, standardization will have to choose only
> one default (i.e. what happens when no keywords are applied).
>
> Default behaviour A:  All dynamic initialization is ordered safely, at
> the cost of some overhead.  The programmer does not have to worry about
> it until optimisation is needed.  Consequence: most programmers will
> remain unaware of the possibility of getting rid of most of this
> overhead, but no new warnings will appear (convenient).

Unacceptable. It violates the C++ golden rule "don't pay for what you
don't use". Moreover, existing valid code recompiled with a new compiler
might suffer a decrease in performance and/or an increase in code bloat.
That's not inherently a bad thing, but a lot of (if not most) programmers
will just disable the feature in fear of the increased overhead, just as
they still do for EH and RTTI.

> Default behaviour B: All dynamic initialization is done without added
> overhead.  When the compiler cannot determine whether use of an
> object is safe, it generates a warning/error.  The programmer can then
> go about inserting qualifying keywords to indicate what initialization
> is safe, and when to add overhead to make it safe.  When a warning-free
> state is reached, the program is safe.  If overhead had to be added,
> the program could originally have generated undefined behaviour,
> depending on the order of initialization of the translation units
> determined by the linker.  Consequence: programmers will become savvy
> to the issue very quickly, coached by the warnings.

Issuing errors is not acceptable, because it might break existing
code. You might argue that the code would have been buggy anyway, but
that depends on the accuracy of the detection algorithm. If there's even
one case that proves a false positive, then we can't follow this path.

Issuing warnings might be a better solution, although I'm not very fond
of it.

> The approach is as follows:
>
> <approach snipped>
>

I couldn't follow all of it. What bothers me most is that this kind of
approach will tend to "creep" into everything. You say "A safe function
is disallowed from accessing any unsafe variable or function." What
about calling through a function pointer or a pointer to member function
or, even worse, through a tr1::function object? I'm not sure the
compiler would be able to track all this unless every piece of code you
write is made "safe-aware". Again, I could be wrong...

> Multi-threading requires a mutex in the initialization routine (this is
> easily proved - there has to be a mechanism to halt all but one thread
> until initialization is complete).  This in turn gives rise to the
> issue of deadlocking, which I haven't thought through fully (though I
> think it is only possible when multi-threaded applications have cyclic
> guarded initialization, which will generate pathological behaviour on
> every run (detection, stack overflow or deadlock)).  Note that the mutex
> should only be used AFTER the flag is tested, which means that the
> execution overhead is only a test-and-jump once the variable has been
> fully initialized.

Ahhh! Multithreading. This issue is completely different. I'm sorry but
I don't want to enter this discussion. I did it in the past, you may
google for that if you want. Fact is that you may search the C++
standard for the word "thread" and it occurs only once in 15.1/2, a
context that is unrelated to parallel processing. The standard, as it is
now, does not mention multithreaded applications. Until it does, it
doesn't make any sense to speak about initialization of static variables
in such an environment.

> A side note: How many people are aware that local static variables are
> typically implemented as what I call variables with guarded dynamic
> initialization?

The only thing that counts is that, with a few exceptions, static local
variables are initialized "the first time control passes through its
declaration". How that is accomplished is not important. However, I
remember I read about guarded initialization in at least two books.
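
A minimal sketch of that observable behaviour, with hypothetical names: the constructor of the local static runs on the first call that reaches the declaration and never again.

```cpp
#include <cassert>

namespace first_use_sketch {   // hypothetical names, for illustration only

int constructions = 0;   // constant-initialized counter

struct Tracer {
    int id;
    Tracer() : id(++constructions) {}   // records each construction
};

Tracer& instance() {
    // Constructed the first time control passes through this declaration;
    // later calls skip the initialization and return the same object.
    static Tracer t;
    return t;
}

}  // namespace first_use_sketch
```

Calling instance() twice returns the same object, and constructions stays at 1 after the first call.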

> Except that I think that a mutex is typically not
> used - certainly not by VC6.0.  In this compiler, cyclic dependencies
> lead to undefined behaviour, but no stack overflow (i.e. it is not
> trivially diagnosed), and they are NOT thread safe.  Unfortunate timing
> of thread switching can lead to double initialization and to use of the
> pre-initialized state.  GCC fans: you might want to check whether GCC
> uses a mutex.  If not, local static variables are either unsafe for
> use in dynamic initialization or multi-threaded contexts (which depends
> upon the "when" of initialization selected by the compiler).

A mutex is not used because the standard doesn't mandate it and that's
again because the C++ execution model does not consider multithreading.
Yes, initialization of local static variables is not thread-safe. The
issue has been brought up in this newsgroup and in
comp.lang.c++.moderated quite a few times. You may want to google the
archives.
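
For readers on later compilers: C++11 added a threading model, and from that revision on the initialization of a local static is required to be thread-safe. The "test the flag first, take the mutex only if it is unset" scheme from the quoted text can be sketched by hand with the C++11 library as below. This is only an illustration under that assumption; the names are hypothetical and it is not what any particular compiler emits.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

namespace guarded_sketch {   // hypothetical names, for illustration only

std::atomic<bool> ready(false);  // constant-initialized flag
std::mutex init_mutex;           // constexpr constructor, so also constant-initialized
int slot = 0;                    // storage for the guarded value
int init_runs = 0;               // how many times the initializer actually ran

int expensive_initial_value() {
    ++init_runs;                 // only ever called with init_mutex held
    return 42;
}

int& value() {
    // Fast path: once initialized, the cost is one load and one branch.
    if (!ready.load(std::memory_order_acquire)) {
        std::lock_guard<std::mutex> lock(init_mutex);
        // Test again under the lock; another thread may have finished first.
        if (!ready.load(std::memory_order_relaxed)) {
            slot = expensive_initial_value();
            ready.store(true, std::memory_order_release);
        }
    }
    return slot;
}

}  // namespace guarded_sketch
```

The flag and the mutex are themselves constant-initialized, so the guard does not reintroduce an ordering problem of its own.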

Regards,

Ganesh






Author: "Manfred von Willich" <manfred@techniroot.co.za>
Date: Fri, 12 May 2006 12:18:25 CST
Am I getting it wrong, or is it the case that
(a) this is a genuine defect (the programmer is expected to manage a
diffuse problem, i.e. one that a locally applied coding policy, other than
"strictly avoid dynamic initialization", cannot in general fix), and
(b) a number of straightforward solutions that involve the standard are
possible?
The combination of these factors gives an imperative to the issue.
These are brave assertions, but I'll happily oblige when I am called
upon to justify them.


John Nagle wrote:
>      Modula II got this right.  [...]  If [...] no such order can be
> found, the link fails with an error.
>      The "don't change the linker" tradition of C and C++ [...] prevents this.

Your comment stimulated me to think of a simple solution that involves
no change to linkers and a small change to the standard - mandating
verification of the order of dynamic initialization for safety.  This
is the minimum needed.

There must be a clearcut example solution available before what is
stated only as an objective can be mandated.  Consider the following
illustration of a solution:

The compiler uses two names (distinguished via decoration) for each
function and each object, one "safe" and one "unsafe".  The safe name
is not published if the function/object is not safe.  The two names may
relate to two distinct copies of a function (identical except for which
names are referenced).  Dynamically initialized objects are not safe.
Some rules determine which references may be used in any given context.
 The linker will fail ("unresolved symbol") exactly when a dynamic
initialization depends upon another but where their order of
initialization is undefined.

If you allow the addition of a few keywords (used to control whether a
function/object is "safe"), the same solution applies without dual
names, again with no linker modification, with no obj/lib expansion,
and with compile-time detection of the violation.  A similar solution
involves no keywords, but involves modifying the linker to manage this
distinction - every defined name carries a "safe"/"unsafe"
qualification, and every reference carries a "safe"/"any" requirement.
Violation detection is then at link time.

This is sparse in detail (e.g. "rules" omitted), but I believe that
this can at least demonstrate that a satisfactory simple solution is
possible.  Should we not at least move towards acceptance of this
minimum?






Author: "Manfred von Willich" <manfred@techniroot.co.za>
Date: Sat, 13 May 2006 11:09:47 CST
Ganesh, I'm getting bogged down by having to tiptoe around golden
rules.  Let's just forget about specific solutions in first deciding
whether the dynamic initialization is defined in an unfortunate form.
If it is, is it worth exploring enhancements to get around its
shortcomings?  If so, I feel that there will be worthwhile improvements
possible, even within fairly tight constraints.

If you agree with me that an improvement is worth looking for, even
at the cost of a little compromise (and I would appreciate you letting
me know how you feel on this one), then we can look at specifics.

And if I haven't offended you, how about this as an outline of a
non-specific approach:
   (a) Existing programs are unaffected - no keyword means
"unprotected" with regard to dynamic initialization, and remaining
exactly as per the existing standard.
   (b) New qualifying keywords may be introduced to tag functions and
variables to facilitate initialisation sequencing verification and/or
control.  Their use is optional, and guarantees some level of
sequencing correctness not provided for in the existing standard.

You can see that your input is valuable in shaping my approach - I just
hope to get somewhere before I get too discouraged.  This is where I may
be able to contribute value - I think I am good at finding solutions
within tight constraints.  I need to understand the constraints, and
part of the problem is that your perception of them is not necessarily
the consensus view.  Maybe you're more insistent than most on the
strictness of the "golden rules"?






Author: John Nagle <nagle@animats.com>
Date: Sat, 13 May 2006 22:58:52 CST
Manfred von Willich wrote:
> Ganesh, I'm getting bogged down by having to tiptoe around golden
> rules.

    That's normal.  You can't really fix anything unsafe in C++.
You'll always run into one of the following arguments:

-- It will break existing code, even if that code is already wrong.
-- You can't add new keywords, because that will break existing code.
-- You can't change the linker.
-- You can't cause additional run-time overhead.
-- There's some obscure case in which doing something unsafe is useful.

All of those apply here, and given those constraints, you can't fix the problem.
That's why Java, C#, and C++/CLI had to be invented.

    John Nagle
    Animats






Author: NULL@NULL.NULL ("Tomás")
Date: Sun, 14 May 2006 18:51:38 GMT
John Nagle posted:


> -- You can't add new keywords, because that will break existing code.


Let's say we add the keyword "restrict"... all you have to do is run a
little program which replaces all instances of "restrict" with "ReStRiCt"
or something like that.


-Tomás






Author: "Manfred von Willich" <manfred@techniroot.co.za>
Date: Sun, 14 May 2006 16:54:27 CST
John Nagle wrote:
>     That's normal.  You can't really fix anything unsafe in C++.

Am I missing something here?  The language definition has been changed
over the last decade or so.  To the better, I might add.

> -- It will break existing code, even if that code is already wrong.

For clarity, I am no longer trying to fix anything unsafe - I am merely
trying to add a feature that allows the programmer to write safe code
with less effort (especially where no overhead is desired).

> -- You can't add new keywords, because that will break existing code.

Not true, as long as the code without added keywords still means the
same thing.  I indicated in my previous post that I have accepted that
existing code must still compile and run in exactly the same way, and
that it must be easy to demonstrate that this will be the case.  And,
by way of counterexample, the keywords bool, mutable, namespace and the
like seem to have been added after the debut of C++.

> -- You can't change the linker.

Well, maybe.  I think this should be restated as: "It must still work
with existing linkers".  Changed linkers provide benefits even though
it must work if they are not changed.  Nevertheless, I accept this
constraint as stated.

> -- You can't cause additional run-time overhead.

... to existing constructs.  Fine.  But if a programmer explicitly
chooses to use a new feature that was not there before, this is not
adding overhead.

> -- There's some obscure case in which doing something unsafe is useful.

Yes - this is traditionally pretty core to C and C++.  Give the
programmer the freedom to override pretty much anything, if he
specifically chooses to do so.

> All of those apply here, and given those constraints, you can't fix the problem.

I differ on the conclusion, largely because I do not accept the keyword
constraint unmodified, and because I am no longer trying to fix the
unsafe aspect - only providing an alternative.

In my last post I had pretty much figured out and accepted all the
constraints (barring keyword addition) with Ganesh's patient tutelage.
So, if we can revise the constraint on keywords to "the addition of
keywords must in no way affect existing compiling programs", I think I
may have a way of adding a useful new degree of control on
initialization sequence for use at the discretion of the programmer.

For the moment, ignoring whether I actually have something that does
what I claim, do you at least accept my logic and the revised keyword
constraint?






Author: francis@robinton.demon.co.uk (Francis Glassborow)
Date: Mon, 15 May 2006 15:40:38 GMT
In article <1147633394.483172.14700@u72g2000cwu.googlegroups.com>,
Manfred von Willich <manfred@techniroot.co.za> writes
>In my last post I had pretty much figured out and accepted all the
>constraints (barring keyword addition) with Ganesh's patient tutelage.
>So, if we can revise the constraint on keywords to "the addition of
>keywords must in no way affect existing compiling programs", I think I
>may have a way of adding a useful new degree of control on
>initialization sequence for use at the discretion of the programmer.

The need for a new keyword is not a killer but because the cost is high
the benefits need to be as well.


--
Francis Glassborow      ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects






Author: AlbertoBarbati@libero.it (Alberto Ganesh Barbati)
Date: Tue, 16 May 2006 14:55:29 GMT
Manfred von Willich ha scritto:
> Ganesh, I'm getting bogged down by having to tiptoe around golden
> rules.  Let's just forget about specific solutions in first deciding
> whether the dynamic initialization is defined in an unfortunate form.
> If it is, is it worth exploring enhancements to get around its
> shortcomings?  If so, I feel that there will be worthwhile improvements
> possible, even within fairly tight constraints.
>
> If you agree with me that an improvement is worth looking for, even
> at the cost of a little compromise (and I would appreciate you letting
> me know how you feel on this one), then we can look at specifics.

Yes, I agree with you that improvements might be made in this area, but
if we are going to impose a compromise onto people who don't want, need
or find such an improvement useful, the compromise must be
acceptable. Unfortunately, in this specific case the gain that would be
obtained from such an improvement is, IMHO, very little, so the
compromise must be accordingly very little.

Anyway, remember that you must not convince only the programmers but
compiler vendors also! They are the ones that would need to implement
your proposed language change. In their respect also, the cost of
implementing the feature needs to be worth the gain.

> And if I haven't offended you, how about this as an outline of a
> non-specific approach:
>    (a) Existing programs are unaffected - no keyword means
> "unprotected" with regard to dynamic initialization, and remaining
> exactly as per the existing standard.
>    (b) New qualifying keywords may be introduced to tag functions and
> variables to facilitate initialisation sequencing verification and/or
> control.  Their use is optional, and guarantees some level of
> sequencing correctness not provided for in the existing standard.

As noted by other posters, introducing new keywords is expensive and can
break existing programs, because the C++ keyword model does not allow
context-dependent keywords. For example, if you add a keyword "safe"
then this valid C++ program will stop compiling, regardless of the use
of "safe" that you have in mind:

int main()
{
  int safe = 0;
  return safe;
}

So, unless you use reserved (ugly) names like __safe or _Safe, whatever
spelling you choose for your keyword, it might potentially break
existing code.
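
For comparison, C++11 later avoided this breakage for two new words, "override" and "final", by giving them a special meaning only in certain positions instead of reserving them. The sketch below (hypothetical type and function names) shows both roles side by side; whether a "safe" qualifier could be handled the same way is an open assumption, not something established in this thread.

```cpp
#include <cassert>

namespace contextual_sketch {   // hypothetical names, for illustration only

struct Base {
    virtual ~Base() {}
    virtual int f() const { return 1; }
};

// In these positions 'final' and 'override' carry their special meaning...
struct Derived final : Base {
    int f() const override { return 2; }
};

// ...but elsewhere they remain ordinary identifiers, so older code that
// used these names as variables still compiles unchanged.
int still_identifiers() {
    int override = 3;
    int final = 4;
    return override + final;
}

}  // namespace contextual_sketch
```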

> You can see that your input is valuable in shaping my approach - I just
> hope to get somewhere before I get too discouraged.  This is where I may
> be able to contribute value - I think I am good at finding solutions
> within tight constraints.  I need to understand the constraints, and
> part of the problem is that your perception of them is not necessarily
> the consensus view.  Maybe you're more insistent than most on the
> strictness of the "golden rules"?

Of course it may be that I am more insistent than others about "golden
rules", but I got that attitude from reading this newsgroup ;) I don't
want to discourage you, but... as you can see from the other (few)
posts, this proposal hasn't stirred very much interest and their
objections are much the same as mine.

Ganesh






Author: "Manfred von Willich" <manfred@techniroot.co.za>
Date: Tue, 16 May 2006 09:56:00 CST
Francis Glassborow wrote:
> The need for a new keyword is not a killer but because the cost is high
> the benefits need to be as well.

I'd appreciate more clarity on what you are referring to as the cost.
 I presume you are referring to the cost of modifying compilers (and
linkers).  There is no execution overhead in what I have in mind.

The perception of the value of the benefits will naturally be
subjective.  When you have lived your life with a bucket under the drip
from your ceiling, you don't notice it, and sealing the
roof may not seem worth the effort.  Saving development time by catching
problems via a static dependency rather than via debugging presumably
has value.






Author: "Manfred" <manfred@techniroot.co.za>
Date: 9 May 2006 20:30:02 GMT
The C++ standard (what I have seen of it) - and probably ditto C -
provides very little control over the order in which dynamic
initialization of objects of static storage duration ("static") occurs.
In particular, when two static objects in different translation units
are to be initialized at run-time, and the initialization of one
depends on the second one, there seems to be no way to control the
order of initialization, and the behaviour is thus undefined.  Many
programmers aren't even aware of the problem until it bites them, yet
it is difficult to program some classes so that they do not make use of
static variables in their initialization (in the constructor or in the
parameters to the constructor).  If I have it straight, this is a
pretty nasty defect in the language(s).
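
For contrast, a minimal sketch of the one ordering the language does pin down: within a single translation unit, namespace-scope objects with dynamic initialization are initialized in the order of their definitions. The names below are hypothetical; the cross-unit case described above cannot be shown in one self-contained file.

```cpp
#include <cassert>

namespace same_tu_sketch {   // hypothetical names, for illustration only

int calls = 0;               // constant-initialized before any dynamic init

int next_ticket() { return ++calls; }

int first  = next_ticket();  // defined first, so initialized first
int second = next_ticket();  // sees the effect of first's initializer

}  // namespace same_tu_sketch
```

Move the definition of second into another source file and the guarantee that it runs after first disappears.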

Of course, we can forgo the dynamic initialization and replace accesses
to each dynamically initialized variable with a parameterless function
that returns a reference to the variable after it has initialized it.
Then when we access the first variable, the second will be
initialized first before using its value - e.g.

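   // --- translation unit x.cpp:
   extern int y;
   int x = y;   // may run before y's initializer, in which case x sees 0
   // --- translation unit y.cpp:
   #include <ctime>
   int y = static_cast<int>(std::time(0));   // dynamic initialization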

Safer "guarded" replacement (has problems, only for illustration):

   // --- translation unit x.cpp:
   extern int & y();
   int & x () {
      // local static variable initialized on first call to x()
      static int xx = y();
      return xx;
   }
   // --- translation unit y.cpp:
   #include <ctime>
   int & y () {
      // local static variable initialized on first call to y()
      static int yy = static_cast<int>(std::time(0));
      return yy;
   }
  ... int z = x(); x() += 1; // example use of x();

This carries a small overhead to which some will object.  I have a
proposal to address both the initialization sequence and overhead, such
that the compiler can track whether a variable is dynamically
initialized, and the overhead is only incurred where the program would
have had undefined behaviour according to the existing standard.  It
involves some explicit use of keywords to qualify static variable
declarations where default behaviour is to be overridden.  This probably
has about as much complexity as the "const" qualifier.

I think you can see where I'm heading with this.  I'll be happy to post
a lot more detail on this topic (I have written up a few pages on it)
if anyone is interested.  Just want to keep it shortish for now.






Author: AlbertoBarbati@libero.it (Alberto Ganesh Barbati)
Date: Tue, 9 May 2006 22:11:45 GMT
Manfred ha scritto:
> <motivation snipped>
> I have a
> proposal to address both the initialization sequence and overhead, such
> that the compiler can track whether a variable is dynamically
> initialized, and the overhead is only incurred where the program would
> have had undefined behaviour according to the existing standard.  It
> involves some explicit use of keywords to qualify static variable
> declarations where default behaviour is to be overridden.  This probably
> has about as much complexity as the "const" qualifier.

About as much complexity as the "const" qualifier??? That's a LOT,
then. Think about const-correctness... it's a revolution that still
hasn't been fully digested by a lot of programmers. I was expecting
something much simpler...

> I think you can see where I'm heading with this.  I'll be happy to post
> a lot more detail on this topic (I have written up a few pages on it)
> if anyone is interested.  Just want to keep it shortish for now.

Please go ahead, I'm interested.

Ganesh






Author: "Manfred von Willich" <manfred@techniroot.co.za>
Date: 10 May 2006 16:50:08 GMT
Ganesh wrote:
> About as much complexity as the "const" qualifier??? That's a LOT,
> then. Think about const-correctness... it's a revolution that still
> hasn't been fully digested by a lot of programmers. I was expecting
> something much simpler...
Well, it gives some indication.  On the surface, const-qualification is
simple for the programmer (... except that certain counter-intuitive
examples keep popping up), and moderately simple for the compiler (...
except that the rules turned out to be more complex than most compiler
writers initially suspected).  My suggestion is also simple on the
surface, and I hope the subterranean complexity does not grow as much.


I am going to start back-to-front, by giving two different sets of
defaults (to provide a context for thinking about how it will seem to
the programmer).  Naturally, standardization will have to choose only
one default (i.e. what happens when no keywords are applied).

Default behaviour A:  All dynamic initialization is ordered safely, at
the cost of some overhead.  The programmer does not have to worry about
it until optimisation is needed.  Consequence: most programmers will
remain unaware of the possibility of getting rid of most of this
overhead, but no new warnings will appear (convenient).

Default behaviour B: All dynamic initialization is done without added
overhead.  When the compiler cannot determine whether use of an
object is safe, it generates a warning/error.  The programmer can then
go about inserting qualifying keywords to indicate what initialization
is safe, and when to add overhead to make it safe.  When a warning-free
state is reached, the program is safe.  If overhead had to be added,
the program could originally have generated undefined behaviour,
depending on the order of initialization of the translation units
determined by the linker.  Consequence: programmers will become savvy
to the issue very quickly, coached by the warnings.

The approach is as follows:

The compiler tracks the initialization safety of every static variable,
and of every function.  Variables can have one of (a) static
initialization ("safe"), (b) guarded dynamic initialization (also
"safe", but with additional linkage and usage details), or (c)
unguarded dynamic initialization ("unsafe", which may conditionally be
treated as "semi-safe" in the translation unit defining it).  Functions
can be "safe" or "unsafe" (and similarly "semi-safe").  The compiler
does not make any intelligent choices (except perhaps relating to
variables and functions without external linkage) - keywords dictate
the choice.

Guarded initialization:  Guarded variables are much like VC6.0's local
static variables - having an auxiliary flag that must be tested before
accessing the variable, even to generate a reference or pointer.  If
the flag is not set, the associated initialization function is called,
and the flag is set.  Cyclic initialization (recursive dependencies in
the initialization) is easily detected at run-time, but don't worry
about this for now.
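
A minimal hand-written sketch of the guarded access described above.  All
names here (counter_ready, init_counter, guarded_counter) are made up for
illustration and are not part of any proposed syntax.  It shows only the
flag test on every access; it is neither thread-safe nor cycle-aware.

```cpp
// Hypothetical names throughout; this only illustrates the flag test.
int  counter_init_calls = 0;   // counts how often the initializer runs
bool counter_ready = false;    // auxiliary guard flag (statically initialized)
int  counter_storage;          // storage for the guarded variable

// Stand-in for an arbitrary run-time initializer.
int init_counter() {
   ++counter_init_calls;
   return 42;
}

// Every access goes through this function: test the flag, initialize once.
int & guarded_counter() {
   if (!counter_ready) {
      counter_storage = init_counter();
      counter_ready = true;    // set only after initialization completes
   }
   return counter_storage;
}
```

Because the flag itself is statically initialized, guarded_counter() can be
called from the dynamic initializer of any other translation unit without
depending on link order.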

Safe function: A safe function is disallowed from accessing any unsafe
variable or function.  An unsafe function has no restrictions on what
it may do, but as a consequence cannot be used during dynamic
initialization (except for the semi-safe relaxation).

Semi-safe: If the compiler can assure the order of initialization
inside a translation unit, it can use this fact to safely dynamically
initialize one variable from an "unsafe" variable.  For simplicity, I
suggest that order of initialization of unguarded dynamic
initialization be kept as being in the order of definition, and that an
"unsafe" variable or function can be treated as "semi-safe" if the only
variables and functions that it accesses are safe and semi-safe
variables and functions defined ABOVE it.  Semi-safe variables and
functions may be used in unguarded dynamic initialization.
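
To illustrate the semi-safe rule: within one translation unit the existing
standard already performs ordered dynamic initialization of ordinary
namespace-scope variables in order of definition.  A small sketch, with
made-up names, of definitions that would qualify because each one only
refers to things defined above it:

```cpp
// Hypothetical names; all definitions live in ONE translation unit.
int tu_compute_base() { return 10; }   // stand-in for a run-time computation

int tu_base   = tu_compute_base();     // unguarded dynamic initialization
int tu_scaled = tu_base * 2;           // uses only tu_base, defined ABOVE it
int tu_total  = tu_base + tu_scaled;   // uses only variables defined above it
```

Swapping the second and third definitions would make tu_total read tu_scaled
before its dynamic initializer has run, which is exactly the case the
"defined ABOVE" restriction is meant to exclude.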

We need to designate keywords to indicate non-default qualification of
variables and functions, presumably placed where the "extern" and
"static" keywords may be used.

And that is essentially all of it - save for analysis and dealing with
issues such as multi-threading and cyclic initialization.

Multi-threading requires a mutex in the initialization routine (this is
easily proved - there has to be a mechanism to halt all but one thread
until initialization is complete).  This in turn gives rise to the
issue of deadlocking, which I haven't thought through fully, though I
think it is only possible when multi-threaded applications have cyclic
guarded initialization, which will generate pathological behaviour on
every run (detection, stack overflow or deadlock).  Note that the mutex
should only be used AFTER the flag is tested, which means that the
execution overhead is only a test-and-jump once the variable has been
fully initialized.
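
A sketch of "test the flag first, take the mutex only if it is unset".  It
assumes the C++11 facilities std::atomic and std::mutex, which postdate this
thread, and every name in it is hypothetical.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical names; assumes C++11 <atomic> and <mutex>.
std::atomic<bool> table_ready(false);  // guard flag, tested on every access
std::mutex        table_mutex;         // taken only while the flag is unset
int               table_value;         // storage for the guarded variable
int               table_init_calls = 0;

int build_table() {                    // stand-in for the real initializer
   ++table_init_calls;
   return 7;
}

int & guarded_table() {
   // Fast path: once initialized, the cost is one load and one branch.
   if (!table_ready.load(std::memory_order_acquire)) {
      std::lock_guard<std::mutex> lock(table_mutex);
      // Re-test under the lock: another thread may have finished first.
      if (!table_ready.load(std::memory_order_relaxed)) {
         table_value = build_table();
         table_ready.store(true, std::memory_order_release);
      }
   }
   return table_value;
}
```

With a plain (non-recursive) mutex like this one, a cyclic dependency that
re-enters guarded_table() from build_table() on the same thread would not
return, which matches the deadlock concern above.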

Also, we should be able to get around the need for a mutex if we ensure
that every guarded variable is initialized before main() is called, AND
we somehow prevent runnable threads being started before dynamic
initialization is complete - i.e., ensure initialization in a
single-threaded context.
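
One way to approximate this with the existing language is to force the first
access from a namespace-scope initializer in the same translation unit.  A
sketch with made-up names; note the standard lets an implementation defer
such initializers, so "before main()" is typical rather than guaranteed.

```cpp
// Hypothetical names; illustrates forcing first use during start-up.
int early_init_calls = 0;

int make_early_value() {               // stand-in for the real initializer
   ++early_init_calls;
   return 5;
}

int & early_value() {
   static int value = make_early_value();   // initialized on first call
   return value;
}

// Namespace-scope dynamic initializer: triggers the first call during
// start-up, normally before main() and before any threads exist.
static const bool early_value_forced = (early_value(), true);
```

If another translation unit's initializer calls early_value() first, the
variable is simply initialized at that point and the forcing call becomes a
cheap no-op.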

A side note: How many people are aware that local static variables are
typically implemented as what I call variables with guarded dynamic
initialization?  Except that I think that a mutex is typically not
used - certainly not by VC6.0.  In this compiler, cyclic dependencies
lead to undefined behaviour, but no stack overflow (i.e. it is not
trivially diagnosed), and local statics are NOT thread safe.  Unfortunate
timing of thread switching can lead to double initialization and to use
of the pre-initialized state.  GCC fans: you might want to check whether
GCC uses a mutex.  If not, local static variables are unsafe for use in
either dynamic initialization or multi-threaded contexts (which depends
upon the "when" of initialization selected by the compiler).

And finally, to give some insight, here is a macro providing guarded
dynamic initialization.  It returns a null reference when
initialization is cyclic (may be replaced with a thrown exception).

// Macro use:
//    Macro invocation may be preceded by 'static' or 'extern' qualifier.
//    May be used in both C and in C++.
//    Not to be used as function-local (not thread-safe).
//    Requirement to access dependencies in constructor left to class
//    designer.
//
//    Cyclic initialisation dependencies generate a bad reference.
//    Initialisation is on first use or before main (whichever is
//    earlier).
//    Macro parameters:
//     - "type" : declared type of the variable
//     - "name" : the access name for the variable - suffix "()" when
//                using
//     - "pre"  : ";" or code to execute before evaluation of "init"
//                (may also declare variables)
//     - "init" : ";", "=..." or non-empty constructor parameter list
//                for initialisation of "name";
//                may be followed by ";" and post-initialisation code,
//                which may reference "name" (no parentheses)
//     - avoid partial statements in "pre" and "init".
//     - avoid reference to _pX in "pre" and "init".
//     - take care with self-reference in the initialiser (not checked;
//       undereferenced self-pointers can make sense).

#define INIT(type,name,pre,init)       \
   type & name ()                      \
   {                                   \
      static type * _pX = (type *)-2;  \
      if ( _pX == (type *)-2 )         \
      {                                \
         _pX = (type *)-1;             \
         pre                           \
         static type name init;        \
         if ( _pX == (type *)-1 )      \
            _pX = &name;               \
      }                                \
      else if ( _pX == (type *)-1 )    \
         _pX = (type *)0;              \
      return *_pX;                     \
   }                                   \
   static bool _init_##name = (&name() != 0);

Example:

static int & y();   // declare y() before x's initialiser calls it
INIT(int,x,;,=y();)
static INIT(int,y,;,=3)
int main() {
   int z = x();
   x() += 1;
}
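
The macro reports a cycle by returning a bad reference.  As a sketch of the
alternative of throwing an exception, here is a hand-expanded pair of
accessors with a three-state guard.  The names (InitState, cyc_a, cyc_b) are
invented for this illustration, and the two accessors deliberately depend on
each other so that the cycle is detected.

```cpp
#include <stdexcept>

// Hypothetical three-state guard; names are illustrative only.
enum InitState { NotStarted, InProgress, Done };

int & cyc_b();   // forward declaration: cyc_a's initializer calls cyc_b

int & cyc_a() {
   static InitState state = NotStarted;   // statically initialized
   static int value;                      // zero-initialized storage
   if (state == InProgress)
      throw std::logic_error("cyclic initialization of cyc_a");
   if (state == NotStarted) {
      state = InProgress;
      value = cyc_b() + 1;                // depends on cyc_b
      state = Done;
   }
   return value;
}

int & cyc_b() {
   static InitState state = NotStarted;
   static int value;
   if (state == InProgress)
      throw std::logic_error("cyclic initialization of cyc_b");
   if (state == NotStarted) {
      state = InProgress;
      value = cyc_a() + 1;                // closes the cycle
      state = Done;
   }
   return value;
}
```

In this sketch the guards are left in the InProgress state after the throw,
so any later access reports the cycle again rather than returning a
half-initialized value.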






Author: nagle@animats.com (John Nagle)
Date: Wed, 10 May 2006 18:38:13 GMT
Manfred wrote:
> The C++ standard (what I have seen of it) - and probably ditto C -
> provides very little control over the order in which dynamic
> initialization of objects of static storage duration ("static") occurs.
>  In particular, when two static objects in different translation units
> are to be initialized at run-time, and the initialization of one
> depends on the second one, there seems to be no way to control the
> order of initialization, and the behaviour is thus undefined.  Many
> programmers aren't even aware of the problem until it bites them, yet
> it is difficult to program some classes so that they do not make use of
> static variables in their initialization (in the constructor or in the
> parameters to the constructor).  If I have it straight, this is a
> pretty nasty defect in the language(s).

     Modula-2 got this right.  It's the responsibility of the Modula-2
linker to work out a safe order of static initialization by following
the call graph.  If the situation is such that no such order can be
found, the link fails with an error.
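
     A rough sketch of that ordering step, written as a plain topological
sort in C++11.  This is not the Modula-2 linker's actual algorithm; the
DepGraph type and safe_init_order function are invented for illustration,
and the dependency lists are assumed to be supplied by the caller.

```cpp
#include <map>
#include <queue>
#include <string>
#include <vector>

// Hypothetical: maps each module to the modules whose statics it uses
// while initializing its own.
using DepGraph = std::map<std::string, std::vector<std::string>>;

// Fills 'order' with dependencies first and returns true if a safe order
// exists; returns false when the dependencies contain a cycle.
bool safe_init_order(const DepGraph & deps, std::vector<std::string> & order) {
   std::map<std::string, int> pending;   // unmet dependencies per module
   DepGraph users;                       // reverse edges: who depends on me
   for (const auto & entry : deps) {
      pending[entry.first] += 0;         // make sure every module has an entry
      for (const auto & dep : entry.second) {
         pending[dep] += 0;              // a dependency may have no entry of its own
         ++pending[entry.first];
         users[dep].push_back(entry.first);
      }
   }
   std::queue<std::string> ready;        // modules with nothing left to wait for
   for (const auto & p : pending)
      if (p.second == 0) ready.push(p.first);
   order.clear();
   while (!ready.empty()) {
      const std::string m = ready.front();
      ready.pop();
      order.push_back(m);
      for (const auto & user : users[m])
         if (--pending[user] == 0) ready.push(user);
   }
   return order.size() == pending.size();   // leftover modules mean a cycle
}
```

     For the x/y example, a graph in which "x" depends on "y" yields the
order y, x; adding the reverse dependency makes the function return false,
the analogue of the link failing.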

     The "don't change the linker" tradition of C and C++ (which exists
because the linker for UNIX on the PDP-11 was written in assembler with a
total of three comments) prevents this.

    John Nagle
    Animats
