Topic: initialization static objects in libraries
Author: clamage@eng.sun.com (Steve Clamage)
Date: 1998/01/12
"J Scott Peter XXXIII i/iii" <scotty@cinenet.dot.net> writes:
>Michael R Cook wrote in message <68e4e6$orr$1@cognex.cognex.com>...
>>A slightly modified solution is this:
>>
>> SomeClass& SomeClass::obj()
>> {
>> static SomeClass theObj(args);
>> return theObj;
>> }
>> static SomeClass* init = &SomeClass::obj();
>>
>>The `init' pointer exists only to force theObj to be constructed
>>during static initialization.
>This is not a bad solution, but in most implementations, results in overhead
>every time you call SomeClass::obj(), because a static local variable
>results in the function checking a flag every time it is called.
>The following scheme results in automatic initialisation in the correct
>order, and no overhead in accessing the variable thereafter:
> // SomeClass.hpp
> extern SomeClass* SomePtr; // Single object defined in SomeClass.cpp
> class SomeClassInit {
> public:
> SomeClassInit()
> {
> if (!SomePtr)
> SomePtr = new SomeClass;
> }
> ~SomeClassInit()
> {
> delete SomePtr;
> SomePtr = 0;
> }
> };
> static SomeClassInit initSomeClass;
>Notice that SomeClassInit has no data members, and a static instance of it
>is declared in the .hpp file, resulting in a static instance defined in
>every module the .hpp is included in. Since order of initialisation is
>guaranteed to be declaration order *within any module*, this results in
>every module which includes SomeClass.hpp having the initialisation of
>SomePtr guaranteed before use.
It isn't necessarily a better solution. The main problem occurs for
large programs, which is when you most run into the order of init
problem.
If you have 1000 modules, you get 1000 attempted initializations, each
cheap in itself. But each of these attempted inits is in a different
part of the program -- a different code page gets pulled in for each
one before the main program starts to run, destroying program locality.
(And the same at program exit, of course.)
Real programs on real systems using this technique have seen program
startup (and shutdown) take a minute or more just to do the many wasted
code page accesses. (We're assuming the program is large compared
to the working set size.)
Implementations could ease this problem by putting all the static
initializers in the same address section, but that isn't always an option.
It's also worth mentioning a problem common to both solutions: you
don't know when the object will be destroyed; consequently, it
might be destroyed too soon. Within one module, a static destructor
might require SomePtr, but the SomePtr object might have been
destroyed from some other module whose static destructors ran first.
The ARM (page 19, I think) presents the "nifty counter" solution, of
which the second example above is a simplification. The ARM
solution counts the number of attempted initializations of SomePtr,
and decrements the counter on each attempted destruction. Only when
the counter goes to zero is the object destroyed.
If you were going to use the second solution above, you would be
better advised to use the "nifty counter" solution. There is no
point in incurring its overhead without also getting its benefits.
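For readers who have not seen it, the counted variant might look roughly like the sketch below. This is only an illustration under stated assumptions, not the ARM's text: the SomeClass body, its "alive" counter, and the name "count" are placeholders invented here, and everything is shown in one file, whereas in a real program SomeClassInit would sit in SomeClass.hpp (followed by the static initSomeClass object) and the definitions of SomePtr and the counter would sit in SomeClass.cpp.

```cpp
#include <cassert>

struct SomeClass {
    static int alive;             // observation aid for this sketch only
    SomeClass()  { ++alive; }
    ~SomeClass() { --alive; }
};
int SomeClass::alive = 0;

SomeClass* SomePtr = 0;           // would be defined in SomeClass.cpp

class SomeClassInit {             // would be declared in SomeClass.hpp
public:
    SomeClassInit()
    {
        if (count++ == 0)         // first initializer creates the object
            SomePtr = new SomeClass;
    }
    ~SomeClassInit()
    {
        assert(count > 0);
        if (--count == 0) {       // last destructor destroys it
            delete SomePtr;
            SomePtr = 0;
        }
    }
private:
    static int count;             // hypothetical name; zero before any constructor runs
};
int SomeClassInit::count = 0;     // would be defined in SomeClass.cpp
```

Because the counter is zero-initialized before any constructor runs, it does not matter which module's SomeClassInit object is constructed first or destroyed last.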
There is no perfect solution in C++ to the order of initialization
problem. The nifty counter solution eliminates access overhead
at the expense of increased startup and shutdown time. Variations
of the first method above add one indirection to each access, but
add no startup or shutdown overhead; they also assume that it doesn't
matter when at program end the object is destroyed. For a given
program, one or the other might be a suitable choice.
The C++ Committee spent quite a lot of time searching for a way
to solve this problem by language additions, but failed. Every
suggested solution was either far too difficult to use, or failed
to solve the common problems for which it was most needed.
Of course, if you don't use global static objects, you don't run
into the problem. Avoiding global static objects is not bad
advice. That is, you should prove that no better solution exists
before resorting to them.
--
Steve Clamage, stephen.clamage@sun.com
---
[ comp.std.c++ is moderated. To submit articles: try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ FAQ: http://reality.sgi.com/employees/austern_mti/std-c++/faq.html ]
[ Policy: http://reality.sgi.com/employees/austern_mti/std-c++/policy.html ]
[ Comments? mailto:std-c++-request@ncar.ucar.edu ]
Author: "J Scott Peter XXXIII i/iii" <scotty@cinenet.dot.net>
Date: 1998/01/11
Michael R Cook wrote in message <68e4e6$orr$1@cognex.cognex.com>...
>A slightly modified solution is this:
>
> SomeClass& SomeClass::obj()
> {
> static SomeClass theObj(args);
> return theObj;
> }
> static SomeClass* init = &SomeClass::obj();
>
>The `init' pointer exists only to force theObj to be constructed
>during static initialization.
This is not a bad solution, but in most implementations it results in overhead
every time you call SomeClass::obj(), because a static local variable
results in the function checking a flag every time it is called.
The following scheme results in automatic initialisation in the correct
order, and no overhead in accessing the variable thereafter:
    // SomeClass.hpp
    extern SomeClass* SomePtr;  // Single object defined in SomeClass.cpp
    class SomeClassInit {
    public:
        SomeClassInit()
        {
            if (!SomePtr)
                SomePtr = new SomeClass;
        }
        ~SomeClassInit()
        {
            delete SomePtr;
            SomePtr = 0;
        }
    };
    static SomeClassInit initSomeClass;
Notice that SomeClassInit has no data members, and a static instance of it
is declared in the .hpp file, resulting in a static instance defined in
every module the .hpp is included in. Since order of initialisation is
guaranteed to be declaration order *within any module*, this results in
every module which includes SomeClass.hpp having the initialisation of
SomePtr guaranteed before use.
Author: Michael R Cook <michael_cook%erawan@cognex.com>
Date: 1997/12/31
>>>>> "SC" == Steve Clamage <stephen.clamage_nospam@eng.Sun.COM> writes:
SC> The local static "theObj" is initialized the first time function "obj"
SC> is called, no matter when that occurs. It is automatically destroyed
SC> at program end, just as if it were a global static as in the original
SC> code.
The problem with this solution is that theObj might get constructed
during static destruction, the behavior of which (IIRC) is
undefined.
A slightly modified solution is this:
    SomeClass& SomeClass::obj()
    {
        static SomeClass theObj(args);
        return theObj;
    }
    static SomeClass* init = &SomeClass::obj();
The `init' pointer exists only to force theObj to be constructed
during static initialization.
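A compilable sketch of this idea, for anyone who wants to try it. The constructor argument 42 stands in for `args', and the "constructed" counter is an observation aid added here, not part of the scheme. Note the assumption that this translation unit is actually linked into the program; if it sits unreferenced in a library, `init' may never be initialized at all.

```cpp
#include <cassert>

struct SomeClass {
    static int constructed;       // observation aid, not part of the scheme
    explicit SomeClass(int)
    {
        assert(constructed == 0); // the sketch expects exactly one instance
        ++constructed;
    }
    static SomeClass& obj();
};
int SomeClass::constructed = 0;

SomeClass& SomeClass::obj()
{
    static SomeClass theObj(42);  // 42 stands in for `args'
    return theObj;
}

// Initializing `init' calls obj(), which constructs theObj during
// static initialization rather than at some later first use.
static SomeClass* init = &SomeClass::obj();
```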
(PITA means "pain in the arse", BTW.)
Michael.
---
Author: Dean Roddey <droddey@charmedquark.com>
Date: 1997/12/26
Eric Tse wrote:
> Hi,
>
> I have a question on should all static objects in the libraries which an
> application links be initialization. I have observed that VC++5.0 only
> initialize those static objects which is referred by the application
> code. This probably is not good enough for me.
>
> Thanks,
> Eric
You really should not depend upon global initialization anyway. In any non-trivial
system, you are likely to get into trouble because you cannot foresee the
connections between the global objects, and because the language provides no
mechanism to control the initialization order of global objects across multiple
files.
In the end, the best (but kind of PITA) and safest way to deal with globals is to
lazily evaluate them upon demand. So, do something like this:
    class TFoo
    {
    public :
        TFoo();
        ~TFoo();
        static SomeObj& GetSomeObj();
    private :
        static SomeObj* pSomeObj;   // static, so the static getter can use it
    };

    SomeObj* TFoo::pSomeObj = 0;

    SomeObj& TFoo::GetSomeObj()
    {
        if (!pSomeObj)
            pSomeObj = new SomeObj(/* blah blah blah */);
        return *pSomeObj;
    }
So you provide a getter method that will allocate the object when it's actually
needed. All access to the object goes through the getter, ensuring that it's
never created if it's never needed and that any access to it causes it to be
created. You could even make it safer by doing something like this:
    class TFoo
    {
    public :
        TFoo();
        ~TFoo();
        static SomeObj& GetSomeObj();
    };

    SomeObj& TFoo::GetSomeObj()
    {
        static SomeObj* pSomeObj;   // zero-initialized before first use
        if (!pSomeObj)
            pSomeObj = new SomeObj(/* blah blah blah */);
        return *pSomeObj;
    }
So this one moves the pointer into the getter method itself, ensuring absolutely
that it cannot be accessed except through that getter. From the outside
world's point of view, it's still part of the TFoo class since it cannot be
accessed any other way. When you are doing globals that are just for use within a
single module, this latter scheme is definitely the easiest way to go. Just have a
local, static method which provides access to a global inside itself.
One issue that you have to deal with is multi-threaded access. Since the objects
are not always created during DLL initialization (which is single threaded), that
means that they might not get accessed until later when multiple threads could be
running. The easiest way to deal with this is to have a single, primal critical
section (or mutex or whatever you have available) that is used for such one time
initialization.
In my class libraries, there is a TBaseLock class that is just for that purpose.
So, in my system, it would look like this:
    SomeObj& TFoo::GetSomeObj()
    {
        static SomeObj* pSomeObj;
        if (!pSomeObj)
        {
            TBaseLock lockInit;     // holds the primal lock until end of scope
            if (!pSomeObj)
                pSomeObj = new SomeObj(/* blah blah blah */);
        }
        return *pSomeObj;
    }
So in this version, the pointer is checked and, if it's zero, then a base lock
object is created. That lock is basically a simple, on the stack object that locks
a single, primal lock. When it goes out of scope, it unlocks that primal lock.
This ensures that the initialization is serialized.
Note that there is a second check of the pointer, because another thread could
have gotten in after the initial check but before we got the lock.
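For comparison, here is how the same check, lock, check again pattern might be written with the standard <atomic> and <mutex> headers, which did not exist when this was posted. It is a sketch only: the SomeObj body, the free function, and the names pSomeObj and initLock are placeholders, and std::mutex merely plays the role of the primal lock.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

struct SomeObj {                    // placeholder class for the sketch
    int value;
    explicit SomeObj(int v) : value(v) {}
};

namespace {
std::atomic<SomeObj*> pSomeObj(nullptr);
std::mutex initLock;                // plays the role of the primal lock
}

SomeObj& GetSomeObj()
{
    SomeObj* p = pSomeObj.load(std::memory_order_acquire);     // first check
    if (!p) {
        std::lock_guard<std::mutex> guard(initLock);
        p = pSomeObj.load(std::memory_order_relaxed);          // second check, under the lock
        if (!p) {
            p = new SomeObj(7);
            pSomeObj.store(p, std::memory_order_release);      // publish the finished object
        }
    }
    assert(p != nullptr);
    return *p;
}
```

The atomic pointer is what makes the unlocked first check well defined; with a plain pointer, as in the version above, a second thread could in principle see the pointer before it sees the constructed object.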
The primal lock of course is kind of a chicken and egg situation. So it is also
lazily created, but it's done using an atomic compare and swap to ensure that its
creation is synchronized without using any other higher level mechanism. So that's
how the system is bootstrapped up and, once that primal lock is created, then any
other one time global initialization stuff is easy to serialize.
Here is the code for the base lock class from my system:
    TBaseLock::TBaseLock()
    {
        if (!__pcrsLock)
        {
            TCriticalSection* pcrsCandidate = new TCriticalSection;
            if (TRawMem::pCompareAndExchangePtr
            (
                __pcrsLock
                , pcrsCandidate
                , 0))
            {
                // The swap failed: someone else installed a lock first
                delete pcrsCandidate;
            }
        }
        __pcrsLock->Enter();
    }

    TBaseLock::~TBaseLock()
    {
        __pcrsLock->Exit();
    }
Notice how the compare and swap requires that you create an object first, then try
to swap it in. If the swap fails (because someone beat us to the punch), then we
have to delete the candidate we created. This only happens during contention.
Usually, the thread that first checks the pointer wins or the pointer is already
non-zero. So it is probably only going to happen once (or worst case a couple times
on a multi-CPU machine) during a run of the program. I'm using my own compare and
swap (and a templatized one in this case to make it much more clean looking), but
most runtimes or OS APIs will provide you access to some kind of compare and swap
API (or you can do a simple ASM module to provide it yourself if necessary.) Since
it's all hidden away and encapsulated, that's easy enough to deal with.
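The same create-then-swap step can be sketched with std::atomic, again a facility that postdates this post. This is not the CIDLib code: GetPrimalLock and primalLock are names invented for the illustration, and std::mutex stands in for the critical section class.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

namespace {
std::atomic<std::mutex*> primalLock(nullptr);   // stands in for the primal lock pointer
}

// Returns the single primal lock, creating it on first use without
// relying on any other lock.
std::mutex& GetPrimalLock()
{
    std::mutex* current = primalLock.load(std::memory_order_acquire);
    if (!current) {
        std::mutex* candidate = new std::mutex;
        // Install the candidate only if the pointer is still null.
        if (primalLock.compare_exchange_strong(current, candidate,
                                               std::memory_order_acq_rel,
                                               std::memory_order_acquire)) {
            current = candidate;    // we won the race
        } else {
            delete candidate;       // someone beat us; current now holds their lock
        }
    }
    assert(current != nullptr);
    return *current;
}
```

On failure compare_exchange_strong writes the value it found back into its first argument, which is why the losing thread can simply use `current' after deleting its candidate.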
There are some gotchas here. The main one is that, unlike with global objects,
these objects will not automatically destruct on program termination. That's good
and bad. It's good because the same problems occur with global destruction as with
global initialization, mainly that there is no way to control their order so they
can cause weird side effects on the way down. It's bad in some rare cases because
the program might depend upon some object's destruction to cause some clean up.
But, there are other ways to handle such things without depending upon a
questionable mechanism like global destruction.
The other gotcha is that lazy initialization causes spurious reports of memory
leaks by some tools. They don't know that the newly created objects were purposely
created and stored away in global pointers to be managed totally correctly. That
can kind of be a problem, but having a provably correct init/term scenario is
more important in my opinion. You can always, during testing, set up your tests so
that you run them once without leak testing, then run them a second time with it
on. Any leaks during the second run should be legitimate.
And, in this situation as with the use of global objects, two mutually referential
objects can cause a problem. In the global objects situation, the results can be
non-obvious since one object will construct before another that it depends upon
has been constructed. That can make for very difficult to deal with problems where
uninitialized objects are accessed. In my scheme here, you could deadlock because
object A's ctor is trying to access object B (thus forcing its construction) and
then object B (during its construction) tries to access object A (which is already
constructing.) They will both try to lock the base lock and deadlock.
Probably the best way to deal with this is to put a counter in the base lock class
that catches attempts to reenter. Such a reentry indicates a bad mutual reference
that cannot be safe. It can log an error and terminate. It could probably be a
'debug only' type of conditional check but it could probably be left in for
production code just in case since it would not require a lot of overhead (since
the primal lock already provides the mechanism to synchronize access to the
counter.) That's on my list of things to do for the new release. It provides a
much more structured way to catch bad mutual references than just a weird use of
uninitialized objects. In my system, since such a termination produces a stack
dump I will know exactly what two classes were in error because their
constructors will be in the stack dump along with the error saying that a mutual
reference during construction occurred.
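A rough sketch of such a reentry counter follows. It makes two assumptions that are not from the description above: the primal lock is recursive (so a second lock attempt from the same thread gets far enough to look at the counter), and the error is reported by throwing std::logic_error instead of logging and terminating. BaseLock, primal and depth are hypothetical names.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

class BaseLock {                    // hypothetical stand-in for the base lock class
public:
    BaseLock()
    {
        primal().lock();
        if (depth() != 0) {
            // We hold the lock, so a non-zero depth can only mean this same
            // thread is already inside a one-time initialization.
            primal().unlock();
            throw std::logic_error("mutual reference during construction");
        }
        ++depth();
    }
    ~BaseLock()
    {
        assert(depth() == 1);
        --depth();
        primal().unlock();
    }
private:
    static std::recursive_mutex& primal()
    {
        static std::recursive_mutex m;
        return m;
    }
    static int& depth()             // only touched while the primal lock is held
    {
        static int d = 0;
        return d;
    }
};
```

With a non-recursive lock the second attempt would simply hang, which is the deadlock described earlier; the counter turns that hang into a diagnosable error.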
I know that was a pretty long ramble but hopefully it helped. Look at my class
libraries if you want to see how it all works.
--------------------------
Dean Roddey
The CIDLib Class Libraries
Charmed Quark Software
droddey@charmedquark.com
http://www.charmedquark.com
"Software engineers are, in many ways, similar to normal people"
---
Author: "Brock" <peabody@npcinternational.com>
Date: 1997/12/29
I saw something like this in More Effective C++:
    X& get_global_x() {
        static X global_x;
        return global_x;
    }
This seems the easiest way to grant global access to a single object that is
not initialized until its first use. Notice that this method uses no
dynamic memory. Mr. Meyers also warns not to make this function inline or
you may end up with multiple global_x's.
---
Author: Dean Roddey <droddey@charmedquark.com>
Date: 1997/12/30
Brock wrote:
> I saw something like this in More Effective C++:
>
> X& get_global_x() {
>
> static X global_x;
>
> return global_x;
> }
>
> This seems the easiest way to grant global access to a single object that is
> not initialized until its first use. Notice that this method uses no
> dynamic memory. Mr. Meyers also warns not to make this function inline or
> you may end up with multiple global_x's.
> ---
This is, unfortunately, not thread safe. So you will still have to provide some
sort of synchronization on it. Once you throw that into the mix, then having the
pointer and dynamically allocating it tends to be the least of various evils.
--------------------------
Dean Roddey
The CIDLib Class Libraries
Charmed Quark Software
droddey@charmedquark.com
http://www.charmedquark.com
"Software engineers are, in many ways, similar to normal people"
---
Author: stephen.clamage_nospam@eng.Sun.COM (Steve Clamage)
Date: 1997/12/30
On 26 Dec 97 03:10:59 GMT, Dean Roddey <droddey@charmedquark.com>
wrote:
>Eric Tse wrote:
>>
>> I have a question on should all static objects in the libraries which an
>> application links be initialization. I have observed that VC++5.0 only
>> initialize those static objects which is referred by the application
>> code. This probably is not good enough for me.
>
>You really should not depend upon global initialization anyway. In any non-trivial
>system, you are likely to get into trouble because of the fact that you cannot
>forsee the connections between the global objects, and because the language
>provides no mechanism to control the order of global objects across multiple
>files.
>
>In the end, the best (but kind of PITA) and safest way to deal with globals is to
>lazily evaluate them upon demand.
That is good advice (although I don't know what "PITA" means).
> So, do something like this:
>
>class TFoo {
> public :
> TFoo();
> ~TFoo();
> static SomeObj& GetSomeObj();
>};
>
>SomeObj& TFoo::GetSomeObj()
>{
> static SomeObj* pSomeObj;
>
> if (!pSomeObj)
> pSomeObj = new SomeClass(blah blah blah);
>
> return *pSomeObj;
>}
Two problems with this solution: The object is not automatically
destroyed on program exit, and you have the overhead of a heap
allocation. The original code did not have either problem. If you make
the object a local static object instead of a heap object, you regain
all the benefits of a static object, but avoid the uncertainty about
when it gets constructed. That is, if the original code looked like
this
    SomeClass obj(args);
you can replace it with code like this
    SomeClass& obj()
    {
        static SomeClass theObj(args);
        return theObj;
    }
The local static "theObj" is initialized the first time function "obj"
is called, no matter when that occurs. It is automatically destroyed
at program end, just as if it were a global static as in the original
code. Like Dean's example, it still has the overhead of one function
call per object access. (You could possibly make function "obj" inline
and avoid that overhead, but that in turn depends on whether your
compiler follows the new rules about static objects in inline
functions. At the moment, I wouldn't count on it.)
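To make the point concrete, here is a small sketch in which another global object depends on the one behind obj(). The SomeClass body, the constructor argument 1, and the Client type are placeholders invented for the illustration.

```cpp
#include <cassert>

struct SomeClass {                  // placeholder body for the sketch
    bool ready;
    explicit SomeClass(int) : ready(true) {}
};

SomeClass& obj()
{
    static SomeClass theObj(1);     // constructed on the first call, whenever that is
    return theObj;
}

// Hypothetical client: a global whose constructor needs the object.
struct Client {
    bool sawReadyObject;
    Client() : sawReadyObject(obj().ready)
    {
        assert(sawReadyObject);     // theObj was constructed by the call above
    }
};
Client client;   // safe no matter which translation unit defines obj()
```

Had SomeClass been an ordinary global in another file, Client's constructor could have run first and read an unconstructed object.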
The original question asked what guarantees there are about
initialization of static objects in libraries. If a static object is
part of the program, it gets initialized according to the rules for
initializing static objects. But is an arbitrary object that happens
to be in a library really part of the program? That detail is
unspecified, and you can't make any portable assumptions.
For that reason it is better not to put global static objects into
libraries. If you use a method similar to what I outlined, you are
assured the object will be part of the program if it is referenced,
and initialized the first time it is referenced.
If you want an unreferenced object in a library to be part of the
program anyway, you'll need to use some implementation-specific method
to force it into the program, or modify the library design to force it
to be referenced.
---
Steve Clamage, stephen.clamage_nospam@eng.sun.com
( Note: remove "_nospam" when replying )
---
Author: Eric Tse <etse@scdt.intel.com>
Date: 1997/12/24
Hi,
I have a question on whether all static objects in the libraries which an
application links should be initialized. I have observed that VC++5.0 only
initializes those static objects which are referred to by the application
code. This probably is not good enough for me.
Thanks,
Eric
---