Topic: Controversy and debate: Uninitialized variables by


Author: Matthew Fioravante <fmatthew5876@gmail.com>
Date: Tue, 17 Jun 2014 08:49:57 -0700 (PDT)

This is going to be controversial, but let's get into it.

The fact that `int x;` creates an uninitialized integer instead of a
zero-initialized one is completely wrong. I consider it a major defect
in the language, a bug in the standard.

One of my favorite C++11 APIs is the atomic API. The reason is that it
does the right thing when it comes to correctness vs speed. By default, the
path of least resistance is to have everything sequentially consistent
which is the safest but also slowest memory ordering. If you know what
you're doing and need some speed, you can opt to use less safe orderings.
Even better, every use of the less safe ordering is tagged right there in
the source code (without a comment)! It really doesn't get much better than
that in terms of API design. It follows Scott Meyers's maxim, "Make
interfaces easy to use correctly and hard to use incorrectly", perfectly.
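
As a minimal sketch of that design (the counter and function names here are
hypothetical, chosen only for illustration), the default call is
sequentially consistent and the weaker ordering has to be spelled out at the
call site:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical counter used only for illustration.
std::atomic<int> hits{0};

// Default: the operation is sequentially consistent.
void record_hit_safe() {
    hits.fetch_add(1);  // same as fetch_add(1, std::memory_order_seq_cst)
}

// Opt-in: the weaker ordering is visible right at the call site.
void record_hit_fast() {
    hits.fetch_add(1, std::memory_order_relaxed);
}

int read_hits() {
    return hits.load();  // seq_cst by default
}
```

Searching the code for `memory_order_` finds every place that opted out of
the safe default.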

Initialization in C++ is the exact opposite of this. The easiest thing to
do is write bugs by forgetting to initialize things. How many of you have
spent long debugging sessions only to track the source to an uninitialized
variable? Not only that, but initialization in C++ is a horrible mess. Let's
look at the list:

Default initialization
Value initialization
Copy initialization
Direct initialization
Aggregate initialization
List initialization
Reference initialization
Constant initialization

Do you know by memory how all of those work and their gotchas? I sure as
hell don't and I pity the novice developer who tries. Do we need this level
of complexity?
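
To make the list concrete, here is a rough sketch with one declaration per
form. The `Point` type and the function name are made up for this example,
and the comments give only the most common reading of each form, not every
corner case:

```cpp
#include <cassert>
#include <string>

struct Point { int x; int y; };  // hypothetical aggregate for illustration

void initialization_forms() {
    int a;                       // default initialization: indeterminate for a local int
    int b{};                     // value initialization: b is 0
    int c = 5;                   // copy initialization
    int d(5);                    // direct initialization
    Point p = {1, 2};            // aggregate initialization (via a braced list)
    std::string s{"hi"};         // list initialization
    int& r = c;                  // reference initialization
    static constexpr int k = 42; // constant initialization (static storage, constant initializer)

    a = 0;  // must be assigned before it may be read
    assert(a == 0 && b == 0);
    assert(c == 5 && d == 5 && r == 5);
    assert(p.x == 1 && p.y == 2);
    assert(s == "hi");
    assert(k == 42);
}
```

Only `a` starts out with an indeterminate value, and it is also the shortest
declaration to type.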

Here is a sketch of one possible way we could solve this problem:

    int x;          // initialized to 0 by default
    int x = void;   // uninitialized, I know what I'm doing
    volatile int x; // uninitialized, because we can't introduce additional
                    // writes to volatile variables or break existing device
                    // driver code

Like the atomic API, the default action is the safe action, with options to
remove the restraints should you need to.
What I'm suggesting is that everything be initialized by default. And yes,
that includes all legacy code that flips the switch to use the next version
of the C++ language. If someone has a compelling performance argument for
leaving something uninitialized, the = void syntax is there for them.
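
Since `= void` is not valid C++ today, the idea can only be approximated in
a library. The following is a rough emulation under that assumption: the
`safe` wrapper and the `uninitialized` tag are hypothetical names invented
for this sketch, not an existing or proposed API.

```cpp
#include <cassert>

// Hypothetical tag standing in for the proposed `= void`.
struct uninitialized_t {};
constexpr uninitialized_t uninitialized{};

// Hypothetical wrapper: zero by default, indeterminate only on request.
template <class T>
class safe {
public:
    constexpr safe() : value_{} {}   // safe default: value-initialized
    safe(uninitialized_t) {}         // explicit opt-out: value_ is left indeterminate
    constexpr safe(T v) : value_(v) {}
    constexpr operator T() const { return value_; }
    safe& operator=(T v) { value_ = v; return *this; }
private:
    T value_;
};

int sum_first(int n) {
    safe<int> total;                    // starts at 0 with no initializer written
    safe<int> scratch = uninitialized;  // the opt-out is visible in the source
    for (int i = 1; i <= n; ++i) {
        scratch = i;                    // written before it is ever read
        total = total + scratch;
    }
    return total;
}
```

A wrapper like this only covers variables that use it, which is why the
suggestion above is a language change rather than a library.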

What are the advantages?
1) Safe by default. If someone does a statistical survey, I'm confident
that after this change the average amount of time spent debugging C++ code
will go down.
2) Dangerous places are marked as such (= void), bringing attention and
careful scrutiny from the person reading the code.
3) Less boilerplate code. Particularly with constructors, I don't have to
write a bunch of stupid initialization code for my ints, floats, and
pointers.
4) Initialization behavior matches static and global variables. One less
"except when" for Herb Sutter to write about in GotW.

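Points 3 and 4 can be seen in today's rules. In this small sketch (all names
are hypothetical), objects with static storage duration already start at
zero, while members of fundamental type need an explicit initializer to get
the same guarantee:

```cpp
#include <cassert>

// Static storage duration: zero-initialized with no initializer written.
int global_counter;   // guaranteed 0
int* global_ptr;      // guaranteed nullptr

// Hypothetical type: today each member needs its own initializer.
struct Connection {
    int fd = 0;
    float timeout = 0.0f;
    const char* host = nullptr;
};

int next_id() {
    static int id;    // static local: also starts at 0
    return ++id;
}
```

Under the suggested rule, the three member initializers in `Connection`
would be the default and could be dropped.
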
Counter arguments:
1) This will slow down everyone's programs!

No, it won't. Compilers have been doing optimizations such as constant
propagation and dead store elimination for over 20 years.

That is, this code:

    int x = 0;
    x = 1;

Will be optimized to this:
    int x = 1;
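
Written out as two complete functions (the names are made up for this
sketch), the pair is observably identical. The standard does not promise
that the first store is removed, but an optimizing compiler is free to drop
it and typically does:

```cpp
#include <cassert>

int with_redundant_init() {
    int x = 0;  // dead store: overwritten before it is ever read
    x = 1;
    return x;
}

int without_redundant_init() {
    int x = 1;
    return x;
}
```

Comparing the generated assembly of the two functions at `-O1` or higher is
a quick way to check this for a particular compiler.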

2) This will break C compatibility!

No it won't. extern "C" code will still have the old behavior. There is no
C breakage here. The data being passed to and from C code is still the
same, regardless of whether or not it was initialized by the compiler.

3) Why do we need this? Compilers, static checkers, and debugging tools can
detect uninitialized use!

Not always, and these tools are not always available. For example, Valgrind
is unusable on large, resource-intensive code bases. Also, even if there is
a tool, why am I wasting my time checking this crap? I'd rather not be able
to easily write these bugs in the first place. Fix this and one *major*
class of bugs in C++ goes away forever.
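
Here is a sketch of the kind of code where a compiler warning is unlikely to
help. The function names are hypothetical, and the assumption is that
`parse_digit` lives in another translation unit, so the caller's compiler
cannot see whether `*out` was written:

```cpp
#include <cassert>

// Hypothetical parser: writes *out only on success.
bool parse_digit(char c, int* out) {
    if (c >= '0' && c <= '9') {
        *out = c - '0';
        return true;
    }
    return false;  // *out is left untouched
}

int digit_or_default(char c) {
    int value;  // indeterminate today
    if (!parse_digit(c, &value)) {
        value = -1;  // forgetting this line is the bug
    }
    return value;
}
```

With the fallback assignment deleted, the function would return an
indeterminate value for any non-digit input; under the suggested rule it
would return 0 instead.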

4) It will break legacy code!

In some cases yes, but let's take a deeper look at the possibilities here:

There are legacy code bases with real uninitialized variable bugs in them
today. Your company probably has 1 or 2 in its large code base and
miraculously it's still working fine. If all of a sudden these things get
fixed, it may change the behavior of your program, causing a "bug" in the
sense that production is now operating differently. What used to be
undefined behavior just got defined. I don't see this as a huge problem. If
you have bugs in your code, they need to be fixed. Also, I do not believe
this is a good enough reason to continue the subpar status quo forever.

Then there may be other cases, for example some kind of strange embedded
code or device drivers. Perhaps you instantiate an object on top of a
hardware memory address. If you are doing this, you are probably also
marking your variables as volatile, and those, as I proposed above, are
still uninitialized, so you won't be affected.

Maybe you're doing something really funky like putting your call stack on
some special memory, and default initialization will cause additional
writes which will cause your program to fail. If you're doing this kind of
crazy low-level stuff, then you should know enough to be able to fix your
code. Also, you will now be annotating these instances with = void or
volatile, and that has the additional benefit of saying in your code,
without a comment, "*hey I'm doing some funny stuff here with
initialization*".

5) I'm so good I don't write these kinds of bugs

Congratulations, good job. Your colleagues, however, do write these kinds
of bugs and will continue to do so until the end of time. Sometimes you may
even get to debug for them.

I really enjoy C++, for all of its warts. I believe that, unlike other
languages that come and go, C++ has staying power and will be around and
growing for a long time. C++ is the fastest and most versatile language on
the planet. I don't see everyone jumping ship anytime soon for a complete
rewrite language like D (sorry Andrei). We're stuck with C++, so let's make
it a better language.

Doing something like this would be huge. It would require a lot of analysis
to get right. My simple idea of = void (inspired by D, thanks Andrei) may
not work in all cases and will need to be fleshed out further.

Now please convince me, if you can, why this is a bad idea. Why should we
continue to inflict wasted hours of debugging sessions for these kinds of
silly, easy-to-write bugs on the future of C++? Can you think of any
possible reason uninitialized by default is good other than "maintaining
legacy code"?



--

---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-proposals@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/std-proposals/.
