Topic: Irrational Rules?
Author: "kanze" <kanze@gabi-soft.fr>
Date: Wed, 17 May 2006 11:48:27 CST Raw View
kuyper@wizard.net wrote:
> Francis Glassborow wrote:
> > In article <1147752080.707405.193860@v46g2000cwv.googlegroups.com>,
> > kuyper@wizard.net writes
> > >int main(void)
> > >{
> > > unsigned char c;
> > > return c-c;
> > >}
> > >That program would have well-defined behavior. It's a
> > >pointlessly complicated way of doing things, but it would
> > >be well-defined (In C99, such code is strictly conforming,
> > >which imposes much stricter requirements than merely
> > >"well-defined").
> > I am not convinced that it is strictly conforming. Are you
> > claiming it should always return 0? In reality it will
> > because the compiler will optimise away all the code and
> > simply convert it to reporting a successful execution. But
> > in theory c is indeterminate and a response to a DR in
> > Sydney was that indeterminate actually means that the value
> > is permitted to change momentarily to any other value in the
> > set of permitted values for an unsigned char.
> I was unaware of that DR, but as I said, I'm not arguing that
> code like this is a good idea, so I have no objection to that
> decision. I can't think of any legitimate use for reading
> uninitialized memory (though I know of a few illegitimate uses
> people have made of it).
In practice, I think that an implementation will have to support
copying it. Things like:
    struct
    {
        int count ;
        int values[ 1000 ] ;
    } ;

or

    struct
    {
        wchar_t id[ 40 ] ;
        // initialized with wstrcpy...
        // ...
    } ;
are far too prevalent for an implementation to allow them to
fail. (Think about the second case for a while -- it's required
to work with char[], at least in C++. Converting to use wchar_t
suddenly introduces undefined behavior?)
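A minimal sketch of the first pattern (the names are mine, purely for
illustration): only the first count elements are ever written, yet the
whole object, indeterminate tail included, gets copied.

    struct Buffer
    {
        int count ;
        int values[ 1000 ] ;
    } ;

    void f()
    {
        Buffer a ;
        a.count = 2 ;
        a.values[ 0 ] = 10 ;
        a.values[ 1 ] = 20 ;
        Buffer b( a ) ;     // member-wise copy: the 998 indeterminate ints come along
    }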
--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ]
Author: bop@gmb.dk ("Bo Persson")
Date: Wed, 17 May 2006 18:51:06 GMT Raw View
"Jeff Rife" <wevsr@nabs.net> skrev i meddelandet
news:MPG.1ed3ae90415e7c8c98a535@news.nabs.net...
>> (kuyper@wizard.net) wrote in comp.std.c++:
>
> Of *course* it would have negligible impact. Anything currently undefined
> can't (by definition) be used in legal code. Thus, changing the rules
> to make undefined behavior in some way legal (either unspecified or truly
> defined) obviously wouldn't impact any existing code. It also obviously
> wouldn't be a big deal for the compiler writer, as they wouldn't have to
> change a thing to stay compliant, since I'm sure the standard would allow
> an implementation to emit a diagnostic as one of the choices of the
> unspecified behavior.
And of course, the problem of requiring initialization of all
variables is that it would make C++ code slower than the original C
code. NOT a good idea!
>
> But, if a diagnostic isn't required, it would lead to a lot of poorly
> written code that relies on implementation-specific behaviors. I'm not
> anywhere close to the C++ committee, but I don't think this is the way
> they want to lean.
>
>> just as it is for C99 for the size-named types such as int32_t.
>
> ISTR these weren't defined by the standard before C99, so nobody was using
> them. As such, giving them a variance that other, older types didn't have
> wouldn't break existing code.
The catch here is that C99 defines int32_t *only* for implementations
without padding bits. That is, the systems that have trap
representations in C++ will *not* have int32_t defined in C99.
Is this really solving the problem??
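(For what it's worth, code that wants the exact-width type has to test for
its presence anyway; a sketch, relying on the C99 rule that the limit
macros are defined exactly when the corresponding type is provided -- the
typedef name fixed32 is only illustrative:)

    #include <stdint.h>     /* C99's header; a C++ compiler may additionally
                               want __STDC_LIMIT_MACROS defined to get the macros */

    #ifdef INT32_MAX
        typedef int32_t fixed32;         /* the exact-width type exists here */
    #else
        typedef int_least32_t fixed32;   /* always provided, possibly wider */
    #endif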
Bo Persson
Author: "kanze" <kanze@gabi-soft.fr>
Date: Wed, 17 May 2006 14:09:11 CST Raw View
Francis Glassborow wrote:
[...]
> What I do think we need is some form of behaviour that allows,
> at worst, that a process will be aborted but does not permit
> immediate reformatting of your hard drive.
This is part of a general problem. The C and C++ standards take
an all or nothing approach in a lot of places -- the code is
either legal, or you have undefined behavior. Thus, for
example:
    #include <iostream>

    int
    main()
    {
        std::cout << "Hello, world" << std::endl ;
    }
has undefined behavior -- as far as the standard is concerned,
it could reformat your hard disk. But we all know that in
practice, only two behaviors are possible (unless the
implementation goes way out of its way to cause us problems):
the code will work, or it will fail to compile. There's a very
real need in the library for a category for this; it's
completely ridiculous to get undefined behavior because you
forget to include a header. (Especially in a case like the
above, where almost all, if not all, implementations include it
indirectly from one of the headers you do include.)
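(To be concrete: the header in question here is presumably <ostream>, which
is where std::endl and the relevant operator<< overloads are declared; the
pedantically safe form of the program simply names it explicitly:)

    #include <iostream>
    #include <ostream>      // declares std::endl

    int
    main()
    {
        std::cout << "Hello, world" << std::endl ;
    }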
--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Author: kuyper@wizard.net
Date: Wed, 17 May 2006 15:45:56 CST Raw View
"Bo Persson" wrote:
> "Jeff Rife" <wevsr@nabs.net> skrev i meddelandet
.
> And of course, the problem of requiring initialization of all
> variables is that it would make C++ code slower than the original C
> code. NOT a good idea!
Initialization of variables would only be required if uninitialized
values could actually cause problems. If the requirement that
uninitialized variables contain valid values were restricted to unsigned
integer types, do you know of any architectures where that would be a
problematic requirement?
> Is this really solving the problem??
As far as I know, there's no real problem to be solved here. This
started with the claim that the relevant rule was irrational, but I
don't think anyone has suggested that this "irrational" rule actually
causes a problem.
Author: "mark" <markw65@gmail.com>
Date: Wed, 17 May 2006 17:10:34 CST Raw View
Jeff Rife wrote:
> (kuyper@wizard.net) wrote in comp.std.c++:
> > > Well, of course it doesn't *have* to have undefined behavior. The C++
> > > standard could have said something like "all objects not explicitly
> > > initialized by code will be default initialized by the implementation".
> >
> > That's not what I meant. I meant that for unsigned types, at least, the
> > behavior could be made unspecified, rather than undefined, with
> > negligible impact on virtually all existing implementations of C++,
>
> Of *course* it would have negligible impact. Anything currently undefined
> can't (by definition) be used in legal code. Thus, changing the rules
> to make undefined behavior in some way legal (either unspecified or truly
> defined) obviously wouldn't impact any existing code. It also obviously
> wouldn't be a big deal for the compiler writer, as they wouldn't have to
> change a thing to stay compliant, since I'm sure the standard would allow
> an implementation to emit a diagnostic as one of the choices of the
> unspecified behavior.
A compiler is allowed to assume that undefined behavior does not occur.
It cannot make the same assumption about unspecified behavior.
This change could easily make a conforming implementation
non-conforming. (I know of at least one that would be affected).
Mark Williams
Author: hyrosen@mail.com (Hyman Rosen)
Date: Thu, 18 May 2006 05:58:34 GMT Raw View
kuyper@wizard.net wrote:
> As far as I know, there's no real problem to be solved here. This
> started with the claim that the relevant rule was irrational, but I
> don't think anyone has suggested that this "irrational" rule actually
> causes a problem.
In response to this topic, I have previously posted a data structure
which depends on this behavior for efficiency. It looks like this:
template <unsigned N>
struct set
{
    unsigned count, dense[N], sparse[N];
    set() : count(0) { }
    bool contains(unsigned k) const {
        return k < N && sparse[k] < N && dense[sparse[k]] == k;
    }
    void insert(unsigned k) {
        if (k < N && !contains(k)) {
            dense[count] = k;
            sparse[k] = count++;
        }
    }
    void erase(unsigned k) {        // removes k ('delete' is a reserved word)
        if (contains(k)) {
            unsigned i = sparse[k];
            dense[i] = dense[--count];
            sparse[dense[i]] = i;
        }
    }
    void clear() {
        count = 0;
    }
    set(const set &o) : count(0) {
        for (unsigned i = 0; i < o.count; ++i)
            insert(o.dense[i]);
    }
    set &operator=(const set &o) {
        if (this != &o) {
            clear();
            for (unsigned i = 0; i < o.count; ++i)
                insert(o.dense[i]);
        }
        return *this;
    }
};
Notice that the dense and sparse arrays never need to be
initialized, but the membership test depends upon being
able to read consistent values out of uninitialized spots.
If N is large but count is typically small and these objects
are frequently created, then initializing the arrays could
waste lots and lots of time. (And large N with small count
is what this data structure is for.)
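A quick usage sketch (the member names are as above; the function and the
small N are only illustrative):

    void example()
    {
        set<1000> s;                // dense[] and sparse[] are left uninitialized
        s.insert(42);
        s.insert(17);
        if (s.contains(42)) { /* found */ }
        s.erase(17);
        s.clear();                  // O(1): only count is reset
    }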
Author: kuyper@wizard.net
Date: Thu, 18 May 2006 00:55:23 CST Raw View
Jeff Rife wrote:
> (kuyper@wizard.net) wrote in comp.std.c++:
.
> > That's not what I meant. I meant that for unsigned types, at least, the
> > behavior could be made unspecified, rather than undefined, with
> > negligible impact on virtually all existing implementations of C++,
>
> Of *course* it would have negligible impact. Anything currently undefined
> can't (by definition) be used in legal code. Thus, changing the rules
> to make undefined behavior in some way legal (either unspecified or truly
> defined) obviously wouldn't impact any existing code.
I was talking about existing implementations, not existing code.
Changing undefined behavior to defined behavior inherently means that
any implementation that doesn't currently provide that defined behavior
would have to be re-written to remain conforming, even if there isn't
any code currently making use of that feature.
When I say that the impact would be negligible, I'm saying that I think
that the number of implementations that don't already provide behavior
consistent with such a requirement is small. For the few
implementations where a rewrite would be required, it should be a
relatively simple change.
> ... It also obviously
> wouldn't be a big deal for the compiler writer, as they wouldn't have to
> change a thing to stay compliant, since I'm sure the standard would allow
> an implementation to emit a diagnostic as one of the choices of the
> unspecified behavior.
The standard allows issuing a diagnostic for any reason an
implementation chooses, so of course that would be legal. It's also
legal for an implementation to issue a diagnostic for
int main() {}
However, I'm not talking about an open-ended lack of specification for
the behavior; that's exactly what "undefined behavior" is. I'm talking
about an unspecified choice from a finite list of possibilities. In
this particular case, if the standard were changed to say that
uninitialized variables (at least of certain types) have an unspecified
but valid value, the finite list of possibilities is the finite number
of different values that variable could possess. Whichever one is
chosen has to be printed out by the program. That's very different
from undefined behavior, which would allow the program to abort(), or
print out the address of the Vice President's secret bunker, or format
your hard disk.
> But, if a diagnostic isn't required, it would lead to a lot of poorly
> written code that relies on implementation-specific behaviors. I'm not
> anywhere close to the C++ committee, but I don't think this is the way
> they want to lean.
I'm not suggesting that this is a good idea; as I've said before, I
can't conceive of any legitimate use for such a feature. I'm just
objecting to the claim that it has to be undefined behavior. It
doesn't, and I believe that the cost of implementing such a requirement
would be small.
> > just as it is for C99 for the size-named types such as int32_t.
>
> ISTR these weren't defined by the standard before C99, so nobody was using
> them. As such, giving them a variance that other, older types didn't have
> wouldn't break existing code.
True. However, C99 also prohibits the character types from trapping,
and those did exist long before C99.
Author: kuyper@wizard.net
Date: Thu, 18 May 2006 10:20:31 CST Raw View
Hyman Rosen wrote:
> kuyper@wizard.net wrote:
> > As far as I know, there's no real problem to be solved here. This
> > started with the claim that the relevant rule was irrational, but I
> > don't think anyone has suggested that this "irrational" rule actually
> > causes a problem.
>
> In response to this topic, I have previously posted a data structure
> which depends on this behavior for efficiency. It looks like this:
>
> template <unsigned N>
> struct set
> {
>     unsigned count, dense[N], sparse[N];
>     set() : count(0) { }
>     bool contains(unsigned k) const {
>         return k < N && sparse[k] < N && dense[sparse[k]] == k;
I think sparse[k]<N should be sparse[k]<count; otherwise, your set is
capable of claiming that it contains values that have never been
inserted into it, and clear()ing the set wouldn't reduce the number of
values it contains.
>     }
>     void insert(unsigned k) {
>         if (k < N && !contains(k)) {
>             dense[count] = k;
>             sparse[k] = count++;
>         }
>     }
>     void erase(unsigned k) {
>         if (contains(k)) {
>             unsigned i = sparse[k];
>             dense[i] = dense[--count];
>             sparse[dense[i]] = i;
>         }
>     }
>     void clear() {
>         count = 0;
>     }
>     set(const set &o) : count(0) {
>         for (unsigned i = 0; i < o.count; ++i)
>             insert(o.dense[i]);
>     }
>     set &operator=(const set &o) {
>         if (this != &o) {
>             clear();
>             for (unsigned i = 0; i < o.count; ++i)
>                 insert(o.dense[i]);
>         }
>         return *this;
>     }
> };
>
> Notice that the dense and sparse arrays never need to be
> initialized, but the membership test depends upon being
> able to read consistent values out of uninitialized spots.
> If N is large but count is typically small and these objects
> are frequently created, then initializing the arrays could
> waste lots and lots of time. (And large N with small count
> is what this data structure is for.)
With my suggested modification, I'm willing to concede that this is an
example of code that would benefit from such a change in the standard.
I'd be inclined to use a std::bitset<N> to keep track of which numbers
were contained in your set, which would take up a lot less memory, at
the cost of requiring initialization of that much smaller piece of
memory. I'm not sure how it would compare for speed.
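Roughly what I have in mind -- a sketch only, and the name bitset_set is
just for illustration; the price is that the bitset itself is
zero-initialized and clear() becomes O(N):

    #include <bitset>

    template <unsigned N>
    struct bitset_set
    {
        std::bitset<N> present;     // zero-initialized by its constructor
        bool contains(unsigned k) const { return k < N && present[k]; }
        void insert(unsigned k)   { if (k < N) present[k] = true; }
        void erase(unsigned k)    { if (k < N) present[k] = false; }
        void clear()              { present.reset(); }   // O(N), unlike the O(1) clear above
    };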
Author: Hyman Rosen <hyrosen@mail.com>
Date: Thu, 18 May 2006 11:18:06 CST Raw View
kuyper@wizard.net wrote:
> I'd be inclined to use a std::bitset<N> to keep track of which numbers
> were contained in your set, which would take up a lot less memory, at
> the cost of requiring initialization of that much smaller piece of
> memory. I'm not sure how it would compare for speed.
If you also need to iterate over the members, std::bitset<N>
will take time proportional to N while this data structure
takes only count steps. This structure is designed to optimize
for N >> count.
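That is, visiting the members is just a walk over the first count entries
of dense[]; a sketch (the helper name is mine):

    template <unsigned N, typename Visit>
    void for_each_member(const set<N> &s, Visit visit)
    {
        for (unsigned i = 0; i < s.count; ++i)   // count steps, independent of N
            visit(s.dense[i]);
    }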
Author: Hyman Rosen <hyrosen@mail.com>
Date: Thu, 18 May 2006 11:29:46 CST Raw View
Hyman Rosen wrote:
>     bool contains(unsigned k) const {
>         return k < N && sparse[k] < N && dense[sparse[k]] == k;
>     }
Oops, sorry. That should be

    bool contains(unsigned k) const {
        return k < N && sparse[k] < count && dense[sparse[k]] == k;
        //                          ^^^^^
    }
Author: "ThosRTanner" <ttanner2@bloomberg.net>
Date: Fri, 12 May 2006 12:17:43 CST Raw View
"Tom s" wrote:
> Let's say we have a 16-Bit unsigned integer.
>
> It can store 65 536 unique values.
>
> It can store from 0 to 65535 inclusive.
>
> It has no "invalid" bit patterns -- each of them is a valid number.
>
> So why does the following code exhibit Undefined Behaviour?:
>
> #include <iostream>
>
> int main()
> {
> unsigned k;
>
> std::cout << k;
> }
>
How do you know what the stack is initialised to? Is there any
requirement that an uninitialised variable should have the same value
in 2 different runs?
Author: "David R Tribble" <david@tribble.com>
Date: 13 May 2006 15:50:02 GMT Raw View
Tom s wrote:
> Irrational Rule 1:
> With regards to unions, you can only read from the member which
> you last wrote to.
Yes, that is correct, because unions were designed to conserve space
by allowing multiple values (or differing types) to occupy the same
space, and for that space to be imbued with only one value (and type)
at a time. Thus you write a value of a certain type into the space,
and you can only read that same value and type out of the space.
Obviously, unions are used extensively as a "covert" way of working
around the type system. But that's not how the standard language
defines their use.
On the other hand, every object can be viewed as a sequence of
unsigned chars (a.k.a. bytes), so that things like malloc(), new,
and memcpy() work as expected.
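For instance (a sketch, with purely illustrative names), the blessed way to
look at or move an object's bytes is through unsigned char:

    #include <cstring>

    void bytes_example()
    {
        unsigned long u = 0x0A0B0C0D;

        // View the object representation in place...
        const unsigned char *p = reinterpret_cast<const unsigned char *>(&u);

        // ...or copy it somewhere else, byte by byte.
        unsigned char buf[sizeof u];
        std::memcpy(buf, &u, sizeof u);

        (void)p;
    }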
-drt
Author: Ron House <house@usq.edu.au>
Date: Sun, 14 May 2006 23:56:06 CST Raw View
NULL@NULL.NULL wrote:
> Let's say we have a 16-Bit unsigned integer.
>
> It can store 65 536 unique values.
>
> It can store from 0 to 65535 inclusive.
>
> It has no "invalid" bit patterns -- each of them is a valid number.
>
> So why does the following code exhibit Undefined Behaviour?:
>
> #include <iostream>
>
> int main()
> {
> unsigned k;
>
> std::cout << k;
> }
Because it might print 7, it might print 2916, it might print...
--
Ron House house@usq.edu.au
http://www.sci.usq.edu.au/staff/house
Author: Ron House <house@usq.edu.au>
Date: Mon, 15 May 2006 00:07:14 CST Raw View
NULL@NULL.NULL wrote:
> Yes, but my point is that the following code should be harmless:
>
> int main()
> {
> int *p = 0;
>
> *p;
This works out the value of *p and throws it away. A compiler is
entitled to actually do it, because that is what you said to do, even if
most will optimise it away. I have no desire that compiler writers be
hampered so slovenly programmers can write rubbish. Code like that needs
to be found and eliminated, and I thank any standard that allows a
compiler to object.
--
Ron House house@usq.edu.au
http://www.sci.usq.edu.au/staff/house
Author: NULL@NULL.NULL ("Tom s")
Date: Mon, 15 May 2006 15:43:00 GMT Raw View
Ron House posted:

> This works out the value of *p and throws it away. A compiler is
> entitled to actually do it, because that is what you said to do, even if
> most will optimise it away. I have no desire that compiler writers be
> hampered so slovenly programmers can write rubbish. Code like that needs
> to be found and eliminated, and I thank any standard that allows a
> compiler to object.

With the advent of references, there are plenty of examples where memory
need not be accessed even when using the dereference operator:

    int &k = *new int;

    return *this;

-Tomás
Author: kuyper@wizard.net
Date: Mon, 15 May 2006 11:03:04 CST Raw View
Ron House wrote:
> NULL@NULL.NULL wrote:
> > Let's say we have a 16-Bit unsigned integer.
> >
> > It can store 65 536 unique values.
> >
> > It can store from 0 to 65535 inclusive.
> >
> > It has no "invalid" bit patterns -- each of them is a valid number.
> >
> > So why does the following code exhibit Undefined Behaviour?:
> >
> > #include <iostream>
> >
> > int main()
> > {
> > unsigned k;
> >
> > std::cout << k;
> > }
>
> Because it might print 7, it might print 2916, it might print...
You're missing the point. If that was the only issue, there's no reason
why that code couldn't have the standard-defined behavior of printing
out an unspecified value somewhere within the range 0 to UINT_MAX. When
the behavior is undefined, it allows many other options. It allows the
program to print out profanity, format your hard disk, or more
probably in this case, to abort your program, possibly with a
diagnostic. The question is, why does 4.1p1 make the behavior
undefined?
Code like this doesn't have to have undefined behavior. The C standard,
for instance, doesn't handle this the same way. It says that the
initial value is indeterminate. An indeterminate value can be either an
unspecified element of the set of valid values, or a trap
representation. Any program that attempts to use the value of an object
containing a trap representation has undefined behavior. So far, it
sounds like a more complicated way of saying exactly the same thing,
but here's the key difference: the C99 standard guarantees that some
types, including unsigned char and the size-named types that were
introduced in C99, do not have any trap representations. The standard
seems to allow signed char and plain char (if signed) to have trap
representations, but if so, they are prohibited from trapping.
Therefore, while code equivalent to the above, but using printf()
instead of cout <<, would still have undefined behavior in C, a very
minor change would change that from undefined to unspecified: replace
"unsigned" with any of the size-named types, such as "uint_fast16_t",
and pass the appropriate format specifier to printf().
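In other words, the C99-flavoured version of the example would look roughly
like this (rendered here in C++ syntax, with <cstdint> standing in for
C99's <stdint.h>; the comments restate the argument above, not the current
C++ rule):

    #include <cstdint>
    #include <iostream>

    int main()
    {
        std::uint_fast16_t k;    // a size-named type: C99 gives it no trap representations
        std::cout << k << '\n';  // merely an unspecified value under those rules;
                                 // still undefined behavior under C++ as it stands
    }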
Author: wevsr@nabs.net (Jeff Rife)
Date: Mon, 15 May 2006 20:47:53 GMT Raw View
(kuyper@wizard.net) wrote in comp.std.c++:
> > > #include <iostream>
> > >
> > > int main()
> > > {
> > > unsigned k;
> > >
> > > std::cout << k;
> > > }
> >
> > Because it might print 7, it might print 2916, it might print...
>
> Code like this doesn't have to have undefined behavior.
Well, of course it doesn't *have* to have undefined behavior. The C++
standard could have said something like "all objects not explicitly
initialized by code will be default initialized by the implementation".
But without wording like this, there is no way to know *what* is in the
bits of the memory occupied by k in the sample. Thus, even if the
implementation were to just take the bits and treat them as a valid
unsigned int and output a text representation of that, what would be
output would be essentially random. Since there is no way to know what
will be output, it's undefined behavior.
Even if an implementation goes beyond the standard and "defines" what
happens in this case, the code won't always result in the same output
for every implementation, and that's what makes it "undefined".
If the standard said something like "an implementation must define what
happens when an uninitialized object is accessed", then the behavior
would be defined but unspecified (since it is up to the implementation).
In that case, even though the resulting output can still be different
on different implementations, the behavior would not be undefined because
the standard would have defined it as "implementation defined". It's
easy to see why the standard doesn't say this, since it would make it
very, very hard to write portable code.
--
Jeff Rife | "My God, what if the secret ingredient is people?"
| "No, there's already a soda like that: Soylent Cola."
| "Oh. How is it?"
| "It varies from person to person."
| -- Fry and Leela, "Futurama"
Author: kuyper@wizard.net
Date: Mon, 15 May 2006 23:37:34 CST Raw View
Jeff Rife wrote:
> (kuyper@wizard.net) wrote in comp.std.c++:
> > > > #include <iostream>
> > > >
> > > > int main()
> > > > {
> > > > unsigned k;
> > > >
> > > > std::cout << k;
> > > > }
> > >
> > > Because it might print 7, it might print 2916, it might print...
> >
> > Code like this doesn't have to have undefined behavior.
>
> Well, of course it doesn't *have* to have undefined behavior. The C++
> standard could have said something like "all objects not explicitly
> initialized by code will be default initialized by the implementation".
That's not what I meant. I meant that for unsigned types, at least, the
behavior could be made unspecified, rather than undefined, with
negligible impact on virtually all existing implementations of C++,
just as it is for C99 for the size-named types such as int32_t.
> But without wording like this, there is no way to know *what* is in the
> bits of the memory occupied by k in the sample. Thus, even if the
> implementation were to just take the bits and treat them as a valid
> unsigned int and output a text representation of that, what would be
> output would be essentially random. Since there is no way to know what
> will be output, it's undefined behavior.
No. If that were the only issue, it could be unspecified behavior, not
undefined. Unspecified behavior means that there's a list of possible
choices (in this case, the list of valid values for unsigned int), and
an implementation can chose any one of them. Undefined behavior is much
more open-ended.
> Even if an implementation goes beyond the standard and "defines" what
> happens in this case, the code won't always result in the same output
> for every implementation, and that's what makes it "undefined".
No, behavior that is different for different runs is not in itself
enough reason to make something undefined. Unspecified and
implementation-defined behavior can also be time-dependent.
> If the standard said something like "an implementation must define what
> happens when an uninitialized object is accessed", then the behavior
> would be defined but unspecified (since it is up to the implementation).
> In that case, even though the resulting output can still be different
> on different implementations, the behavior would not be undefined because
> the standard would have defined it as "implementation defined". It's
> easy to see why the standard doesn't say this, since it would make it
> very, very hard to write portable code.
There's no meaningful way to use the value of uninitialized variables
in portable code, with or without such a change. The only difference
making the behavior unspecified would have would be to render code
which uses the value in a meaningless fashion be just as portable as
code which doesn't use it at all. I'm not arguing that this is a good
thing, merely pointing out that the only difference such a change would
make is to make it easier to write portable code. For instance:
    int main(void)
    {
        unsigned char c;
        return c-c;
    }
That program would have well-defined behavior. It's a pointlessly
complicated way of doing things, but it would be well-defined (In C99,
such code is strictly conforming, which imposes much stricter
requirements than merely "well-defined").
Author: Francis Glassborow <francis@robinton.demon.co.uk>
Date: Tue, 16 May 2006 10:47:19 CST Raw View
In article <1147752080.707405.193860@v46g2000cwv.googlegroups.com>,
kuyper@wizard.net writes
>int main(void)
>{
> unsigned char c;
> return c-c;
>}
>
>That program would have well-defined behavior. It's a pointlessly
>complicated way of doing things, but it would be well-defined (In C99,
>such code is strictly conforming, which imposes much stricter
>requirements than merely "well-defined").
I am not convinced that it is strictly conforming. Are you claiming it
should always return 0? In reality it will because the compiler will
optimise away all the code and simply convert it to reporting a
successful execution. But in theory c is indeterminate and a response to
a DR in Sydney was that indeterminate actually means that the value is
permitted to change momentarily to any other value in the set of
permitted values for an unsigned char. Consider:
    int main(void){
        unsigned char c[1000000];
        return c[0] - c[0];
    }
On the abstract machine c[0] is read twice, and so in practice it could
also be read twice (yes, no usable compiler does that). Between the first
and second read the process might be suspended and its memory paged out.
For efficiency reasons we should not require the process to save 1000000
bytes of indeterminate values, and we decided not to require that. On
resumption c[0] might contain some other 'garbage' value.
What I do think we need is some form of behaviour that allows, at worst,
that a process will be aborted but does not permit immediate
reformatting of your hard drive.
--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects
Author: johnchx2@yahoo.com
Date: Tue, 16 May 2006 11:20:22 CST Raw View
kuyper@wizard.net wrote:
> There's no meaningful way to use the value of uninitialized variables
> in portable code, with or without such a change. The only difference
> making the behavior unspecified would have would be to render code
> which uses the value in a meaningless fashion be just as portable as
> code which doesn't use it at all.
Yes and no. The change makes code "more portable" not by increasing
the number of architectures on which it will work as expected, but by
reducing the number of architectures for which a fully conforming C++
implementation is possible.
In other words, we make accessing uninitialized integers "portable" by
outlawing architectures with trapping integer representations. I'm not
sure that this is a kind of "portability" that's worth having.
Author: kuyper@wizard.net
Date: Tue, 16 May 2006 13:03:01 CST Raw View
johnchx2@yahoo.com wrote:
> kuyper@wizard.net wrote:
>
> > There's no meaningful way to use the value of uninitialized variables
> > in portable code, with or without such a change. The only difference
> > making the behavior unspecified would have would be to render code
> > which uses the value in a meaningless fashion be just as portable as
> > code which doesn't use it at all.
>
> Yes and no. The change makes code "more portable" not by increasing
> the number of architectures on which it will work as expected, but by
> reducing the number of architectures for which a fully conforming C++
> implementation is possible.
Actually, I was responding to a comment that such a change would make
it harder to write portable code. I said it would actually be easier.
Whether a given change makes it harder or easier to write portable code
is a different issue from whether or not the change makes the code more
or less portable. Changes which makes things easier for the developer
often make things harder for the implementor, for precisely the reason
that you've noted.
> In other words, we make accessing uninitialized integers "portable" by
> outlawing architectures with trapping integer representations. I'm not
> sure that this is a kind of "portability" that's worth having.
Insofar as I'm arguing for this to be allowed (which isn't much), I'm
arguing that C++ should match C99's requirements. Pretty much the only
types that C99 doesn't allow to trap are the character types and the
new size-named types such as int8_t, uint_fast16_t, int_least32_t, etc.
The exact-sized types are optional, and the mandatory fast/least-sized
types are allowed to have an actual size which is bigger than their
named size, which gives implementors a lot of freedom in figuring out
how to implement them. Are there many (any?) architectures where such a
requirement would be the only barrier to creating a fully conforming
implementation of C++? If there are, then those architectures will also
be unable to support a fully conforming implementation of C99.
Author: wevsr@nabs.net (Jeff Rife)
Date: Wed, 17 May 2006 06:04:20 GMT Raw View
(kuyper@wizard.net) wrote in comp.std.c++:
> > Well, of course it doesn't *have* to have undefined behavior. The C++
> > standard could have said something like "all objects not explicitly
> > initialized by code will be default initialized by the implementation".
>
> That's not what I meant. I meant that for unsigned types, at least, the
> behavior could be made unspecified, rather than undefined, with
> negligible impact on virtually all existing implementations of C++,
Of *course* it would have negligible impact. Anything currently undefined
can't (by definition) be used in legal code. Thus, changing the rules
to make undefined behavior in some way legal (either unspecified or truly
defined) obviously wouldn't impact any existing code. It also obviously
wouldn't be a big deal for the compiler writer, as they wouldn't have to
change a thing to stay compliant, since I'm sure the standard would allow
an implementation to emit a diagnostic as one of the choices of the
unspecified behavior.
But, if a diagnostic isn't required, it would lead to a lot of poorly
written code that relies on implementation-specific behaviors. I'm not
anywhere close to the C++ committee, but I don't think this is the way
they want to lean.
> just as it is for C99 for the size-named types such as int32_t.
ISTR these weren't defined by the standard before C99, so nobody was using
them. As such, giving them a variance that other, older types didn't have
wouldn't break existing code.
--
Jeff Rife |
| http://www.nabs.net/Cartoons/RhymesWithOrange/CatBed.jpg
Author: kuyper@wizard.net
Date: Wed, 17 May 2006 01:05:47 CST Raw View
Francis Glassborow wrote:
> In article <1147752080.707405.193860@v46g2000cwv.googlegroups.com>,
> kuyper@wizard.net writes
> >int main(void)
> >{
> > unsigned char c;
> > return c-c;
> >}
> >
> >That program would have well-defined behavior. It's a pointlessly
> >complicated way of doing things, but it would be well-defined (In C99,
> >such code is strictly conforming, which imposes much stricter
> >requirements than merely "well-defined").
>
>
> I am not convinced that it is strictly conforming. Are you claiming it
> should always return 0? In reality it will because the compiler will
> optimise away all the code and simply convert it to reporting a
> successful execution. But in theory c is indeterminate and a response to
> a DR in Sydney was that indeterminate actually means that the value is
> permitted to change momentarily to any other value in the set of
> permitted values for an unsigned char.
I was unaware of that DR, but as I said, I'm not arguing that code like
this is a good idea, so I have no objection to that decision. I can't
think of any legitimate use for reading uninitialized memory (though I
know of a few illegitimate uses people have made of it).
.
> What I do think we need is some form of behaviour that allows, at worst,
> that a process will be aborted but does not permit immediate
> reformatting of your hard drive.
I agree - we need more distinctions between different levels of
conformance, and that strikes me as a reasonable level.
Author: NULL@NULL.NULL ("Tom s")
Date: Tue, 9 May 2006 19:59:43 GMT Raw View
From time to time, I've come across rules in the Standard which I find to be
irrational (I hasten to use the word "stupid"). Has anyone else had such
thoughts about particular limitations set forth in the Standard? If so,
please post here about them, and, if possible, post some code supporting
your view. I'm not out-right condemning certain rules (as I realise I may
turn out to be wrong in the end), but rather I would like to discuss their
rationality (if any). I'll begin:
Irrational Rule 1: With regards to unions, you can only read from the
------------------ member which you last wrote to.
Some nice code that flouts Irrational Rule 1:
---------------------------------------------
/* This code is for reversing an object's bytes */

#include <cstddef>

template<class T, std::size_t len>
std::size_t NumElem( const T (&array)[len] )
{
    return len;
}

#include <algorithm>

template<class T>
T ByteReversal(T i)
{
    union Transformer {
        T entire;
        unsigned char bytes[ sizeof(T) ];
    };

    Transformer &trfr = reinterpret_cast<Transformer &>(i);

    for( unsigned char *p_start = &trfr.bytes[0],
                       *p_end = &trfr.bytes[ NumElem(trfr.bytes) - 1 ];
         p_start != p_end;
         ++p_start, --p_end )
    {
        std::swap( *p_start, *p_end );

        /* Test if they're right beside each other */
        if ( 1 == (p_end - p_start) ) break;
    }

    return trfr.entire;
}

#include <iostream>
#include <ostream>

int main()
{
    unsigned long i = 0x0A0B0C0D;

    std::cout << std::hex << i << std::endl;
    std::cout << std::hex << ByteReversal(i) << std::endl;
}

Irrational Rule 2: You can have a pointer to one past the end of
------------------ an array, but not if you make an l-value in the process.

Demonstrative code for Irrational Rule 2:
---------------------------------------------

int main()
{
    int array[5];

    /* The following is okay */
    int *p1 = &array[0] + 5;

    /* But the following is NOT okay */
    int *p2 = &array[5];

    /* I just don't see the logic! */
}

Big deal if you make an l-value in the process -- what harm is it?

-Tomás
Author: johnchx2@yahoo.com
Date: 9 May 2006 21:50:02 GMT Raw View
"Tom s" wrote:
> Irrational Rule 1: With regards to unions, you can only read from the
> ------------------ member which you last wrote to.
I've been told that the intent was to permit implementations which trap
when the "wrong" member of the union is read. You can accomplish the
"type-punning" you want with pointer casts.
Your first example can be re-written as:
#include <iostream>
#include <algorithm>

template <class T>
T& ReverseInPlace( T& t )
{
    char* begin = reinterpret_cast<char*>(&t);
    char* end = begin + sizeof(T);
    std::reverse( begin, end );
    return t;
}

int main()
{
    unsigned long i = 0x0A0B0C0D;
    std::cout << std::hex
              << i
              << std::endl
              << ReverseInPlace( i )
              << std::endl;
}
I'm not sure I see what adding a union to the mix would add.
> Irrational Rule 2: You can have a pointer to one past the end of
> ------------------ an array, but not if you make an l-value in the process.
This has been under discussion for years. It looks like the rule will
be changed, but the committee is still polishing the exact language.
See:
http://www2.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#232
Author: "Aaron Graham" <atgraham@gmail.com>
Date: Tue, 9 May 2006 21:35:18 CST Raw View
> Some nice code that flouts Irrational Rule 1:
Nice code?? This code is terrible. Not only will it go into an
infinite loop on odd-sized objects, the use of the union is entirely
superfluous. The object byte reversal can be performed trivially with
already #included functionality:
    unsigned char* v = reinterpret_cast<unsigned char*>(&i);
    std::reverse(v, v + sizeof(T));
    return i;
Besides, your usage of a union is not portable. Section 9.5.1 says,
"The size of a union is sufficient to contain the largest of its data
members." That means it can't be smaller than T. But (at least from
what I can tell) it can be larger, and thus the "bytes" field isn't
required to coincide in memory.
Those issues (and more) aside, no, I don't think that type-safe unions
are irrational.
Author: cbarron413@adelphia.net (Carl Barron)
Date: Wed, 10 May 2006 06:41:47 GMT Raw View
In article <1147208310.959224.11150@j33g2000cwa.googlegroups.com>,
Aaron Graham <atgraham@gmail.com> wrote:
> Besides, your usage of a union is not portable. Section 9.5.1 says,
> "The size of a union is sufficient to contain the largest of its data
> members." That means it can't be smaller than T. But (at least from
> what I can tell) it can be larger, and thus the "bytes" field isn't
> required to coincide in memory.
9.5/1 [is 9.5.1 in c++98] also says this: "... The size of a union is
sufficient to contain the largest of its data members. Each data member
is allocated as if it were the sole member of a struct."

9.2/18 states [is 9.2.17 in c++98]:

    A pointer to a POD-struct object, suitably converted using a
    reinterpret_cast, points to its initial member (or if that member is
    a bit-field, then to the unit in which it resides) and vice versa.
    [ Note: There might therefore be unnamed padding within a POD-struct
    object, but not at its beginning, as necessary to achieve appropriate
    alignment. -- end note ]

Conclusion: the offset of each member of a union is zero, so with
appropriate casts the OP's code should work.
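Concretely, something along these lines (a sketch only; ByteReversal2 is
just an illustrative name, and Transformer is as in the original post):

    #include <algorithm>

    template<class T>
    T ByteReversal2(T i)
    {
        union Transformer {
            T entire;
            unsigned char bytes[ sizeof(T) ];
        };

        Transformer trfr;
        trfr.entire = i;

        // Per the passages quoted above, a pointer to the union, suitably
        // converted, points to its first member, i.e. the bytes start at offset 0.
        unsigned char *p = reinterpret_cast<unsigned char *>(&trfr);
        std::reverse(p, p + sizeof(T));

        return trfr.entire;
    }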
Author: NULL@NULL.NULL ("Tom s")
Date: Wed, 10 May 2006 15:31:07 GMT Raw View
Aaron Graham posted:

>> Some nice code that flouts Irrational Rule 1:
>
> Nice code?? This code is terrible. Not only will it go into an
> infinite loop on odd-sized objects, the use of the union is entirely
> superfluous.

I didn't know about "std::reverse", so I concede the superfluous point.

Infinite loop on odd-sized objects though? Let's try it out.

    struct OddSizedType { char a[5]; };

Let's assume that an object of this structure is located at 0x0000, and thus
that its last byte is located at 0x0004.

Let's take a look at the loop:

    for( unsigned char *p_start = &trfr.bytes[0],
                       *p_end = &trfr.bytes[ NumElem(trfr.bytes) - 1 ];
         p_start != p_end;
         ++p_start, --p_end )
    {
        std::swap( *p_start, *p_end );

        /* Test if they're right beside each other */
        if ( 1 == (p_end - p_start) ) break;
    }

    return trfr.entire;

First iteration:
----------------

"p_start" starts off at 0x0000.
"p_end" starts off at 0x0004.
These addresses are compared and found to be not equal.
The loop body is entered.
The byte at 0x0000 is swapped with the byte at 0x0004.
The distance between the bytes is determined not to be 1.
"p_start" is incremented to 0x0001.
"p_end" is decremented to 0x0003.

Second iteration:
----------------

"p_start" starts off at 0x0001.
"p_end" starts off at 0x0003.
These addresses are compared and found to be not equal.
The loop body is entered.
The byte at 0x0001 is swapped with the byte at 0x0003.
The distance between the bytes is determined not to be 1.
"p_start" is incremented to 0x0002.
"p_end" is decremented to 0x0002.

Third iteration:
----------------

"p_start" starts off at 0x0002.
"p_end" starts off at 0x0002.
These addresses are compared and found to be equal, thus terminating the
loop.

If I've missed something, please tell me.

> The object byte reversal can be performed trivially with
> already #included functionality:
>
>     unsigned char* v = reinterpret_cast<unsigned char*>(&i);
>     std::reverse(v, v + sizeof(T));
>     return i;
>
> Besides, your usage of a union is not portable. Section 9.5.1 says,
> "The size of a union is sufficient to contain the largest of its data
> members." That means it can't be smaller than T. But (at least from
> what I can tell) it can be larger, and thus the "bytes" field isn't
> required to coincide in memory.

See Carl Bannon's post in the thread.

-Tomás
Author: "Tom s" <NULL@NULL.NULL>
Date: Wed, 10 May 2006 11:27:40 CST Raw View
"Tom s" posted:
[...]
>
> See Carl Bannon's post in the thread.
Wups, sorry for the name mangling. Should be Carl Barron.
-Tomás
Author: "ThosRTanner" <ttanner2@bloomberg.net>
Date: Wed, 10 May 2006 11:48:16 CST Raw View
"Tom s" wrote:
> From time to time, I've come across rules in the Standard which I find to be
> irrational (I hasten to use the word "stupid"). Has anyone else had such
> thoughts about particular limitations set forth in the Standard? If so,
> please post here about them, and, if possible, post some code supporting
> your view.
I dislike the rule which specifically stops you creating types inside
anonymous unions, so that this is disallowed:

    union
    {
        struct
        {
            unsigned field1 : 5;
            unsigned field2 : 10;
            unsigned field3 : 17;
        } data;
        unsigned all_bits;
    } some_useful_name;
It's an extremely useful construct when dealing with both hardware
registers and bit-packed data, and it seems pointless that it is
disallowed. I realise you can code round it by defining the struct
outside the union, but that contradicts the good programming practice
of not giving any name a larger scope than it needs (and here the name
is only needed inside the union).

All 3 compilers (Sun, IBM, gcc) we use permit this construct and
generate a warning saying it is non-portable because it is disallowed
by the standard.

I can't see that it makes the compiler's work more difficult (I'm pretty
sure it's perfectly legal in C), and I can't see how it might be
ambiguous. (Even if you actually create names for your types within the
union, the fact that you can only access them from within the declaration,
not from the code, isn't a problem.)
Author: "Greg Herlihy" <greghe@pacbell.net>
Date: 10 May 2006 18:20:02 GMT Raw View
ThosRTanner wrote:
> "Tom s" wrote:
> > From time to time, I've come across rules in the Standard which I find to be
> > irrational (I hasten to use the word "stupid"). Has anyone else had such
> > thoughts about particular limitations set forth in the Standard? If so,
> > please post here about them, and, if possible, post some code supporting
> > your view.
>
> I dislike the rule which specifically stops you creating types inside
> anonymous unions, so that this:
>
> union
> {
> struct
> {
> unsigned field1 : 5;
> unsigned field2 : 10;
> unsigned field3 : 17;
> } data;
> unsigned all_bits;
> } some_useful_name;
But this union is not anonymous. If it were anonymous then there would
be no problem with the interior struct:
int main( )
{
    union
    {
        struct
        {
            unsigned field1 : 5;
            unsigned field2 : 10;
            unsigned field3 : 17;
        } data;
        unsigned all_bits;
    };

    all_bits = 5;     // OK
    data.field1 = 3;  // OK
}
Note that this anonymous union defines - but does not declare - a
nested type. Therefore this union does not violate the prohibition
against nested type declarations. And in fact this union no longer
generates a warning when compiled with gcc.
Alternately, if the union has to be named, then the straightforward
solution would be to simply name its type as well:
union u
{
    struct
    {
        unsigned field1 : 5;
        unsigned field2 : 10;
        unsigned field3 : 17;
    } data;
    unsigned all_bits;
} some_useful_name;
After all, since the union's type name is never used anyway, what is
the benefit in leaving the union's type without a name? Especially
since omitting the name (and providing an object name) ensures that the
interior struct definition is no longer legal.
Greg
Author: "Martin Bonner" <martinfrompi@yahoo.co.uk>
Date: Wed, 10 May 2006 13:21:43 CST Raw View
"Tom s" wrote:
> From time to time, I've come across rules in the Standard which I find to be
> irrational (I hasten to use the word "stupid").
I think you mean "hesitate" rather than "hasten" :-)
My pet peeve is the rule that you can't declare a template parameter to
be a friend of a template class. Thus
    template <class T> struct MyCleverTemplate { friend class T; };
is not (currently) legal.
I always found the argument that this allowed Machiavelli to subvert
the internals of MyCleverTemplate unconvincing - we'd just fire him.
I believe there may be plans to relax this particular rule in the next
version of the standard.
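(For reference, the relaxed form that has reportedly been proposed would
read as follows; it is not legal under the current standard:)

    template <class T>
    struct MyCleverTemplate
    {
        friend T;       // proposed "extended friend declaration"; ill-formed in C++98/03
    private:
        int internals;
    };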
Author: johnchx2@yahoo.com
Date: 10 May 2006 22:00:01 GMT Raw View
Greg Herlihy wrote:
> Note that this anonymous union defines - but does not declare - a
> nested type. Therefore this union does not violate the prohibition
> against nested type declarations.
But the rule (9.5/2) actually says: "The member-specification of an
anonymous union shall only define non-static data members." I don't
see how this would permit the definition of a type.
Moreover, since the member-specification is defined by the grammar as
basically a series of member-declarations, I'm not sure it's correct
to say that this doesn't declare the type. (The notion of a
declaration that doesn't introduce a name into a scope is rather
strange, but not unheard of: the standard speaks of the declaration of
an unnamed bit field in 9.6/2.)
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ]
Author: clarkcox3@gmail.com ("Clark S. Cox III")
Date: Thu, 11 May 2006 14:45:05 GMT Raw View
On 2006-05-09 15:59:43 -0400, NULL@NULL.NULL ("Tomás") said:
>
> From time to time, I've come across rules in the Standard which I find
> to be irrational (I hasten to use the word "stupid"). Has anyone else
> had such thoughts about particular limitations set forth in the
> Standard? If so, please post here about them, and, if possible, post
> some code supporting your view. I'm not out-right condemning certain
> rules (as I realise I may turn out to be wrong in the end), but rather
> I would like to discuss their rationality (if any). I'll begin:
[snip]
> Irrational Rule 2: You can have a pointer to one past the end of
> ------------------ an array, but not if you make an l-value in the process.
>
>
> Demonstrative code for Irrational Rule 2:
> ---------------------------------------------
>
> int main()
> {
> int array[5];
>
> /* The following is okay */
> int *p1 = &array[0] + 5;
>
> /* But the following is NOT okay */
> int *p2 = &array[5];
>
> /* I just don't see the logic! */
> }
>
> Big deal if you make an l-value in the process -- what harm is it?
Because you're dereferencing a pointer that doesn't point to allocated
memory. In the absence of overloaded operators, the following two
expressions are, by definition, equivalent: '&a[b]', '&*(a+b)'. Note
the dereferencing done by the '*' operator. If the expression "*(a+b)"
is undefined behavior, then so is the expression "a[b]".
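[An editorial sketch, not from the original post, showing the distinction
being drawn here in code -- the one-past-the-end address can always be
formed without going through the subscript:]
int main()
{
    int array[5];

    int* end1 = array + 5;       // well-defined: a one-past-the-end pointer
    int* end2 = &array[0] + 5;   // also well-defined: &array[0] is just array
    // int* end3 = &array[5];    // formally &*(array + 5); the '*' is what
                                 // makes this undefined in C++03, even though
                                 // no object is ever accessed

    (void)end1;
    (void)end2;
    return 0;
}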
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ]
Author: NULL@NULL.NULL ("Tom s")
Date: Thu, 11 May 2006 15:53:57 GMT Raw View
Let's say we have a 16-Bit unsigned integer.
It can store 65 536 unique values.
It can store from 0 to 65535 inclusive.
It has no "invalid" bit patterns -- each of them is a valid number.
So why does the following code exhibit Undefined Behaviour?:
#include <iostream>
int main()
{
unsigned k;
std::cout << k;
}
-Tomás
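[Editorial aside, not part of the original post: the undefined behaviour
vanishes as soon as k is given a value, so the question is purely about
reading an indeterminate value. A minimal well-defined counterpart:]
#include <iostream>

int main()
{
    unsigned k = 0;   // initialised: reading k is now well-defined
    std::cout << k;   // prints 0
    return 0;
}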
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ]
Author: NULL@NULL.NULL ("Tom s")
Date: Thu, 11 May 2006 16:03:20 GMT Raw View
"Clark S. Cox III" posted:
> Because you're dereferencing a pointer that doesn't point to allocated
> memory. In the absence of overloaded operators, the following two
> expressions are, by definition, equivalent: '&a[b]', '&*(a+b)'. Note
> the dereferencing done by the '*' operator. If the expression "*(a+b)"
> is undefined behavior, then so is the expression "a[b]".
Yes, but my point is that the following code should be harmless:
int main()
{
int *p = 0;
*p;
int *k = reinterpret_cast<int*>(555);
*k;
int *j;
*j;
}
In the above code, I have dereferenced three pointer variables which point
to invalid locations. However, no memory is accessed -- nothing is read
from, and nothing is written to.
What could be the rationale behind them exhibiting Undefined Behaviour?
-Tomás
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ]
Author: kuyper@wizard.net
Date: Thu, 11 May 2006 11:38:51 CST Raw View
"Clark S. Cox III" wrote:
> On 2006-05-09 15:59:43 -0400, NULL@NULL.NULL ("Tom s") said:
..
> > Irrational Rule 2: You can have a pointer to one past the end of
> > ------------------ an array, but not if you make an l-value in the process.
> >
> >
> > Demonstrative code for Irrational Rule 2:
> > ---------------------------------------------
> >
> > int main()
> > {
> > int array[5];
> >
> > /* The following is okay */
> > int *p1 = &array[0] + 5;
> >
> > /* But the following is NOT okay */
> > int *p2 = &array[5];
> >
> > /* I just don't see the logic! */
> > }
> >
> > Big deal if you make an l-value in the process -- what harm is it?
>
> Because you're dereferencing a pointer that doesn't point to allocated
> memory. In the absence of overloaded operators, the following two
> expressions are, by definition, equivalent: '&a[b]', '&*(a+b)'. Note
> the dereferencing done by the '*' operator. If the expression "*(a+b)"
> is undefined behavior, then so is the expression "a[b]".
The C99 standard specifies for the & operator that:
"If the operand is the result of a unary * operator, neither that
operator nor the & operator is evaluated and the result is as if both
were omitted, except that the constraints on the operators still apply
and the result is not an lvalue."
Therefore, &a[b] is the same as &*(a+b), which in turn is the same as
a+b - the pointer is never actually dereferenced. The C committee made
this decision based upon the fact that &a[b] is a common idiom, and few
(if any) compilers actually had any problem with it: they would in fact
generate exactly the same machine code for &a[b] as for a+b.
Is there any reason why C++ couldn't adopt the same rule, at least when
operator overloads are not involved?
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ]
Author: Jeff Rife <wevsr@nabs.net>
Date: Thu, 11 May 2006 12:08:45 CST Raw View
Tomás (NULL@NULL.NULL) wrote in comp.std.c++:
> Let's say we have a 16-Bit unsigned integer.
OK.
> It can store 65 536 unique values.
>
> It can store from 0 to 65535 inclusive.
>
> It has no "invalid" bit patterns -- each of them is a valid number.
These are implementation-dependent assumptions. It's perfectly
legal for an implementation to reserve 0xffff (for example) in a 16-bit
value as the "uninitialized memory flag". Likely, this would only be
done if the hardware supported it, but it's legal.
You have to use <limits> to find out exactly what the min/max "unsigned
int" (assuming that "unsigned int" is 16-bit...if not, whatever standard
type is 16 bits) values are.
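[An editorial sketch, not part of the original post, of querying those
limits through <limits>; the values printed are implementation-specific:]
#include <iostream>
#include <limits>

int main()
{
    std::cout << "unsigned int range: "
              << std::numeric_limits<unsigned int>::min() << " to "
              << std::numeric_limits<unsigned int>::max()
              << " (" << std::numeric_limits<unsigned int>::digits
              << " value bits)" << std::endl;
    return 0;
}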
--
Jeff Rife |
| http://www.nabs.net/Cartoons/OverTheHedge/Macarena.gif
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ]
Author: NULL@NULL.NULL ("Tom s")
Date: Thu, 11 May 2006 17:57:25 GMT Raw View
Jeff Rife posted:
>> It can store from 0 to 65535 inclusive.
>>
>> It has no "invalid" bit patterns -- each of them is a valid number.
>
> These are implementation-dependent assumptions. It's perfectly
> legal for an implementation to reserve 0xffff (for example) in a 16-bit
> value as the "uninitialized memory flag". Likely, this would only be
> done if the hardware supported it, but it's legal.
>
> You have to use <limits> to find out exactly what the min/max "unsigned
> int" (assuming that "unsigned int" is 16-bit...if not, whatever standard
> type is 16 bits) values are.
Is there not a requirement whereby an X-bit unsigned integral type can:
a) Hold 2^X unique values
b) Store values 0 to ( 2^X - 1 ) inclusive
?
If so, then it's not possible to have an "invalid" value.
-Tomás
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ]
Author: wevsr@nabs.net (Jeff Rife)
Date: Thu, 11 May 2006 20:34:50 GMT Raw View
Tomás (NULL@NULL.NULL) wrote in comp.std.c++:
> > You have to use <limits> to find out exactly what the min/max "unsigned
> > int" (assuming that "unsigned int" is 16-bit...if not, whatever standard
> > type is 16 bits) values are.
>
>
> Is there not a requirement whereby an X-bit unsigned integral type can:
>
> a) Hold 2^X unique values
>
> b) Store values 0 to ( 2^X - 1 ) inclusive
For "signed char" and "unsigned char" (and thus, "char"), there appears
to be this requirement based on the wording in 3.9.1/1, but "these
requirements do not hold for other types". There are no changes to this
section in the draft concerning this wording, so I'd say it's not gonna
change.
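[An editorial illustration, not part of the original post: one way to see
whether a given unsigned type uses every bit of its object representation
is to compare its size in bits with its number of value bits:]
#include <climits>
#include <iostream>
#include <limits>

int main()
{
    int object_bits = sizeof(unsigned int) * CHAR_BIT;
    int value_bits  = std::numeric_limits<unsigned int>::digits;
    std::cout << "unsigned int: " << object_bits << " object bits, "
              << value_bits << " value bits, "
              << (object_bits - value_bits) << " padding bit(s)"
              << std::endl;
    return 0;
}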
--
Jeff Rife |
| "He chose...poorly."
|
| -- Grail Knight, "Indiana Jones and the Last Crusade"
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ]