Topic: assignment of invalid integer values to enumerated types
Author: sarima@ix.netcom.com (Stanley Friesen)
Date: 1997/06/06
d96-mst@nada.kth.se (Mikael Steldal) wrote:
>In article <199705302325.QAA17750@proxy4.ba.best.com>,
>bill@gibbons.org (Bill Gibbons) wrote:
>
>>This is implementation-defined behavior, because an implementation
>>might choose to represent:
>>
>> enum E { e };
>
>What value will e have? Is it guaranteed to be 0?
Yes.
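(A one-line illustration, not part of the original post: the first
enumerator declared without an initializer always takes the value zero.)

    enum E { e };   // e is guaranteed to be 0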
The peace of God be with you.
Stanley Friesen
Author: James Kanze <james-albert.kanze@vx.cit.alcatel.fr>
Date: 1997/05/29
bill@gibbons.org (Bill Gibbons) writes:
Thanks for the answer. Now two further questions:
|> In article <rf5u3jqa56x.fsf@vx.cit.alcatel.fr>, James Kanze
|> <james-albert.kanze@vx.cit.alcatel.fr> wrote:
|>
|> > Steve Clamage <stephen.clamage@eng.sun.com> writes:
|> >
|> > |>
|> > |> B. K. Oxley (binkley) wrote:
|> > |> >
|> > |> > Here is example code of my point of interest:
|> > |> >
|> > |> > enum Foo {qux, quux};
|> > |> > cout << Foo (0) << endl; // Ok.
|> > |> > cout << Foo (1) << endl; // Also ok.
|> > |> > cout << Foo (2) << endl; // Not ok.
|> > |> > cout << Foo (-1) << endl; // Also not ok.
|> > |> > cout << Foo (10000000) << endl; // Also not ok.
|> > |> >
|> > |> > I scanned the Dec '96 draft standard [decl.enum], and found
|> > |> > no example disallowing the "not ok" statements.
|> >
|> > [...]
|> > |> If on your system type int has 32 bits, and the implementation
|> > |> chose to use int as the underlying type for Foo, all the values
|> > |> you use above can be represented. If the implementation chose
|> > |> to use an 8 or 16 bit integer type for Foo, the value 10000000
|> > |> would get truncated. If it chose to use an unsigned integer
|> > |> type (because Foo has no negative values), the value -1 would
|> > |> be converted to a large value of that type.
|> >
|> > Is the behavior on overflow required, or just typical? Would an
|> > implementation that converted Foo( 100000000 ) to 0 be legal, for
|> > example? Or one that generated a trap or an exception?
|>
|> Clause 5 paragraph 5:
|>
|> If during the evaluation of an expression, the result is not
|> mathematically defined or not in the range of representable
|> values for its type, the behavior is undefined, unless such an
|> expression is a constant expression (5.19), in which case the
|> program is ill-formed. [Note: most existing implementations of
|> C++ ignore integer overflows. Treatment of division by zero,
|> forming a remainder using a zero divisor, and all floating point
|> exceptions vary among machines, and is usually adjustable by a
|> library function. ]
This is what I thought with regard to enums (and what I would want).
Does the above paragraph also apply to unsigned values? This would
be a change with regard to C (and certainly break some programs).
|> Clause 7 section 7.2 paragraphs 5 & 6:
|>
|> The underlying type of an enumeration is an integral type that
|> can represent all the enumerator values defined in the enumeration.
|> It is implementation-defined which integral type is used as the
|> underlying type for an enumeration except that the underlying type
|> shall not be larger than int unless the value of an enumerator
|> cannot fit in an int or unsigned int. If the enumerator-list is
|> empty, the underlying type is as if the enumeration had a single
|> enumerator with value 0. The value of sizeof() applied to an
|> enumeration type, an object of enumeration type, or an enumerator,
|> is the value of sizeof() applied to the underlying type.
Am I correct in assuming that this is not meant to allow an empty enum
(or one with a single constant of 0) to have sizeof == 0?
|> For an enumeration where emin is the smallest enumerator and emax
|> is the largest, the values of the enumeration are the values of
|>     the underlying type in the range bmin to bmax, where bmin and bmax
|> are, respectively, the smallest and largest values of the smallest
|> bit-field that can store emin and emax. It is possible to
|> define an enumeration that has values not defined by any of its
|> enumerators.
|>
|> So the "not ok" cases have undefined behavior, which in this case
|> will typically be that everything works as long as the values do
|> not exceed those representable in the underlying type.
Here is really where I am unsure. (I am supposing that unsigned
arithmetic is defined as in C.) Arithmetic on enums is in fact
arithmetic on the underlying type. So whether this is defined or not
will depend on whether the underlying type is signed or not. Or is
there something else that I've missed?
|> But an implementation could truncate, generate a hardware
|> exception, set the result to zero, or anything else and still
|> be conforming. That is, "undefined behavior".
|>
|> The two most common uses of these rules are:
|>
|> * An integral subrange type:
|>
|> enum MyType { MyTypeMin = -10, MyTypeMax = 10 };
|>
|> * A small bit vector, i.e. a word of flag bits:
|>
|> enum MyIOFlags { read=1, write=2, append=4, binary=8 };
|>
|> For the subrange, the intermediate values are guaranteed to work.
|>
|> For the flags, any and/or/not etc. bit manipulation of the flags
|> is guaranteed to work.
|>
|> Of course an explicit cast back to the enumeration type is still
|> needed after any arithmetic.
Unless, of course, you've overloaded the necessary operators on the
enum. (IMHO, if the enum is meant to be used in one of these ways, it
is good programming practice to overload the operators, rather than have
the user code full of casts.)
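(A minimal sketch of what such overloads might look like, reusing the
MyIOFlags enum from Bill's example; this code is an editorial
illustration, not part of the original post:)

    enum MyIOFlags { read = 1, write = 2, append = 4, binary = 8 };

    // With emax == 8, every value in 0..15 is a valid MyIOFlags value,
    // so the results of these operators are always in range.
    inline MyIOFlags operator|( MyIOFlags a, MyIOFlags b )
    {
        return MyIOFlags( int( a ) | int( b ) );
    }

    inline MyIOFlags& operator|=( MyIOFlags& a, MyIOFlags b )
    {
        return a = a | b;
    }

    // Client code then needs no casts:
    //     MyIOFlags mode = read | binary;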
--
James Kanze home: kanze@gabi-soft.fr +33 (0)1 39 55 85 62
office: kanze@vx.cit.alcatel.fr +33 (0)1 69 63 14 54
GABI Software, Sarl., 22 rue Jacques-Lemercier, F-78000 Versailles France
-- Conseils en informatique industrielle --
Author: bill@gibbons.org (Bill Gibbons)
Date: 1997/05/30
In article <rf5k9ki7l0b.fsf@vx.cit.alcatel.fr>, James Kanze
<james-albert.kanze@vx.cit.alcatel.fr> wrote:
> ... quotes from standard about undefined behavior when assigning
> ... out-of-range values to enumeration variables... omitted ...
>
> This is what I thought with regard to enums (and what I would want).
>
> Does the above paragraph also apply to unsigned values? This would
> be a change with regard to C (and certainly break some programs).
Yes, it does apply. The handling of out-of-range enumeration values is
quite different in C++ from what it was in C, because a C++ compiler may
represent the enumeration with a type smaller than "int". On the other
hand, C does not handle unsigned values well with enumeration types.
> |> Clause 7 section 7.2 paragraphs 5 & 6:
> |>
> |> The underlying type of an enumeration is an integral type that
> |> can represent all the enumerator values defined in the enumeration.
> |> It is implementation-defined which integral type is used as the
> |> underlying type for an enumeration except that the underlying type
> |> shall not be larger than int unless the value of an enumerator
> |> cannot fit in an int or unsigned int. If the enumerator-list is
> |> empty, the underlying type is as if the enumeration had a single
> |> enumerator with value 0. The value of sizeof() applied to an
> |> enumeration type, an object of enumeration type, or an enumerator,
> |> is the value of sizeof() applied to the underlying type.
>
> Am I correct in assuming that this is not meant to allow an empty enum
> (or one with a single constant of 0) to have sizeof == 0?
Right.
> |> For an enumeration where emin is the smallest enumerator and emax
> |> is the largest, the values of the enumeration are the values of
> |>     the underlying type in the range bmin to bmax, where bmin and bmax
> |> are, respectively, the smallest and largest values of the smallest
> |> bit-field that can store emin and emax. It is possible to
> |> define an enumeration that has values not defined by any of its
> |> enumerators.
> |>
> |> So the "not ok" cases have undefined behavior, which in this case
> |> will typically be that everything works as long as the values do
> |> not exceed those representable in the underlying type.
>
> Here is really where I am unsure. (I am supposing that unsigned
> arithmetic is defined as in C.) Arithmetic on enums is in fact
> arithmetic on the underlying type. So whether this is defined or not
> will depend on whether the underlying type is signed or not. Or is
> there something else that I've missed?
This is implementation-defined behavior, because an implementation
might choose to represent:
enum E { e };
with any integral type except long and unsigned long. The draft
says that
... the underlying type shall not be larger than int unless the value
of an enumerator cannot fit in an int or unsigned int.
It should probably also say:
The underlying type shall not be "unsigned int" or a type which
promotes to "unsigned int" unless the value of an enumerator cannot
fit in an int.
(This wording is slightly more complex than one would expect because
"unsigned short" promotes to "unsigned int" if short and int are the
same size, and the implementation is free to use "unsigned short" as
an underlying type.)
This would ensure that the following code has portable semantics:
    void f(int);
    void f(unsigned int);

    enum E { e };

    void g() {
        f( e + 1 );    // implementation-defined which one is called
    }
I will raise this issue with the standards committee.
-- Bill Gibbons
bill@gibbons.org
Author: d96-mst@nada.kth.se (Mikael Steldal)
Date: 1997/06/02
In article <199705302325.QAA17750@proxy4.ba.best.com>,
bill@gibbons.org (Bill Gibbons) wrote:
>This is implementation-defined behavior, because an implementation
>might choose to represent:
>
> enum E { e };
What value will e have? Is it guaranteed to be 0?
>It should probably also say:
>
> The underlying type shall not be "unsigned int" or a type which
> promotes to "unsigned int" unless the value of an enumerator cannot
> fit in an int.
The sign isn't usable in this context anyway, is it? Why not require
the underlying type to always be unsigned?
Author: James Kanze <james-albert.kanze@vx.cit.alcatel.fr>
Date: 1997/05/26
Steve Clamage <stephen.clamage@eng.sun.com> writes:
|>
|> B. K. Oxley (binkley) wrote:
|> >
|> > Here is example code of my point of interest:
|> >
|> > enum Foo {qux, quux};
|> > cout << Foo (0) << endl; // Ok.
|> > cout << Foo (1) << endl; // Also ok.
|> > cout << Foo (2) << endl; // Not ok.
|> > cout << Foo (-1) << endl; // Also not ok.
|> > cout << Foo (10000000) << endl; // Also not ok.
|> >
|> > I scanned the Dec '96 draft standard [decl.enum], and found
|> > no example disallowing the "not ok" statements.
[...]
|> If on your system type int has 32 bits, and the implementation
|> chose to use int as the underlying type for Foo, all the values
|> you use above can be represented. If the implementation chose
|> to use an 8 or 16 bit integer type for Foo, the value 10000000
|> would get truncated. If it chose to use an unsigned integer
|> type (because Foo has no negative values), the value -1 would
|> be converted to a large value of that type.
Is the behavior on overflow required, or just typical? Would an
implementation that converted Foo( 100000000 ) to 0 be legal, for
example? Or one that generated a trap or an exception?
--
James Kanze home: kanze@gabi-soft.fr +33 (0)1 39 55 85 62
office: kanze@vx.cit.alcatel.fr +33 (0)1 69 63 14 54
GABI Software, Sarl., 22 rue Jacques-Lemercier, F-78000 Versailles France
-- Conseils en informatique industrielle --
Author: bill@gibbons.org (Bill Gibbons)
Date: 1997/05/28
In article <rf5u3jqa56x.fsf@vx.cit.alcatel.fr>, James Kanze
<james-albert.kanze@vx.cit.alcatel.fr> wrote:
> Steve Clamage <stephen.clamage@eng.sun.com> writes:
>
> |>
> |> B. K. Oxley (binkley) wrote:
> |> >
> |> > Here is example code of my point of interest:
> |> >
> |> > enum Foo {qux, quux};
> |> > cout << Foo (0) << endl; // Ok.
> |> > cout << Foo (1) << endl; // Also ok.
> |> > cout << Foo (2) << endl; // Not ok.
> |> > cout << Foo (-1) << endl; // Also not ok.
> |> > cout << Foo (10000000) << endl; // Also not ok.
> |> >
> |> > I scanned the Dec '96 draft standard [decl.enum], and found
> |> > no example disallowing the "not ok" statements.
>
> [...]
> |> If on your system type int has 32 bits, and the implementation
> |> chose to use int as the underlying type for Foo, all the values
> |> you use above can be represented. If the implementation chose
> |> to use an 8 or 16 bit integer type for Foo, the value 10000000
> |> would get truncated. If it chose to use an unsigned integer
> |> type (because Foo has no negative values), the value -1 would
> |> be converted to a large value of that type.
>
> Is the behavior on overflow required, or just typical? Would an
> implementation that converted Foo( 100000000 ) to 0 be legal, for
> example? Or one that generated a trap or an exception?
Clause 5 paragraph 5:
If during the evaluation of an expression, the result is not
mathematically defined or not in the range of representable
values for its type, the behavior is undefined, unless such an
expression is a constant expression (5.19), in which case the
program is ill-formed. [Note: most existing implementations of
C++ ignore integer overflows. Treatment of division by zero,
forming a remainder using a zero divisor, and all floating point
exceptions vary among machines, and is usually adjustable by a
library function. ]
Clause 7 section 7.2 paragraphs 5 & 6:
The underlying type of an enumeration is an integral type that
can represent all the enumerator values defined in the enumeration.
It is implementation-defined which integral type is used as the
underlying type for an enumeration except that the underlying type
shall not be larger than int unless the value of an enumerator
cannot fit in an int or unsigned int. If the enumerator-list is
empty, the underlying type is as if the enumeration had a single
enumerator with value 0. The value of sizeof() applied to an
enumeration type, an object of enumeration type, or an enumerator,
is the value of sizeof() applied to the underlying type.
For an enumeration where emin is the smallest enumerator and emax
is the largest, the values of the enumeration are the values of
    the underlying type in the range bmin to bmax, where bmin and bmax
are, respectively, the smallest and largest values of the smallest
bit-field that can store emin and emax. It is possible to
define an enumeration that has values not defined by any of its
enumerators.
So the "not ok" cases have undefined behavior, which in this case
will typically be that everything works as long as the values do
not exceed those representable in the underlying type.
But an implementation could truncate, generate a hardware
exception, set the result to zero, or anything else and still
be conforming. That is, "undefined behavior".
The two most common uses of these rules are:
* An integral subrange type:
enum MyType { MyTypeMin = -10, MyTypeMax = 10 };
* A small bit vector, i.e. a word of flag bits:
enum MyIOFlags { read=1, write=2, append=4, binary=8 };
For the subrange, the intermediate values are guaranteed to work.
For the flags, any and/or/not etc. bit manipulation of the flags
is guaranteed to work.
Of course an explicit cast back to the enumeration type is still
needed after any arithmetic.
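(An illustrative fragment, added editorially rather than taken from
Bill's post, spelling out the flag case:)

    enum MyIOFlags { read = 1, write = 2, append = 4, binary = 8 };

    // emin is 1 and emax is 8; the smallest bit-field that can hold
    // them is 4 bits wide, so every value in 0..15 is a valid
    // MyIOFlags value and the OR below stays in range.  The explicit
    // cast back to the enumeration type is still required.
    MyIOFlags mode = MyIOFlags( read | binary );   // value 9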
-- Bill Gibbons
bill@gibbons.org
Author: "B. K. Oxley (binkley)" <binkley@pcdocs.com>
Date: 1997/05/23
Here is example code of my point of interest:
enum Foo {qux, quux};
cout << Foo (0) << endl; // Ok.
cout << Foo (1) << endl; // Also ok.
cout << Foo (2) << endl; // Not ok.
cout << Foo (-1) << endl; // Also not ok.
cout << Foo (10000000) << endl; // Also not ok.
I scanned the Dec '96 draft standard [decl.enum], and found
no example disallowing the "not ok" statements. I found an example in
[Stroustrup, 2nd Ed., p. 70] barring assignment
of an invalid integer value:
enum keyword {ASM, AUTO, BREAK};
keyword k = ASM;
...
k = 4; // error
but no example like mine:
k = keyword (4);
My compiler (MSVC++ 5.0) didn't even give a warning, and the
program fragment produces reasonable output:
0
1
2
-1
10000000
under the circumstances. Is this an error? Where in the
standard is this discussed?
Thanks,
--binkley
Author: Steve Clamage <stephen.clamage@eng.sun.com>
Date: 1997/05/24
B. K. Oxley (binkley) wrote:
>
> Here is example code of my point of interest:
>
> enum Foo {qux, quux};
> cout << Foo (0) << endl; // Ok.
> cout << Foo (1) << endl; // Also ok.
> cout << Foo (2) << endl; // Not ok.
> cout << Foo (-1) << endl; // Also not ok.
> cout << Foo (10000000) << endl; // Also not ok.
>
> I scanned the Dec '96 draft standard [decl.enum], and found
> no example disallowing the "not ok" statements.
The rule has changed a bit over time, so reference books and
compilers may give different answers until they all catch
up to the new rules.
The current rule (see 7.2 "Enumeration declarations") is
that enums must be implemented as an integer type having
enough bits to represent all of the defined enumerators.
Any value that can be represented in that number of bits
is valid for that enum type. Using values outside that
range has unspecified results. "Unspecified" means the
implementation is not required to tell you what will happen
if you try to use out-of-range values.
In your example, enum Foo has enumerators with values 0 and
1. It takes 1 bit to represent those values, and no other
values can reliably be represented as type Foo. You can
convert other values to type Foo, but you cannot predict the
results.
If on your system type int has 32 bits, and the implementation
chose to use int as the underlying type for Foo, all the values
you use above can be represented. If the implementation chose
to use an 8 or 16 bit integer type for Foo, the value 10000000
would get truncated. If it chose to use an unsigned integer
type (because Foo has no negative values), the value -1 would
be converted to a large value of that type.
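(A hedged sketch, not part of Steve's post, of which of the original
conversions are portable:)

    enum Foo { qux, quux };

    Foo ok  = Foo(1);          // portable: 0 and 1 are the only values
                               // every implementation must support
    Foo odd = Foo(-1);         // not portable: the result depends on
                               // the underlying type the compiler chose
    Foo big = Foo(10000000);   // not portable: may be truncated if the
                               // underlying type is smaller than int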
--
Steve Clamage, stephen.clamage@eng.sun.com
Author: "Paul D. DeRocco" <pderocco@strip_these_words.ix.netcom.com>
Date: 1997/05/24
Steve Clamage wrote:
> The current rule (see 7.2 "Enumeration declarations") is
> that enums must be implemented as an integer type having
> enough bits to represent all of the defined enumerators.
>
> Any value that can be represented in that number of bits
> is valid for that enum type. Using values outside that
> range has unspecified results. "Unspecified" means the
> implementation is not required to tell you what will happen
> if you try to use out-of-range values.
I assume the rule is written this way to allow enumerators to be ORed
together, right?
--
Ciao,
Paul
(Please remove the "strip_these_words." prefix from the return
address, which has been altered to foil junk mail senders.)