Topic: Strange type mapping for integer literals


Author: ron@sensor.com ("Ron Natalie")
Date: Wed, 12 Mar 2003 20:50:49 +0000 (UTC)
"Stefan Slapeta" <stefan@slapeta.com> wrote in message news:3e6cf026$0$37376$91cee783@newsreader01.highway.telekom.at...

> 1)
> Why is there a different behaviour for decimal and non-decimal integer
> literals?

Because hex and octal literals were more frequently used for unsigned values in the
early days, before the U suffix was introduced.

> 2)
> I think that the types for decimal integer literals are not sufficient.

Why?  You want smaller ones?

> The problem here is, that you cannot provide an operator + for the given
> literal that is ok for every compiler because the standard says that the
> 'behaviour is undefined' in this case (the implementations I know give this
> literal the type 'unsigned long int').

You can't provide an operator+ even if you solve the literal problem.   If the
value you want is greater than the maximum long int value, what are you going
to represent it as?



---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: algrant@myrealbox.com (Al Grant)
Date: Wed, 12 Mar 2003 20:52:36 +0000 (UTC)
kanze@gabi-soft.de (James Kanze) wrote in message news:<d6651fb6.0303110646.519cb8ae@posting.google.com>...
> If you require
> unsigned long, however, you run the risk of accidentally forcing an
> expression to be evaluated as unsigned when it isn't wanted:
>
>     long i = -1 ;
>     if ( i < 3000000000 ) ...
>
> doesn't work as expected when I compile it in 32 bit mode.

That's really an unfortunate result of the promotion rules and
the lack of built-in operations between signed and unsigned types.
Comparison of a value of a signed type with a value of an unsigned
type is well-defined; why is it that not only does the language not
support it, it actually defines an attempt to write it as meaning
something entirely different?

It's all very well to argue (as you appear to be doing) for
breaking legacy compatibility in the interests of safety - but
it would be better if you did not have to use a piece of far more
gross legacy behaviour as part of your argument!

3 billion is a positive number, so there is no argument in
principle against it having unsigned type; and the legacy argument
should respect the C89 precedent.






Author: allan_w@my-dejanews.com (Allan W)
Date: Thu, 13 Mar 2003 03:10:38 +0000 (UTC)
> > If you have a decimal literal outside the
> > range 0 - 32767, then you do not know its
> > type on all systems. If you want to fix
> > its type, use a suffix.

stefan@slapeta.com ("Stefan Slapeta") wrote
> Sometimes you don't have the possibility to modify a part of your source
> code (consider generated sources!).

If you're generating sources, you can modify the generator to use
a suffix.

Worst-case, write a second-pass program that modifies the generated
code as needed before compilation.

OR, have the generator call a function that doesn't use overloading.
This could be an inline function that simply calls the "real"
function based on numeric range:
    // INT_MAX and LONG_MAX come from <climits>.
    int foo(int);
    long foo(long);
    unsigned long foo(unsigned long);
    inline unsigned long callFoo(unsigned long x) {
        if (x<=INT_MAX) return foo(int(x));
        if (x<=LONG_MAX) return foo(long(x));
        return foo(x); // Calls unsigned long version
    }
I suspect (but haven't tested) that when callFoo() is called with
a constant, the compiler will generate code that calls the correct
version of foo() directly.

> A suggestion: decimal integer literals

By "decimal integer literals" you mean a nonzero digit followed
by more digits, without a sign or decimal point or suffix. Right?

> get the leftmost type of int, long
> int, unsigned int, unsigned long int, <undefined>.
> (this wouldn't change much for decimal literals because everything after
> 'long int' is currently 'undefined').

If int and long are the same size, you'll never pick "long int"
or "unsigned long int." If int is smaller than long, you'll never
pick "unsigned int." Is this okay?






Author: kanze@gabi-soft.de (James Kanze)
Date: Fri, 14 Mar 2003 17:24:31 +0000 (UTC)
algrant@myrealbox.com (Al Grant) wrote in message
news:<5765b025.0303120438.7024ca46@posting.google.com>...
> kanze@gabi-soft.de (James Kanze) wrote in message
> news:<d6651fb6.0303110646.519cb8ae@posting.google.com>...
> > If you require unsigned long, however, you run the risk of
> > accidentally forcing an expression to be evaluated as unsigned when
> > it isn't wanted:

> >     long i = -1 ;
> >     if ( i < 3000000000 ) ...

> > doesn't work as expected when I compile it in 32 bit mode.

> That's really an unfortunate result of the promotion rules and the
> lack of built-in operations between signed and unsigned types.

No.  The problem is that compilers aren't acting sensibly in the case of
undefined behavior.  The presence of the literal 3000000000 in a
compiler where a long has 32 bits introduces undefined behavior.  Since
this particular case of undefined behavior is easily detected at compile
time, it can only be perversity which causes the compiler to accept it
without a warning.  (G++ gives a warning.  Ideally, one would like an
error, but there is compatibility with pre-standard code to consider.)

> Comparison of a value of a signed type with a value of an unsigned
> type is well-defined; why is it that not only does the language not
> support it, it actually defines an attempt to write it as meaning
> something entirely different?

That's another question.  There are historical reasons...

There are other problems.  On a 32 bit 2's complement machine, try and
write the equivalent of 0x80000000 as a decimal number.  It can't be
done without invoking undefined behavior.

Of course, the place to really discuss this is comp.std.c.  C++ just
does whatever C does in such cases.

> It's all very well to argue (as you appear to be doing) for breaking
> legacy compatibility in the interests of safety - but it would be
> better if you did not have to use a piece of far more gross legacy
> behaviour as part of your argument!

I'm just basing myself on what the standard says.

> 3 billion is a positive number, so there is no argument in principle
> against it having unsigned type; and the legacy argument should
> respect the C89 precedent.

What about something like :

    if ( i < 3000000000000 ) ...

Are you suggesting that the compiler support some sort of unlimited
length integers, at least in constants?

--
James Kanze             GABI Software             mailto:kanze@gabi-soft.fr
Conseils en informatique orientée objet/
                           Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, Tél. : +33 (0)1 30 23 45 16






Author: stefan@slapeta.com ("Stefan Slapeta")
Date: Mon, 17 Mar 2003 05:21:25 +0000 (UTC)
James Kanze wrote:
>
> No.  The problem is that compilers aren't acting sensibly in the case of
> undefined behavior.  The presence of the literal 3000000000 in a
> compiler where a long has 32 bits introduces undefined behavior.  Since
> this particular case of undefined behavior is easily detected at compile
> time, it can only be perversity which causes the compiler to accept it
> without a warning.  (G++ gives a warning.  Ideally, one would like an
> error, but there is compatibility with pre-standard code to consider.)
>
I wouldn't go this way. My suggestion was: Make the undefined behaviour
defined if this is possible. Here it IS possible. (I think that any
defined behaviour is better than undefined behaviour!) Why shouldn't a compiler
try unsigned types for literals that don't fit into the signed ranges?
Actually, this _IS_ already how most compilers behave today!! So, why don't
we include this in the standard?

> There are other problems.  On a 32 bit 2's complement machine, try and
> write the equivalent of 0x80000000 as a decimal number.  It can't be
> done without invoking undefined behavior.
>
YES! And why don't we define this? I still can't find any convincing reasons
against trying 'unsigned int' and 'unsigned long int' for (now) undefined
cases.

> Of course, the place to really discuss this is comp.std.c.  C++ just
> does whatever C does in such cases.
>
You are right - maybe we should invite the C group :-)

Regards,

Stefan







Author: kanze@gabi-soft.de (James Kanze)
Date: Tue, 18 Mar 2003 19:51:02 +0000 (UTC)
stefan@slapeta.com ("Stefan Slapeta") wrote in message
news:<3e74c99c$0$24806$91cee783@newsreader01.highway.telekom.at>...
> James Kanze wrote:

> > No.  The problem is that compilers aren't acting sensibly in the
> > case of undefined behavior.  The presence of the literal 3000000000
> > in a compiler where a long has 32 bits introduces undefined
> > behavior.  Since this particular case of undefined behavior is
> > easily detected at compile time, it can only be perversity which
> > causes the compiler to accept it without a warning.  (G++ gives a
> > warning.  Ideally, one would like an error, but there is
> > compatibility with pre-standard code to consider.)

> I wouldn't go this way. My suggestion was: Make the undefined
> behaviour defined if this is possible. Here it IS possible. (I think
> that any defined behaviour is better than undefined behaviour!) Why
> shouldn't a compiler try unsigned types for literals that don't fit
> into the signed ranges?  Actually, this _IS_ already how most
> compilers behave today!! So, why don't we include this in the
> standard?

Maybe because it results in some curious comparisons:

    int i = -1 ;
    if ( i < 3000000000 ) ...   //  false
    if ( i < 2000000000 ) ...   //  true

Some people find that a bit subtle.

> > There are other problems.  On a 32 bit 2's complement machine, try
> > and write the equivalent of 0x80000000 as a decimal number.  It
> > can't be done without invoking undefined behavior.

> YES! And why don't we define this? I still can't find any convincing
> reasons against trying 'unsigned int' and 'unsigned long int' for
> (now) undefined cases.

> > Of course, the place to really discuss this is comp.std.c.  C++ just
> > does whatever C does in such cases.

> You are right - maybe we should invite the C group :-)

Invite, or invade? :-)  But I think you'll find that the problem is less
evident than you think.  IMHO, g++ is right to warn.

--
James Kanze             GABI Software             mailto:kanze@gabi-soft.fr
Conseils en informatique orientée objet/
                           Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, Tél. : +33 (0)1 30 23 45 16






Author: stefan@slapeta.com ("Stefan Slapeta")
Date: Mon, 10 Mar 2003 21:44:18 +0000 (UTC)
Hi everybody,

I would like to start a discussion about the types of integer literals.

[2.13.1.2] says about an integer literal:

If it is decimal ... it has the first of these types in which its value can
be represented: int, long int; if the value cannot be represented as a long
int, the behaviour is undefined.
If it is octal or hexadecimal ... it has the first of these types ... : int,
unsigned int, long int, unsigned long int.

This sounds very strange to me for two reasons:

1)
Why is there a different behaviour for decimal and non-decimal integer
literals?

2)
I think that the types for decimal integer literals are not sufficient.

Consider the following example on a 32 bit system (the problem is similar on
other systems):

class MyClass {
   double m_val;
public:
   MyClass(double dStart = 0) : m_val(dStart) {}

   MyClass operator + (int x) {
      return MyClass(m_val + x);
   }

   MyClass operator + (double x) {
      return MyClass(m_val + x);
   }
};

int main() {
   MyClass  a, b;

   a = b + 1;                 // ok, calls operator + (int)
   a = b + 1.7;              // ok, calls operator + (double)

   a = b + 3000000000; // ERROR: ambiguous! what is the type of 3000000000 ??
}


The problem here is, that you cannot provide an operator + for the given
literal that is ok for every compiler because the standard says that the
'behaviour is undefined' in this case (the implementations I know give this
literal the type 'unsigned long int').

Stefan







Author: hyrosen@mail.com (Hyman Rosen)
Date: Mon, 10 Mar 2003 22:26:01 +0000 (UTC)
Stefan Slapeta wrote:
> The problem here is

The problem here is that you're running
a program with undefined behavior. The
reason for having different behavior is
that existing code was sprinkled with hex
and octal constants which had their high
bits set, and which therefore would be
unrepresentable as int or long.

If you have a decimal literal outside the
range 0 - 32767, then you do not know its
type on all systems. If you want to fix
its type, use a suffix.






Author: kanze@gabi-soft.de (James Kanze)
Date: Tue, 11 Mar 2003 19:54:04 +0000 (UTC)
stefan@slapeta.com ("Stefan Slapeta") wrote in message
news:<3e6cf026$0$37376$91cee783@newsreader01.highway.telekom.at>...

> I would like to start a discussion about the types of integer literals.

> [2.13.1.2] says about an integer literal:

> If it is decimal ... it has the first of these types in which its
> value can be represented: int, long int; if the value cannot be
> represented as a long int, the behaviour is undefined.  If it is octal
> or hexadecimal ... it has the first of these types ... : int, unsigned
> int, long int, unsigned long int.

> This sounds very strange to me for two reasons:

> 1)
> Why is there a different behaviour for decimal and non-decimal integer
> literals?

Programmer expectations.  Generally, you expect signed integer
behavior, which is favored by the above.  When you write something like
0x80000000 (on a 32 bit machine), however, you are probably trying to
mask the top bit of an int (signed or not); you don't want all of your
operations to suddenly go off into long (or long long).

> 2)
> I think that the types for decimal integer literals are not sufficient.

> Consider the following example on a 32 bit system (the problem is
> similar on other systems):

> class MyClass {
>    double m_val;
> public:
>    MyClass(double dStart = 0) : m_val(dStart) {}
>
>    MyClass operator + (int x) {
>       return MyClass(m_val + x);
>    }

>    MyClass operator + (double x) {
>       return MyClass(m_val + x);
>    }
> };

> int main() {
>    MyClass  a, b;
>
>    a = b + 1;                 // ok, calls operator + (int)
>    a = b + 1.7;              // ok, calls operator + (double)

>    a = b + 3000000000; // ERROR: ambiguous! what is the type of 3000000000 ??
> }

> The problem here is, that you cannot provide an operator + for the
> given literal that is ok for every compiler because the standard says
> that the 'behaviour is undefined' in this case (the implementations I
> know give this literal the type 'unsigned long int').

I'm not sure what you are asking for.  What should the type of
3000000000 be?  On a machine with 32 bit longs, the user has invoked
undefined behavior.  On a machine with 64 bit longs, the type is long,
and you still have an ambiguous call.

I agree that it would be nice if it wasn't undefined behavior; either
you get a defined type, or the compiler complains.  But it is a
dangerous game.  If you simply replace the undefined behavior with
requiring an error, you break programs which currently compile: as you
point out, using unsigned long is a frequent extension.  If you require
unsigned long, however, you run the risk of accidentally forcing an
expression to be evaluated as unsigned when it isn't wanted:

    long i = -1 ;
    if ( i < 3000000000 ) ...

doesn't work as expected when I compile it in 32 bit mode.  (No problem
in 64 bit mode.)

As a quality of implementation issue, undefined behavior that is
detectable at compile time should at least get a warning.  I think it
worth a complaint to your vendor.  G++ says:

   bigtype.cc:34: warning: decimal constant is so large that it is unsigned

As a quality of implementation issue, it's just what the doctor ordered.

--
James Kanze             GABI Software             mailto:kanze@gabi-soft.fr
Conseils en informatique orientée objet/
                           Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, Tél. : +33 (0)1 30 23 45 16






Author: stefan@slapeta.com ("Stefan Slapeta")
Date: Tue, 11 Mar 2003 19:55:37 +0000 (UTC)
> The problem here is that you're running
> a program with undefined behavior. The
> reason for having different behavior is
> that existing code was sprinkled with hex
> and octal constants which had their high
> bits set, and which therefore would be
> unrepresentable as int or long.
I could make the same argument about decimal literals that have their high
bits set!

>
> If you have a decimal literal outside the
> range 0 - 32767, then you do not know its
> type on all systems. If you want to fix
> its type, use a suffix.
>
Sometimes you don't have the possibility to modify a part of your source
code (consider generated sources!). With my example I wanted to show that
you currently don't have any chance to provide an operator (generally speaking:
a correct function signature) for ALL integer literals. For me it's not
really satisfying that you have to rely on your compiler implementation on
this point.
A suggestion: decimal integer literals get the leftmost type of int, long
int, unsigned int, unsigned long int, <undefined>.
(this wouldn't change much for decimal literals because everything after
'long int' is currently 'undefined').
Unfortunately, this wouldn't unify the treatment of _all_ types of literals.

Stefan


