Topic: Proposed Enhancement to select/case (yes, I know...)
Author: henry@zoo.toronto.edu (Henry Spencer)
Date: 30 Aug 90 16:46:10 GMT
In article <BURLEY.90Aug30030645@world.std.com> burley@world.std.com (James C Burley) writes:
>... how about this: allow ranges (and, perhaps, lists) on case statements.
Such a feature appeared in one draft of ANSI C, and disappeared in the
next. I believe the reason was the usual: there was no implementation
experience with it, and it was a minor convenience rather than a solution
to a serious problem.
--
TCP/IP: handling tomorrow's loads today| Henry Spencer at U of Toronto Zoology
OSI: handling yesterday's loads someday| henry@zoo.toronto.edu utzoo!henry
Author: burley@world.std.com (James C Burley)
Date: 30 Aug 90 10:06:45 GMT
After seeing some of the discussions about what can't (or shouldn't) be added
to C/C++ select/case construct capabilities, I'd like to post my thoughts on
something that might actually be useful and should be easy to implement.
First off, this proposal applies to both C++ and C. I don't suggest adding
any new types; run-time expressions on cases; class types on selects/cases; or
anything like that. Plenty of people have provided excellent explanations of
why those features would not be desirable.
But how about this: allow ranges (and, perhaps, lists) on case statements.
I'll pick a syntax for now -- using brackets, though perhaps someone has a
better idea. Here's an example:
select(foo)                       // Nothing new here.
{
    case ALPHA:                   // Nothing new here.
        ...
    case [BETA:GAMMA]:            // Matches any value for foo where BETA<=foo<=GAMMA.
        ...
    case [:MINIMUM,MAXIMUM:]:     // Matches foo<=MINIMUM or foo>=MAXIMUM.
        ...
    case [EPSILON,OMEGA]:         // like "case EPSILON: case OMEGA:".
        ...
    case [DELTA:PI,TAU]:          // like "case [DELTA:PI]: case TAU:".
        ...
}
To summarize, a case statement may have a comma-separated list of
case-range-exprs (at least one item in the list) within brackets instead of
the usual integral constant expression. Each case-range-expr is either an
integral constant expression (called "int-expr" subsequently), "int-expr:",
":int-expr", or "int-expr:int-expr".
Within a switch construct, only one case-range-expr with the form ":int-expr"
is permitted, and only one with the form "int-expr:" is permitted. If two
case-range-exprs exist, one with each of these forms, "default" is not
permitted (or, it could be interpreted as a null range, described below, if
people want).
If a given case-range-expr effectively specifies a null range, it is considered
a null case range. Null case ranges are permitted. A null range occurs if
the int-expr in ":int-expr" is less than the minimum value representable
by the type of the expression in the select statement; if the int-expr in
"int-expr:" is greater than the maximum value representable; or, for a range
of the form "lowest:highest", lowest>highest, lowest>maximum-value, or
highest<minimum-value.
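The null-range rule above can be modelled in a few lines of C. This is only a sketch of the semantics as the post describes them, not compiler code: the struct `case_range` and the functions `range_is_null` and `range_matches` are hypothetical names, and `long` stands in for whatever type the switch expression actually has.

```c
#include <stdbool.h>

/* Hypothetical model of one case-range-expr from the proposal. */
struct case_range {
    bool has_low;    /* false for the ":int-expr" form */
    bool has_high;   /* false for the "int-expr:" form */
    long low;        /* ignored when has_low is false  */
    long high;       /* ignored when has_high is false */
};

/* True if the range can never match a value of a type whose
 * representable values are type_min..type_max. */
bool range_is_null(struct case_range r, long type_min, long type_max)
{
    long lo = r.has_low ? r.low : type_min;
    long hi = r.has_high ? r.high : type_max;
    return lo > hi || lo > type_max || hi < type_min;
}

/* True if value falls inside the (possibly open-ended) range. */
bool range_matches(struct case_range r, long value)
{
    if (r.has_low && value < r.low)
        return false;
    if (r.has_high && value > r.high)
        return false;
    return true;
}
```

A plain "int-expr" case is the degenerate range with both bounds present and equal.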
Enough technotrivia: the only really useful new features are the ":i", "i:"
range forms to designate things "default" currently cannot, and the "i:j"
form to designate gaps in constants whose actual values are defined by
"someone else" (some #include file). The ability to use commas to make
lists is purely for convenience.
I believe nothing about this proposal is at all difficult to implement in
terms of parsing or generating code. Compilers generating tables would have
to specially handle the open-ended ranges (":i" and "i:"), but my guess is
their implementations of default with such tables are already very close.
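A rough sketch of why table-based code generation should cope: a jump table already needs a bounds check before indexing, and the open-ended forms only change where the two out-of-bounds branches go. The switch being "compiled", the enum `target`, and the function `dispatch` below are all made up for illustration; no real compiler's output is being shown.

```c
/* Hypothetical switch being compiled:
 *   case [:0]   -> TARGET_LOW
 *   case 1      -> TARGET_ONE
 *   case [2:3]  -> TARGET_TWO_THREE
 *   case [6:]   -> TARGET_HIGH
 *   default     -> TARGET_DEFAULT   (reached for 4 and 5)
 */
enum target {
    TARGET_LOW, TARGET_ONE, TARGET_TWO_THREE, TARGET_DEFAULT, TARGET_HIGH
};

#define TABLE_FIRST 1   /* smallest value covered by the table */
#define TABLE_LAST  5   /* largest value covered by the table  */

static const enum target jump_table[TABLE_LAST - TABLE_FIRST + 1] = {
    TARGET_ONE,         /* 1 */
    TARGET_TWO_THREE,   /* 2 */
    TARGET_TWO_THREE,   /* 3 */
    TARGET_DEFAULT,     /* 4: a hole between the ranges */
    TARGET_DEFAULT      /* 5: a hole between the ranges */
};

enum target dispatch(int value)
{
    /* Without open-ended ranges both of these tests would branch
     * to the default target; here each gets its own destination. */
    if (value < TABLE_FIRST)
        return TARGET_LOW;
    if (value > TABLE_LAST)
        return TARGET_HIGH;
    return jump_table[value - TABLE_FIRST];
}
```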
Finally, we'd all be able to write (yeah, here's the good part):
switch(c)
{
    case ['A':'Z','a':'z']:
        // letter
    case ['0':'9']:
        // digit
    case [:' '-1,'~'+1:]:
        // unprintable character
}
This is easier than listing everything using lots of "case" statements, and
(usually) faster than using <ctype.h> functions (macros). I'm not going to
say it's any more portable, however; in fact it is less portable than using
<ctype.h> or just listing all the letters (and digits?) individually. It's
a "hacker's example" of the utility of case ranges; a more realistic example
is hinted at in the first sample in this posting (where the source code
containing the select is not under the same control as that defining the
constants in the case statements).
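For comparison, the portable alternative mentioned above might look like the following in standard C. The enum `char_kind` and the function `classify_char` are hypothetical names chosen for this sketch; only `isalpha`, `isdigit`, and `isprint` come from <ctype.h>, and their answers depend on the current locale.

```c
#include <ctype.h>

enum char_kind { KIND_LETTER, KIND_DIGIT, KIND_UNPRINTABLE, KIND_OTHER };

/* Classifies a character without assuming any particular character set. */
enum char_kind classify_char(int c)
{
    /* The <ctype.h> functions require a value representable as
     * unsigned char (or EOF), so convert before calling them. */
    unsigned char uc = (unsigned char)c;

    if (isalpha(uc))
        return KIND_LETTER;
    if (isdigit(uc))
        return KIND_DIGIT;
    if (!isprint(uc))
        return KIND_UNPRINTABLE;
    return KIND_OTHER;
}
```

The trade-off is the one described in the post: this version makes a library call (or macro expansion) per test, but it does not depend on letters being contiguous.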
A thought: although I still shiver at the idea of allowing float or
double types on a select, the availability of ranges somewhat lessens the
arguments against it: the language could disallow any case specification
(other than, perhaps, the constant 0.) that was not a range. However, once one
has floats, one next wants ways to say "foo<0.", "foo==0.", "foo>0.",
for example, somehow extending the syntax to specify "<" or ">" comparisons
instead of "<=" or ">=". And I think that suggests floats are going too far.
Anyway, the basic idea of case ranges (and lists of cases) comes from Fortran
90, as many of you already know. Please don't assume that it therefore is a
bad idea!
Does anyone know any good reasons NOT to implement some or all of these
features in today's C++ and C compilers, with an eye towards codifying them
in the next ANSI standards for these languages? I wouldn't suggest putting
them into the standards until people had had a couple of years to try them
out in real code...if they're not useful, let's not add the extra baggage; at
least that's my philosophy as a C programmer!
James Craig Burley, Software Craftsperson burley@world.std.com
Author: burley@world.std.com (James C Burley)
Date: 1 Sep 90 07:52:53 GMT
In article <1990Aug31.134248@ee.ubc.ca> mikeb@ee.ubc.ca (Mike Bolotski) writes:
> In article <1990Aug30.164610.3519@zoo.toronto.edu>, henry@zoo.toronto.edu (Henry Spencer) writes:
> > In article <BURLEY.90Aug30030645@world.std.com> burley@world.std.com (James C Burley) writes:
> > >... how about this: allow ranges (and, perhaps, lists) on case statements.
> >
> > Such a feature appeared in one draft of ANSI C, and disappeared in the
> > next. I believe the reason was the usual: there was no implementation
> > experience with it, and it was a minor convenience rather than a solution
> > to a serious problem.
> From the G++ info file:
> Switch Ranges
> =============
> A GNU C++ extension to the switch statement permits range specification
> for case values. For example, below is a concise way to print out
> a function parameter's "character class:"
>     print_char_class (char c)
>     {
>       switch (c)
>         {
>         case 'a'..'z': printf ("lower case\n"); break;
>         case 'A'..'Z': printf ("upper case\n"); break;
>         case '0'..'9': printf ("digit\n"); break;
>         default: printf ("other\n");
>         }
>     }
> Duplicate, overlapping case values and empty ranges are detected and
> rejected by the compiler.
> --
> Mike Bolotski VLSI Laboratory, Department of Electrical Engineering
> mikeb@salmon.ee.ubc.ca University of British Columbia, Vancouver, Canada
Ok, great, then we can all forget about my original recommendation. Right
after the posting I began thinking that the [x:y]: syntax I proposed was
obnoxious, and though it isn't really ambiguous (because "?" isn't by itself
an operator), use of the colon in this new way might bother some people (though
of course it already is used in this way -- consider "case x?y:z:", where x,
y, and z are all constants). I thought maybe "..." or ".." would be a better
separator.
Sure enough, somebody emailed me with the very same comments (not a user of
GNU CC), and we entered into a discussion of various syntax possibilities.
(One issue was that the list feature, as in "case 1,2,3:", would make a second
case I know of where the comma operator is "turned off" to support a different
use of comma and thus requires an expression using the comma operator in that
context to be placed in parens -- the first case is "foo(1,2,3)", a function
invocation.)
So I was considering reposting my proposal with this modified syntax, but
here somebody points out that GNU CC already has it!! This is all that is
needed. As I said (at the end of my original posting), I expect this feature
would be added to the C/C++ standard only if existing practice proved it
useful. GNU CC establishes existing practice; usefulness is established (later)
by comments received by the standards committee, or perhaps seeing more vendors
of other C compilers add the same features to be GNU compatible (due presumably
to demand by customers/marketing).
I still have two questions: does "case ..MINIMUM:" match any value of the
expression less than or equal to MINIMUM, and "case MAXIMUM..:" accordingly
match expr>=MAXIMUM and, if not, would these be useful features?
Also, why didn't anybody point out how stupid it was for me to say that if
both minimum and maximum cases were specified (i.e. "case ..MINIMUM:" and
"case MAXIMUM..:" in a switch), "default" could not be specified! (Sigh, I
forgot about the need to catch "hole" values not specified by "case"
statements for values between MINIMUM and MAXIMUM....) In any case, thanks
for not jumping all over me about that.
Someday I've got to get a UNIX box and start using GNU software.
James Craig Burley, Software Craftsperson burley@world.std.com
Author: mikeb@ee.ubc.ca (Mike Bolotski)
Date: 31 Aug 90 20:42:48 GMT
In article <1990Aug30.164610.3519@zoo.toronto.edu>, henry@zoo.toronto.edu (Henry Spencer) writes:
> In article <BURLEY.90Aug30030645@world.std.com> burley@world.std.com (James C Burley) writes:
> >... how about this: allow ranges (and, perhaps, lists) on case statements.
>
> Such a feature appeared in one draft of ANSI C, and disappeared in the
> next. I believe the reason was the usual: there was no implementation
> experience with it, and it was a minor convenience rather than a solution
> to a serious problem.
Author: henry@zoo.toronto.edu (Henry Spencer)
Date: 1 Sep 90 22:43:36 GMT
In article <1990Aug31.134248@ee.ubc.ca> mikeb@salmon.ee.ubc.ca writes:
> case 'A'..'Z': printf ("upper case\n"); break;
Heh heh. What does this do on an EBCDIC machine? Not what you think!
POSIX 1003.2, which unlike the GNoids has seriously *thought* about the
problem, eventually decided that use of such ranges was inherently
unportable. It also has a problem in that it is Anglocentric: the
intent is presumably to pick up all uppercase alphabetics, but that
is *not* necessarily A through Z. 1003.2 has made some minor extensions
to regular-expression syntax, for example, so you can really say "match
any uppercase alphabetic", rather than saying "match A through Z" and
hoping that's right.
--
TCP/IP: handling tomorrow's loads today| Henry Spencer at U of Toronto Zoology
OSI: handling yesterday's loads someday| henry@zoo.toronto.edu utzoo!henry
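To make the EBCDIC objection concrete, here is a small sketch in C. It hard-codes the EBCDIC code points of the uppercase Latin letters, which sit in three separate runs (A-I at 0xC1-0xC9, J-R at 0xD1-0xD9, S-Z at 0xE2-0xE9), and counts how many values inside the numeric range 'A'..'Z' are not letters at all. The function names are hypothetical, and the code works on raw code points so it gives the same answer on an ASCII host.

```c
#include <stdbool.h>

/* True if code is the EBCDIC encoding of an uppercase Latin letter. */
bool ebcdic_is_upper(unsigned char code)
{
    return (code >= 0xC1 && code <= 0xC9)    /* A..I */
        || (code >= 0xD1 && code <= 0xD9)    /* J..R */
        || (code >= 0xE2 && code <= 0xE9);   /* S..Z */
}

/* Counts the code points in the inclusive numeric range
 * EBCDIC 'A' (0xC1) .. EBCDIC 'Z' (0xE9) that are not letters. */
int ebcdic_nonletters_in_a_to_z(void)
{
    int count = 0;
    for (unsigned code = 0xC1; code <= 0xE9; ++code) {
        if (!ebcdic_is_upper((unsigned char)code))
            ++count;
    }
    return count;
}
```

The range spans 41 code points but only 26 of them are letters, so a range-based "upper case" label would also accept the 15 values that fall in the gaps.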
Author: gwyn@smoke.BRL.MIL (Doug Gwyn)
Date: 2 Sep 90 02:29:25 GMT
In article <1990Aug31.134248@ee.ubc.ca> mikeb@salmon.ee.ubc.ca writes:
>From the G++ info file:
>For example, below is a concise way to print out a function parameter's
>"character class:"
[example deleted]
What the G++ info file doesn't tell you is that that example is a mistake!
Author: rfg@NCD.COM (Ron Guilmette)
Date: 1 Sep 90 17:12:07 GMT
In article <BURLEY.90Aug30030645@world.std.com> burley@world.std.com (James C Burley) writes:
<
<...But how about this: allow ranges (and, perhaps, lists) on case statements.
There are two separate questions here. One concerns case ranges, and the
other concerns lists. I prefer to talk about each separately.
Regarding case ranges, my personal feeling is that this is a good and useful
feature and that it ought to make its way into the final standard for C++.
It is my understanding that this feature was proposed during the ANSI C
deliberations, but that it was rejected by X3J11 (for reasons that I'm not
clear on). Perhaps someone who served on X3J11 could give us a quick
summary about what happened to case ranges in X3J11.
Anyway, ANSI C standard or no, some C compiler vendors do offer this feature.
The only one I'm sure offers it is MetaWare.
In the C++ world, g++ offers case ranges. This should prevent X3J16 members
from rejecting the idea outright because of a lack of prior art.
By the way, I believe that both MetaWare C and g++ implement case ranges via
a syntax like:
case LOW..HIGH:
Regarding "case lists", it seems clear that a simple syntax, i.e.:
case FOO,BAR:
won't work because the separator comma could be parsed as the normal comma
operator, which would mess up everything. James proposes:
case [FOO,BAR]:
which is more verbose than the simple syntax but which still saves a bit of
typing relative to:
case FOO:
case BAR:
The only problem is that the values used to designate cases should (if reason
prevails) still be allowed to be formed from "static constant expressions"
and thus, the problem of the comma being parsed as a comma operator still
plagues this syntax also.
Anyway, it looks like developing a special syntax for "case lists" may be
more trouble than it is worth.
<I believe nothing about this proposal is at all difficult to implement in
<terms of parsing or generating code.
Correct.
<Anyway, the basic idea of case ranges (and lists of cases) comes from Fortran
<90, as many of you already know.
I think not. I'm pretty sure that case ranges appeared in Pascal long
before anybody in FORTRAN land ever thought of the idea.
<Does anyone know any good reasons NOT to implement some or all of these
<features in today's C++ and C compilers, with an eye towards codifying them
<in the next ANSI standards for these languages?
Apparently, Michael Tiemann could not think of any serious problem with
case ranges, so he went forth and implemented them.
<I wouldn't suggest putting
<them into the standards until people had had a couple of years to try them
<out in real code...
Right. Given that this feature has already been in g++ for some time,
you may consider it "prior art".
--
// Ron Guilmette - C++ Entomologist
// Internet: rfg@ncd.com uucp: ...uunet!lupine!rfg
// Motto: If it sticks, force it. If it breaks, it needed replacing anyway.
Author: burley@world.std.com (James C Burley)
Date: 4 Sep 90 11:58:14 GMT
In article <1990Sep1.224336.22846@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
> In article <1990Aug31.134248@ee.ubc.ca> mikeb@salmon.ee.ubc.ca writes:
> > case 'A'..'Z': printf ("upper case\n"); break;
> Heh heh. What does this do on an EBCDIC machine? Not what you think!
I think if you look at my original posting, you'll see a kind of backhand
mention of this issue (I said something about it not being any more portable
than a switch with a bunch of cases, in fact maybe less portable).
Anyway, in case you missed it, the issue has been "decided". GNU C already
implements such a construct; and it is a widely used compiler. When the
next standard begins "happening", it will be up to the committee (with all
our input) to determine whether ranges (and lists) are useful and simple enough
to add, compared to the costs of any possible lack of portability being
introduced.
Meanwhile, I personally would not use the range feature for the above case
except in some kind of quick&dirty throwaway program. I'm looking to it
more as an elegant way to deal with natural (i.e. portable) ranges that occur
in applications, and cases where the switch (not select, as I'm wont to say
as a Fortran-90 victim) statement and its case statements are being written
by someone who does not (or should not/doesn't want to) know the values of
#define constants for the cases (presumably kept in an #include file) but
still wants to handle ranges. Nothing wrong with that, I wouldn't think.
James Craig Burley, Software Craftsperson burley@world.std.com
Author: burley@world.std.com (James C Burley)
Date: 4 Sep 90 12:47:15 GMT
In article <1420@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes:
> In article <BURLEY.90Aug30030645@world.std.com> burley@world.std.com (James C Burley) writes:
<
<...But how about this: allow ranges (and, perhaps, lists) on case statements.
> There are two separate questions here. One concerns case ranges, and the
> other concerns lists. I prefer to talk about each separately.
> ...
> By the way, I believe that both MetaWare C and g++ implement case ranges via
> a syntax like:
>     case LOW..HIGH:
Yes, my "[low:high]:" syntax is lousy anyway, I like the ".." syntax better.
Do Metaware C and/or g++ allow "..LOW" by itself or "HIGH.." by itself?
> Regarding "case lists", it seems clear that a simple syntax, i.e.:
>     case FOO,BAR:
> won't work because the separator comma could be parsed as the normal comma
> operator, which would mess up everything. James proposes:
>     case [FOO,BAR]:
> which is more verbose than the simple syntax but which still saves a bit of
> typing relative to:
>     case FOO:
>     case BAR:
My reading of the ANSI C standard tells me "case FOO,BAR:" is invalid because
the case expression must be an integer constant expression and an integer
constant expression may not (in essence) contain any operators at "assignment"-
level precedence and below (including the comma operator). I even tried this
under THINK C 4.0, and it rejected the comma in that context.
I'm not, however, chomping at the bit to create yet another case where comma
might have a different meaning than otherwise (like "foo(1,2,3);", where the
commas must be interpreted as argument separators), even though that would
be the case only if later we wanted to extend the switch statement to handle
general-expression cases (unlikely, but who knows, maybe PL/I programmers
will take over the ANSI C committee :-).
Since it's purely a convenience feature, as compared to ranges, I'd say let's
forget about lists. They're nice and neat to the "unaided eye", but why invite
hassle with future generations of C any more than we have to?
<Anyway, the basic idea of case ranges (and lists of cases) comes from Fortran
<90, as many of you already know.
> I think not. I'm pretty sure that case ranges appeared in Pascal long
> before anybody in FORTRAN land ever thought of the idea.
My statement was very poorly worded, and further reflects my complete ignorance
of Pascal. What I meant to say was "the basic idea ... COMES TO ME from
Fortran 90...". Wanted to give credit where credit was due: plus, the
obnoxious [1:2]: syntax I picked as a sample comes directly from '90, save for
using brackets instead of parens and adding the trailing colon.
Also, I don't know whether Pascal has the "..LOWER", "UPPER.." range forms that
match x<=LOWER and UPPER<=x, respectively. Those are what really "excite" me;
they're the only part of the whole set of features we've been discussing that
one cannot reliably code around even knowing the actual constant values,
without adding extra "ifs" that some compilers might not know enough to
fold in with their own range-checks on the switch/case domain. Or whatever.
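The "extra ifs" workaround looks roughly like this in today's C. It is only a sketch: MINIMUM and MAXIMUM take their names from the earlier example but the values here are invented, and the enum `verdict` and function `describe` are hypothetical.

```c
/* Stand-ins for constants that would really come from someone
 * else's #include file; the values are made up for this sketch. */
#define MINIMUM 10
#define MAXIMUM 20

enum verdict { V_LOW, V_HIGH, V_TWELVE, V_FIFTEEN, V_HOLE };

enum verdict describe(int foo)
{
    /* These two tests do the work of the open-ended labels. */
    if (foo <= MINIMUM)
        return V_LOW;        /* would be "case ..MINIMUM:" */
    if (foo >= MAXIMUM)
        return V_HIGH;       /* would be "case MAXIMUM..:" */

    switch (foo) {
    case 12:
        return V_TWELVE;
    case 15:
        return V_FIFTEEN;
    default:
        return V_HOLE;       /* values strictly between the bounds */
    }
}
```

Whether a compiler merges the two explicit comparisons with the bounds check it generates for the switch itself is up to the implementation, which is the concern raised above.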
<Does anyone know any good reasons NOT to implement some or all of these
<features in today's C++ and C compilers, with an eye towards codifying them
<in the next ANSI standards for these languages?
> Apparently, Michael Tiemann could not think of any serious problem with
> case ranges, so he went forth and implemented them.
What does FORTH have to do with this? (-:
<I wouldn't suggest putting
<them into the standards until people had had a couple of years to try them
<out in real code...
> Right. Given that this feature has already been in g++ for some time,
> you may consider it "prior art".
So considered! Thanks!
> // Ron Guilmette - C++ Entomologist
> // Internet: rfg@ncd.com uucp: ...uunet!lupine!rfg
> // Motto: If it sticks, force it. If it breaks, it needed replacing anyway.
James Craig Burley, Software Craftsperson burley@world.std.com
Author: hp@vmars.tuwien.ac.at (Peter Holzer)
Date: 4 Sep 90 21:13:50 GMT
burley@world.std.com (James C Burley) writes:
>Anyway, in case you missed it, the issue has been "decided". GNU C already
>implements such a construct; and it is a widely used compiler. When the
>next standard begins "happening", it will be up to the committee (with all
>our input) to determine whether ranges (and lists) are useful and simple enough
>to add, compared to the costs of any possible lack of portability being
>introduced.
No. GCC does not implement that construct (at least version 1.37.1
does not, and I think this is about the newest version (although I heard
about 1.37.92 -- what's that?)), but G++ does.
And the fact that ONE _C++_ compiler implements a feature is not likely to
count as prior art for _C_.
>Meanwhile, I personally would not use the range feature for the above case
>except in some kind of quick&dirty throwaway program. I'm looking to it
>more as an elegant way to deal with natural (i.e. portable) ranges that occur
>in applications, and cases where the switch (not select, as I'm wont to say
>as a Fortran-90 victim) statement and its case statements are being written
>by someone who does not (or should not/doesn't want to) know the values of
>#define constants for the cases (presumably kept in an #include file) but
>still wants to handle ranges. Nothing wrong with that, I wouldn't think.
Agreed. Ranges do have their advantages, and although I do not need
them (I can use if () else if () ..., but then I do not need case (),
either), I would not mind having them in a C compiler (GNU people, are
you listening ?)
--
| _ | Peter J. Holzer | Think of it |
| |_|_) | Technische Universitaet Wien | as evolution |
| | | | hp@vmars.tuwien.ac.at | in action! |
| __/ | ...!uunet!mcsun!tuvie!vmars!hp | Tony Rand |
Author: cowan@marob.masa.com (John Cowan)
Date: 4 Sep 90 15:17:10 GMT
In article <1420@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes:
>Regarding "case lists", it seems clear that a simple syntax, i.e.:
>
> case FOO,BAR:
>
>won't work because the separator comma could be parsed as the normal comma
>operator, which would mess up everything.
Other posters have also asserted this. I don't see it.
K&R2, section A7.19, says "Constant expressions may not contain ... comma
operators." Case expressions are constant expressions. Therefore the
overloading of commas in a "case list" construction is not ambiguous,
although it might confuse humans.
Other posters have also pointed out that ranges of characters such as
'A' .. 'Z' are inherently unportable w/r/t character set. This is true,
but ranges of enums remain safe and useful, since enums are guaranteed to
be assigned values left-to-right increasing by 1. A large enum declaration
may be listed in order of logical groups, and then a range could be used to
test whether or not a particular enum variable had a value within the group.
--
cowan@marob.masa.com (aka ...!hombre!marob!cowan)
e'osai ko sarji la lojban
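The enum case can be sketched in standard C without any range syntax. The token names below are invented for illustration. Consecutive values are only guaranteed when no enumerator is given an explicit initializer, which is the situation assumed here.

```c
/* Hypothetical token kinds, listed in logical groups. */
enum token_kind {
    TOK_PLUS, TOK_MINUS, TOK_STAR, TOK_SLASH,   /* arithmetic operators */
    TOK_LT, TOK_LE, TOK_GT, TOK_GE,             /* comparison operators */
    TOK_IDENT, TOK_NUMBER                       /* everything else      */
};

/* Group boundaries are named once, next to the declaration, so code
 * that tests for a group never spells out individual members. */
#define TOK_ARITH_FIRST   TOK_PLUS
#define TOK_ARITH_LAST    TOK_SLASH
#define TOK_COMPARE_FIRST TOK_LT
#define TOK_COMPARE_LAST  TOK_GE

int is_arithmetic(enum token_kind t)
{
    return t >= TOK_ARITH_FIRST && t <= TOK_ARITH_LAST;
}

int is_comparison(enum token_kind t)
{
    return t >= TOK_COMPARE_FIRST && t <= TOK_COMPARE_LAST;
}
```

With case ranges, each test would become a single label such as "case TOK_ARITH_FIRST..TOK_ARITH_LAST:" inside a switch, and it would stay correct if a new operator were added in the middle of a group.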
Author: karl@haddock.ima.isc.com (Karl Heuer)
Date: 5 Sep 90 00:33:22 GMT
In article <1420@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes:
>Regarding "case lists", it seems clear that a simple syntax, i.e.:
> case FOO,BAR:
>won't work because the separator comma could be parsed as the normal comma
>operator, which would mess up everything.
It would mean that, in the unlikely event that the user *wants* a comma
operator in a case label, he has to parenthesize the expression. No different
from the existing situation with function arguments or initializer lists,
except that it's a minor incompatibility with the existing language.
>Right. Given that this feature has already been in g++ for some time,
>you may consider it "prior art".
Is there any particular reason this got added to g++ but not gcc? It would
seem to be equally useful in both.
Karl W. Z. Heuer (karl@kelp.ima.isc.com or ima!kelp!karl), The Walking Lint
Author: daveg@near.cs.caltech.edu (Dave Gillespie)
Date: 5 Sep 90 00:56:16 GMT
>>>>> On 1 Sep 90 22:43:36 GMT, henry@zoo.toronto.edu (Henry Spencer) said:
> In article <1990Aug31.134248@ee.ubc.ca> mikeb@salmon.ee.ubc.ca writes:
>> case 'A'..'Z': printf ("upper case\n"); break;
> Heh heh. What does this do on an EBCDIC machine? Not what you think!
Here's a solution: In <ctype.h> define macros which expand to the
necessary case ranges and/or lists for various subsets of the target
character set: Lowercase alpha, uppercase alpha, and digits.
A somewhat ugly but workable underlying case syntax would be, e.g.:
case 'a'..'z'; 'A'..'Z'; '_': begin_c_identifier(); break;
One flaw with this is that isalpha() can be defined to be run-time
configurable for different national alphabets, but macros cannot.
Perhaps the "right" extension is to allow arbitrary predicates in
a case label ("case isalpha:"), but I can't think of a really nice
syntax for it, especially if you want to allow macros to be used as
the predicate.
-- Dave
--
Dave Gillespie
256-80 Caltech Pasadena CA USA 91125
daveg@csvax.cs.caltech.edu, ...!cit-vax!daveg
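A version of the macro idea that already works in standard C, with no range or list syntax, is to have each macro expand to a run of ordinary case labels. The macro names CASE_DIGIT, CASE_UPPER, and CASE_LOWER and the two functions are hypothetical; nothing like them is defined by <ctype.h>. Because every character is listed individually, the expansion does not depend on the character set.

```c
/* Each macro expands to a chain of labels; the final colon is
 * supplied at the point of use so it reads like a case label. */
#define CASE_DIGIT \
    case '0': case '1': case '2': case '3': case '4': \
    case '5': case '6': case '7': case '8': case '9'

#define CASE_UPPER \
    case 'A': case 'B': case 'C': case 'D': case 'E': case 'F': case 'G': \
    case 'H': case 'I': case 'J': case 'K': case 'L': case 'M': case 'N': \
    case 'O': case 'P': case 'Q': case 'R': case 'S': case 'T': case 'U': \
    case 'V': case 'W': case 'X': case 'Y': case 'Z'

#define CASE_LOWER \
    case 'a': case 'b': case 'c': case 'd': case 'e': case 'f': case 'g': \
    case 'h': case 'i': case 'j': case 'k': case 'l': case 'm': case 'n': \
    case 'o': case 'p': case 'q': case 'r': case 's': case 't': case 'u': \
    case 'v': case 'w': case 'x': case 'y': case 'z'

/* Nonzero if c may begin a C identifier. */
int starts_c_identifier(int c)
{
    switch (c) {
    CASE_LOWER:
    CASE_UPPER:
    case '_':
        return 1;
    default:
        return 0;
    }
}

/* Nonzero if c is a decimal digit. */
int is_decimal_digit(int c)
{
    switch (c) {
    CASE_DIGIT:
        return 1;
    default:
        return 0;
    }
}
```

Like any compile-time macro, this shares the flaw noted above: it cannot follow a run-time change of national alphabet the way isalpha() can.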
Author: ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe)
Date: 6 Sep 90 10:06:27 GMT
In article <BURLEY.90Sep4054715@world.std.com>, burley@world.std.com (James C Burley) writes:
(quoting someone else)
> I think not. I'm pretty sure that case ranges appeared in Pascal long
> before anybody in FORTRAN land ever thought of the idea.
> Also, I don't know whether Pascal has the "..LOWER", "UPPER.." range forms that
> match x<=LOWER and UPPER<=x, respectively.
Case ranges are not part of ISO standard Pascal *AT* *ALL*.
The new "Pascal Extended" may well have them, but the old standard, no.
Case ranges have been *much* discussed in SigPlan Notices and elsewhere
for the last 20 years. (They _would_ have fitted very nicely into Pascal
if only case labels had been constant set expressions.)
Case ranges *are* however part of Ada, where a "case label" is
    'when' <choice> {| <choice>}...                      (LRM 5.4)
and
    <choice> ::= simple_expression                       (LRM 3.7.3)
              |  simple_expression .. simple_expression
              |  subtype_name              -- defined as a range
              |  range_valued_attribute
              |  'others'
So
    case Ch is
        when 'a' .. 'z' | 'A' .. 'Z' => Letter;
        when '0' .. '9' => Digit;
        when others => NonAlNum;
    end case;
No, Ada does not have ..upper or lower.., however it is always
possible to use
    T'FIRST..upper            for    ..upper
    lower..T'LAST             for    lower..
where T is the type of the case expression. In C or C++ it would
suffice to use
    case INT_MIN .. upper:    for    case .. upper:
    case lower .. INT_MAX:    for    case lower ..:
I don't say it's as pretty, but INT_MIN and INT_MAX are already there.
--
You can lie with statistics ... but not to a statistician.
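The INT_MIN/INT_MAX observation can be checked with a closed-range test written in standard C. The function `in_range` is a hypothetical helper standing in for a "case lo .. hi:" label on an int-valued switch; only INT_MIN and INT_MAX come from <limits.h>.

```c
#include <limits.h>

/* Nonzero if lo <= x <= hi; the closed range a "case lo .. hi:" label
 * would denote for an int-valued switch expression. */
int in_range(int x, int lo, int hi)
{
    return lo <= x && x <= hi;
}

/* The open-ended form "..upper" written as a closed range. */
int at_most(int x, int upper)
{
    return in_range(x, INT_MIN, upper);
}

/* The open-ended form "lower.." written as a closed range. */
int at_least(int x, int lower)
{
    return in_range(x, lower, INT_MAX);
}
```

This relies on the switch expression having type int; for another type the corresponding limits from <limits.h> would be needed instead.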
Author: zlsiial@mcc.ac.uk (A.V. Le Blanc)
Date: 7 Sep 90 09:44:45 GMT
The range notation in case statements was allowed in the later
versions of the ETH Pascal 6000 compiler. Since this compiler
was very influential in the development of other implementations,
there were probably several which followed it in offering this
(non-standard) extension.