Thread

Topic: Variants

Author: reindorf@us-es.sel.de (Charles Reindorf)
Date: Mon, 24 May 93 12:23:24 GMT

I read with interest the discussions about variants.

I do not remember reading about the combinatorial problems in dealing
with statements containing many variants: If a statement contains m
different variant values of a variant with n classes, the compiler
effectively expands this into a multi-select with n^m alternatives. In
most cases, m is very small (1 or 2) so this is not much of a problem
in the domain for which variants are intended. But this is a proposal
which is going to go out into the big, wide world. What is going to
happen when somebody chucks in a piece of code which results in a
combinatorial explosion?
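
To put numbers on the concern, here is a minimal C++ sketch of the case count. The helper name multi_select_cases is hypothetical and not part of the proposal; it only assumes that a statement mixing m variant operands, each ranging over n types, is expanded into every combination.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of alternatives in a fully expanded
// multi-select over m operands, each a variant of n types (n^m).
std::uint64_t multi_select_cases(std::uint64_t n, unsigned m) {
    std::uint64_t cases = 1;
    for (unsigned i = 0; i < m; ++i) {
        // Guard against overflow of the 64-bit counter.
        assert(n == 0 || cases <= UINT64_MAX / n);
        cases *= n;
    }
    return cases;
}
```

With n = 3, two operands give 9 cases, but ten operands already give 59049.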

Yours Interestedly,

Charles Reindorf

Author: maxtal@physics.su.OZ.AU (John Max Skaller)
Date: Thu, 27 May 1993 19:37:13 GMT

In article <1993May24.122324.3662@us-es.sel.de> reindorf@us-es.sel.de (Charles Reindorf) writes:
>I read with interest the discussions about variants.
>
>I do not remember reading about the combinatorial problems in dealing
>with statements containing many variants: If a statement contains m
>different variant values of a variant with n classes, the compiler
>effectively expands this into a multi-select with n^m alternatives. In
>most cases, m is very small (1 or 2) so this is not much of a problem
>in the domain for which variants are intended. But this is a proposal
>which is going to go out into the big, wide world. What is going to
>happen when somebody chucks in a piece of code which results in a
>combinatorial explosion?
>

 The compiler will optimise it so much better than
the programmer could have by hand.

--
        JOHN (MAX) SKALLER,         INTERNET:maxtal@suphys.physics.su.oz.au
 Maxtal Pty Ltd,      CSERVE:10236.1703
        6 MacKay St ASHFIELD,     Mem: SA IT/9/22,SC22/WG21
        NSW 2131, AUSTRALIA

Author: maxtal@physics.su.OZ.AU (John Max Skaller)
Date: Thu, 20 May 1993 20:27:17 GMT

In article <9314018.9546@mulga.cs.mu.OZ.AU> fjh@munta.cs.mu.OZ.AU (Fergus James HENDERSON) writes:
>
>Yes, you can catch don't care types using "...".
>
>I'm not sure whether the proposal says anything explicit about using "..."
>in *multi-selects*, but it would make sense, so it should be made explicit
>if it's not already so.

 Mm. I bungled *my* answer to this. The idea is not to
push the language definition: I allowed '...' in single selects
because it's allowed in function calls.

 Have to think about this for multi-selects: they are
really intended to work the same way as function calls.

 select( xxxxxx ) <----arguments
 {
  type( yyyyyy ) <---- parameters
  { zzzzz }    <----- function body (nested function at that)
 }

>
>>  v1 = (V1)v2;   // if not, does this help??
>
>No. Variants are sets of types, not types, so you can't cast to a variant.

 Sure you can:-)

 The basic rule is for initialisation, and parameter passing
and casting are nothing more than initialisation.

 The example is wrong because the case of 'MyClass ..'
can't cast to either int or long.

>
>>  i=(V3)v2;      // if the last line is illegal, does this help?
>
>No, you can't cast to a variant.
>
>Essentially, variants don't support assignment.

 Essentially, variants *only* support initialisation,
which is 'type-binding' plus ordinary object initialisation,
plus, they support all other operations by generating
code for all the cases and selecting on the bound type.

 Sort of like a pair: (tag,union)

 enum varytag {tagT1, tagT2, tagT3};
 union {
  T1 t1;
  T2 t2;
  T3 t3;
 };

 const varytag tag = tagT1;  // bound once, at initialisation
 t1=initvalue_of_type_T1;

The select statement is just like a switch on the tag.
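
As a rough illustration of the (tag, union) picture, here is a hand-written C++ sketch. It is not the proposal's implementation: the class name Num, its member names and describe() are invented for the example, and const char* replaces char* so that string literals are accepted.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for `variant Num [int, long, char*]`.
class Num {
public:
    enum Tag { Int, Long, Str };

    // Initialisation binds the tag; it is const, so it never changes.
    Num(int v) : tag_(Int) { u_.i = v; }
    Num(long v) : tag_(Long) { u_.l = v; }
    Num(const char* v) : tag_(Str) { u_.s = v; }

    Tag tag() const { return tag_; }

    // The "select": an ordinary switch on the tag.
    std::string describe() const {
        switch (tag_) {
            case Int:  return "int " + std::to_string(u_.i);
            case Long: return "long " + std::to_string(u_.l);
            case Str:  return std::string("char* ") + u_.s;
        }
        return "unreachable";
    }

private:
    const Tag tag_;
    union { int i; long l; const char* s; } u_;
};
```

Because tag_ is const, the compiler-generated copy assignment is unavailable, which loosely mirrors the rule that a variant's tag cannot change after initialisation.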

Well: the above three ideas are it. The rest more or less follows :-)
So you *can* assign to a variant ... it's just another operation,
and it selects on both the source and destination. But you can't
change the tag.

 For example:

 variant Num [int, long , char*];
 Num n = "Hello";

 g(Num n){
  n = 5; // error, can't assign 5 to a char *

Be careful:

 n = 0 ; // OK!, sets pointer to 0

Note:

 Num x = 1;
 g(Num x){
  x = 5; // error: can't assign 5 to a char *

BUT:
 Num x = 1;
 x = 5; // OK: x is known to be an int.
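
The "assignment selects on the destination but cannot change the tag" rule can be approximated with C++17's std::variant. This is only a sketch under stated assumptions: NumVariant and assign_keeping_type are invented names, and where the proposal would reject an impossible case at compile time, this version reports it at run time by returning false.

```cpp
#include <cassert>
#include <type_traits>
#include <variant>

// Stand-in for `variant Num [int, long, char*]` (const char* so that
// string literals are accepted).
using NumVariant = std::variant<int, long, const char*>;

// Assign src to whatever dst currently holds, never changing the held
// type. Returns false when the held type cannot accept src.
template <class T>
bool assign_keeping_type(NumVariant& dst, const T& src) {
    return std::visit([&src](auto& held) -> bool {
        using Held = std::decay_t<decltype(held)>;
        if constexpr (std::is_assignable_v<Held&, const T&>) {
            held = src;
            return true;
        } else {
            return false;
        }
    }, dst);
}
```

A NumVariant initialised with "Hello" refuses 5 and stays a pointer, while one initialised with 1 accepts 5 and stays an int.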

>A variants type tag
>is set when it is initialized and it can't be modified afterwards.
>Variants are a bit like references in this respect.

 Yes. Except, if the type is known, there needn't be
any tag. In fact, try this:

 declare x = 1;

where 'declare' is a system defined variant for any type at all!
What type is 'x'? An int of course!
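
For comparison, later C++ standards (C++11 onward) provide this kind of deduction under the keyword auto. The short sketch below only shows that the deduced objects are ordinary ints and longs; the function name is made up for the example.

```cpp
#include <cassert>
#include <type_traits>

// `auto` deduces the declared type from the initialiser at compile time.
inline long sum_of_deduced() {
    auto x = 1;    // deduced as int
    auto y = 1L;   // deduced as long
    static_assert(std::is_same<decltype(x), int>::value, "x is an ordinary int");
    static_assert(std::is_same<decltype(y), long>::value, "y is an ordinary long");
    return x + y;
}
```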
>
>BUT you can use a variant to *implement* a discriminated union class
>that *does* support assignment :-)
>More on this later...

Author: maxtal@physics.su.OZ.AU (John Max Skaller)
Date: Thu, 20 May 1993 21:00:32 GMT

In article <1993May19.171939.2048@rcmcon.com> rmartin@rcmcon.com (Robert Martin) writes:
>maxtal@physics.su.OZ.AU (John Max Skaller) writes:
>
>>In article <1993May17.163017.7055@uunet.uu.net!rcm> rmartin@uunet.uu.net!rcm (Robert Martin) writes:
>
>Aha!  I begin to see what your viewpoint is.  A variant is a way of
>coalescing a set of types under a single "name".  It's as if you
>could give all those types a common base class, except that you don't
>supply any interfaces for them.

 Half way there :-)

 The "name" is irrelevant.

 A variant is a way of declaring a symbol to be of one or
more types .. to be bound later.

 That is:

 f(variant [int, long] x) { .. }

is perfectly legal and equivalent to

 variant Num [int, long];
 f(Num x) { ... }
>
>Let's take one step forward on this.  Let's say that you COULD supply
>interfaces for variants.

 Well, I'll read on .. but the idea is that a variant
is a deferred binding to a particular type: at run time,
a symbol declared variant is used as if it had been declared
as the type it actually is.

 In fact, if the compiler can deduce the type, it
doesn't bother with the variant: type inference.

 variant [int, long] x = 1L;
 ... x ...

'x' is a long. An ordinary long. Not special in any way.
If you wrote:

 long x = 1L;
 ... x ....

no one would be the wiser. The deferred binding only occurs
when you have variant parameters to functions... and not even
then, because a function with a variant parameter is exactly
like several *ordinary* overloaded functions.

So if you get the idea variants are just a sort of smart macro ..
you would be right. That's what they really are :-)
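
The "smart macro" reading, one definition expanded into several ordinary overloads, resembles a constrained function template in C++17. In this sketch the names is_num and twice are hypothetical; is_num plays the role of `variant Num [int, long]`.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for `variant Num [int, long]`: true only for
// the listed types.
template <class T>
constexpr bool is_num = std::is_same_v<T, int> || std::is_same_v<T, long>;

// One source-level definition; the compiler stamps out twice(int) and
// twice(long) on demand and rejects any other argument type.
template <class T, class = std::enable_if_t<is_num<T>>>
T twice(T x) {
    return x + x;
}
```

Each call is resolved entirely at compile time, so no tag is stored or tested.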
>
>Variant X [A, B, C]
>{
>   void f()
>   {
>       Select
>       {
>         A: {...}
>         B: {...}
>         C: {...}
>       }
>   }
>};
>
>This allows you to specify "member functions" or interfaces for
>variants.  I would strongly recommend that Select statements can only
>be used within the "member functions" of Variants.  This localizes all
>the code that deals with the variant to a single place, and mollifies
>my fears about hunting down Selects which are scattered through the
>code.

 It does, but that defeats the purpose of variants, which
is precisely to *enable* spreading the code all round the place.

 There's a paradigm shift needed here: every advantage
of classes is a disadvantage of variants. Conversely,
each disadvantage of classes is an advantage of variants.
(Well, not all: they have type safety in common :-)

 The point of variants, then is to provide a facility
that allows doing things that classes are not good at.

 You can spread code for classes all round the place too:
what else are global functions that have class types as
parameters?

 Well, at least with the select statement those
functions are

 a) anonymous
 b) nested

so they don't cause namespace pollution, they are sensitive to
context, and they don't violate encapsulation of the component
classes.

 If you argue that the encapsulation of the *variant*
is violated I say: no it's not: variants don't have
*any* encapsulation. They're not even types: no two variants
can be of the same type because of this.

>Yes, I understand this now.  Variants are type safe as long as there
>are no "default" or "ignore" clauses in Select statements.  I.e. every
>type within the variant must be expressed within every Select
>statement.

 They are type safe even then. But the default clause
in a select statement voids a guarantee of *completeness*.

 I think this is like providing default return
values for virtual functions. It is not a good idea.

 (Default values for virtual functions guarantee
that no derived class can possibly be a subclass: it
ensures the class invariant of the base is voided if
the virtual is overridden.)
>
>If Selects can only occur within "member functions" of a variant.
>And if Selects must have clauses for all types within the variant,
>then I think that Variants could be a safe and useful feature.

 Selects, and all *other* uses that expand to selects,
must have 'coverage' of all cases. There is a 'default'
to defeat this: call it the 'goto' of variants, or
the 'downcast' of variants if you like :-)
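
The difference between full coverage and a default clause can be seen with C++17's std::variant and std::visit, used here purely as an analogy to select. The types Circle, Square and Triangle and both function names are invented for the sketch.

```cpp
#include <cassert>
#include <string>
#include <variant>

struct Circle {};
struct Square {};
struct Triangle {};
using Shape = std::variant<Circle, Square, Triangle>;

// Common helper idiom: merge several lambdas into one overload set.
template <class... Fs> struct overloaded : Fs... { using Fs::operator()...; };
template <class... Fs> overloaded(Fs...) -> overloaded<Fs...>;

// Full coverage: removing any one handler makes std::visit fail to compile.
std::string name_of(const Shape& s) {
    return std::visit(overloaded{
        [](const Circle&)   { return std::string("circle"); },
        [](const Square&)   { return std::string("square"); },
        [](const Triangle&) { return std::string("triangle"); }
    }, s);
}

// Catch-all handler: compiles for any Shape, so a type added later is
// silently treated as "other".
std::string circle_or_other(const Shape& s) {
    return std::visit(overloaded{
        [](const Circle&) { return std::string("circle"); },
        [](const auto&)   { return std::string("other"); }
    }, s);
}
```

Both functions are type safe; only the first keeps the completeness guarantee.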

 Putting the selects in member functions misses
the point. Think of a variant as an inside out class:
it is the component *types* that are encapsulated: the
interface is already completely determined by those types.
Selects are just ordinary public accesses.

>Especially when using third party classes, or classes which are
>already closed.  It allows new abstractions involving unrelated
>classes to be created without invading the already existing class
>structures.  It is polymorphism from the outside in, or polymorphism
>"after the thought".

 Yes. That's getting hot. It's classes all backwards :-)
And, there *is* late binding involved here, right?

 Even better for the analogy: classes have non-virtual
functions, and even global functions overload .. but there
is no late binding in this overloading, right?

 Well same for variants:

 variant [int, long] x =1;

See? No late binding in this case: that 'x' is a real, honest
to goodness int. No cases are generated for 'long'.
The overloading resolution was done entirely at compile time.
Exactly as it is for:

 f(int);
 f(long);
 f(1);

>
>Now, however, let me try to express the concept of variants by using
>standard C++ inheritance.....
>
>Variant V [A,B]
>{
>  void f()
>  {
>    A: {this.X();}
>    B: {this.Y();}
>  }
>};
>
>------------------------------------------
>
>class ABVariant
>{
>  virtual void f() const = 0;
>};
>
>class ABVariant_A : public ABVariant
>{
>  public:
>    ABVariant_A(A* theA) : itsA(theA) {}
>    virtual void f() const {itsA->X();}
>  private:
>    A* itsA;
>};
>
>class ABVariant_B : public ABVariant
>{
>  public:
>    ABVariant_B(B* theB) : itsB(theB) {}
>    virtual void f() const {itsB->Y();}
>  private:
>    B* itsB;
>};
>
>So, it looks like we can implement variants, with the constraints that
>I mentioned above, by using regular inheritance.....    What have I
>missed?

You can manually implement variants with unions and tags too.

What you missed is that variants -- without the constraints --
are just mechanical code writing (like templates).

There is no magic in them, they just save REAMS AND REAMS of
code (see your example above :-)

Basically, the compiler does type inference and argument
matching of a whole lot of cases for you automatically.
It lets you get on with the job of writing application
oriented code, not fiddling with implementation techniques.

Here's the key question: what's the alternative?

Your technique above, and my advertised discriminated union
idiom, are not any answer in practice: too clumsy. Agree?

So the answer is: downcasting and RTTI.


Author: decker@cs.sunysb.edu (David V. Ecker)
Date: 21 May 1993 03:38:06 GMT

I am looking for some UNIX, C and C++ books.
So I can become a wizard not just a stupid programmer. :-)

If you know of any.. Please e-mail...( decker@sbcs.sunysb.edu)

Thanks for your help,
David.

Author: maxtal@physics.su.OZ.AU (John Max Skaller)
Date: Fri, 21 May 1993 14:54:31 GMT

In article <1993May20.160011.3965@rcmcon.com> rmartin@rcmcon.com (Robert Martin) writes:
>cok@acadia.Kodak.COM (David Cok) writes:
>>Given the cross product of a lot of functions and a lot of classes ...
>
>>if you want to add a new class
>> it is relatively easy using inheritance, since all the stuff for the
>>  new class is together
>> it is relatively hard if one is using variants, since all the variants
>>  must be found and new cases added
>
>>if you want to add a new function
>> it is relatively easy using variants, since all the code for that
>>  function is in one place
>> it is relatively hard if one is using inheritance, since one must find
>>  all the relevant classes and add a function
>
>This is an excellent summary of the issue.  This post, and the many
>other patient postings by net members have finally led me to a partial
>understanding.  I think I understand now what is motivating John et.
>al. to propose variants.

 Good, because there are a couple of thorny issues with variants,
and we need help with them : one issue is technical, and the other
political.

 The technical issue is: variants can't change type.
That appears to restrict their utility. Dunions can change type,
but they have other 'ugly' properties. (Variants are 'ideologically
pure' :-)

 Well, there appears to be a way to fix this by
encapsulating a variant in a class, thus forming an actual
type that is a discriminated union. A template for this
would make discriminated unions accessible to the public,
but you could always define *your* favourite form of
discriminated union if you wanted : its just another
class :-).
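
A sketch of such a wrapper template, written against C++17's std::variant rather than the proposed variants. The class name Dunion and its members rebind, holds and get are assumptions made for illustration, not anything defined in the proposal.

```cpp
#include <cassert>
#include <utility>
#include <variant>

// Hypothetical wrapper: a real type whose held alternative may change.
template <class... Ts>
class Dunion {
public:
    template <class T>
    explicit Dunion(T v) : v_(std::in_place_type<T>, std::move(v)) {}

    // Re-initialise: destroy the current value and bind a new type.
    template <class T>
    void rebind(T v) { v_.template emplace<T>(std::move(v)); }

    template <class T>
    bool holds() const { return std::holds_alternative<T>(v_); }

    // Throws std::bad_variant_access if T is not the held type.
    template <class T>
    const T& get() const { return std::get<T>(v_); }

private:
    std::variant<Ts...> v_;
};
```

Because the type change goes through one named member, callers can see where a re-binding happens.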

 The political issue, of course, is to convince
ourselves, and then the committee, that variants are
an essential language extension. :-) :-) :-)

(Yes, that statement rates at least three smileys )
>
>My question now becomes: "Can variants be implemented using standard
>C++ constructs."

 No. Individual variants can be created and used idiomatically
as if they were real ones. That is, for any given code using
variants you can always write equivalent non-varianted C++ code.

 That's exactly why variants *might* be accepted by
the committee: they are more or less a pre-processing job.
the committee: they are more or less a pre-processing job.

>i.e. is it true that the following two constructs
>are isomorphic, and is it true that they can both be extended without
>limit.
>
>        variant v [A,B];        | Class ABVariant
>        void f(v)               | {
>        {                       |    public:
>           select(v)            |      virtual void f() = 0;
>           {                    | };
>             type (A) {DoA(v);} | class ABVariant_A : public ABVariant
>             type (B) {DoB(v);} | {
>           }                    |    public:
>        }                       |      ABVariant_A(A& a) : itsA(a) {}
>                                |      virtual void f() {DoA(itsA);}
>                                |    private:
>                                |      A& itsA;
>                                | };
>                                | class ABVariant_B : public ABVariant
>                                | {
>                                |   public:
>                                |     ABVariant_B(B& b) : itsB(b) {}
>                                |     virtual void f() {DoB(itsB);}
>                                |   private:
>                                |     B& itsB;
>                                | };

These look more or less equivalent, that is,

 f(ab); // variant case
 ab.f(); // class case

will do the same thing **in this case**. The difference is
how you perform the extensions: with variants, the extensions
are done by the compiler automatically, on the fly.

For example:

 class A; class B;
 int g(A){ ...}
 int g(B){ ...}

 int h(variant [A,B] x) {
  return g(x)+1;
 };

Notice I didn't bother with the select statement? There's one there
all the same: it's compiler generated. But wait, there really
isn't a compiler generated select, it's just function overloading.
Am I confused? No, there's no difference :-)
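
The h/g example has a close counterpart in C++17: a generic lambda passed to std::visit is instantiated once per alternative, and each instantiation picks its g by ordinary overload resolution. The struct members and return values below are made up so the sketch has observable results.

```cpp
#include <cassert>
#include <variant>

struct A { int a; };
struct B { int b; };

int g(const A& x) { return x.a; }
int g(const B& x) { return 10 * x.b; }

// Counterpart of `int h(variant [A,B] x) { return g(x)+1; }`.
// No hand-written select: the compiler generates one case per type.
int h(const std::variant<A, B>& x) {
    return std::visit([](const auto& v) { return g(v) + 1; }, x);
}
```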

So what are variants? They are a way of telling the compiler
to automatically generate all the cases for you: a sort
of compile time 'for loop' specified from the *inside out*.

Interestingly, your 'equivalent' to the select statement
demonstrates a use where there is *NO TYPE TAG*, or rather,
the virtual table acts as the type tag.

If you can imagine the code on the right being generated
automatically *for each use of a variant*, then yes,
they seem to be equivalent.

That is: the declaration

 variant AB [A,B];

doesn't create the class you describe: that happens each time
a select statement is encountered. Thus each use of
a variant is distinct from each other use.


Author: fjh@munta.cs.mu.OZ.AU (Fergus James HENDERSON)
Date: Sat, 22 May 1993 12:28:35 GMT

rmartin@rcmcon.com (Robert Martin) writes:

>My question now becomes: "Can variants be implemented using standard
>C++ constructs."   i.e. is it true that the following two constructs
>are isomorphic, and is it true that they can both be extended without
>limit.
>
>        variant v [A,B];        | Class ABVariant
>        void f(v)               | {

I think that should be
  variant V [A, B];
  void f(V v)

>        {                       |    public:
>           select(v)            |      virtual void f() = 0;
>           {                    | };
>             type (A) {DoA(v);} | class ABVariant_A : public ABVariant
>             type (B) {DoB(v);} | {
>           }                    |    public:
>        }                       |      ABVariant_A(A& a) : itsA(a) {}
>                                |      virtual void f() {DoA(itsA);}
>                                |    private:
>                                |      A& itsA;
>                                | };
>                                | class ABVariant_B : public ABVariant
>                                | {
>                                |   public:
>                                |     ABVariant_B(B& b) : itsB(b) {}
>                                |     virtual void f() {DoB(itsB);}
>                                |   private:
>                                |     B& itsB;
>                                | };

I'm not quite sure exactly what isomorphism you are considering:
they don't have the same interface, I don't see the direct correspondence.

The example on the left is more simply written as
        variant V [A,B];
        void f(A v) {DoA(v);}
        void f(B v) {DoB(v);}

--
Fergus Henderson                     This .signature virus might be
fjh@munta.cs.mu.OZ.AU                getting old, but you still can't
                                     consistently believe it unless you
Linux: Choice of a GNU Generation    copy it to your own .signature file!

Author: g2devi@cdf.toronto.edu (Deviasse Robert N.)
Date: Sun, 23 May 1993 00:03:21 GMT

> Sorry, I don't think I made myself clear: I didn't mean make any assignment
> for variants be reinitialization, I meant make *memberwise* assignment
> for variants be reinitialization. When is memberwise assignment used?
> Only to generate a default assignment operator, if there is no user-defined
> assignment operator for a type.
>
> In other words, I was suggesting that perhaps you could change the rule for
> compiler-generated assignment operators to use reinitialization
> for variant members, and member-wise assignment for non-variant members.
> The reinitialization would only apply to variants contained in structs
> or classes.
>
>


Hmmm, I don't really like the idea. Although it is convenient, it adds yet
another inconsistency to C++. In normal C++:
     struct { int x; } a={1},b={2L};
     int x=1,y=2L;
     x=y;
     a=b;
both x and a.x have the same value. However, with reinitializing assignments:
     variant INTEGER[int,long];
     struct { INTEGER x; } a={1},b={2L};
     INTEGER x=1,y=2L;
     x=y;
     a=b;
both x and a.x *do not* have the same value. Why is this important?
Well, the type safety of the select statement would be invalidated!
Consider:
     variant INTEGER[int,long];
     struct { INTEGER x; } a={1},b={2L};
     INTEGER &x=a.x;

     select(a.x){
        type(int& q) { a=b; /* Now q points to junk unless int==long!! */ }
        type(...) {}
     }


This problem *cannot* happen with the current definition of variants since
the type is always constant. This does mean that dunions have some safety
problems, but I'm willing to live with those since they are hacks/idioms and
idioms often have safety problems -- no problem as long as the problems occur
rarely and are documented. Variants, however, would be part of the language
definition (if accepted) and I wouldn't be willing to accept adding a language
feature that was inherently unsafe but appeared to be safe.
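
The hazard can be observed, without actually touching the stale object, using C++17's std::variant, which does allow assignment to change the held type. This is a stand-in for the reinitializing assignment under discussion, not the proposed variants, and the function name is invented.

```cpp
#include <cassert>
#include <variant>

// Returns true when re-binding changed the held type, i.e. when any
// pointer or reference taken to the old int alternative would dangle.
inline bool rebinding_changes_type() {
    std::variant<int, long> a = 1;          // holds int
    const int* q = std::get_if<int>(&a);    // view of the int alternative
    const bool had_int = (q != nullptr);
    a = 2L;                                 // re-binds to long; q must not be used again
    return had_int && std::get_if<int>(&a) == nullptr;
}
```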


Take care
     Robert

--
/----------------------------------+------------------------------------------\
| Robert N. Deviasse               |"If we have to re-invent the wheel,       |
| EMAIL: g2devi@cdf.utoronto.ca    |  can we at least make it round this time"|
+----------------------------------+------------------------------------------/

Author: rmartin@uunet.uu.net!rcm (Robert Martin)
Date: Mon, 17 May 1993 16:30:17 GMT

fjh@munta.cs.mu.OZ.AU (Fergus James HENDERSON) writes:

>When I first read John Skaller's paper on variants last year, I was
>basically quite skeptical.  [...]  Now that I have experienced the
>problem first-hand, I'm convinced that it addresses a genuine need.

>[...] I think that some facility for discriminated unions is a
>fundamental feature that is missing from C++, and until it is fixed
>people will continue to write ugly, difficult-to-maintain, and
>sometimes even non-portable hacks to get around the problem.

I remain unconvinced.  As far as I can see, an inheritance hierarchy
IS a discriminated union.  A pointer to a base class can point to any
of its derived classes.  The code which you would put into a select
statement can be placed in virtual functions.

The semantics are not identical obviously.  A variant can unify two
disparate types, whereas a pointer to a base class can only unify its
derivatives.  Still, I am not sure there is a big net gain when you
compare variants to inheritance hierarchies.

Also, I fear the concept of the select.  I don't like the notion that
users could be making decisions about my types without my knowledge.
I would rather manage my own types.

A worse fear has to do with extending variants.  How do I find all the
select statements when I want to add a new type to a variant?
--
Robert Martin      | Design Consulting       |Training courses offered:
R.C.M. Consulting  | rcm!rmartin@uunet.UU.NET| Object Oriented Analysis
2080 Cranbrook Rd. | Tel: (708) 918-1004     | Object Oriented Design
Green Oaks IL 60048| Fax: (708) 918-1023     | C++

Author: rmartin@uunet.uu.net!rcm (Robert Martin)
Date: Mon, 17 May 1993 16:34:14 GMT

maxtal@physics.su.OZ.AU (John Max Skaller) writes:

>In article <C6tswt.68t@world.std.com> tob@world.std.com (Tom O Breton) writes:
>>John (MAX):
>>
>>I am wondering where exactly the 'select()' statement would be useful.

> class bplus;
> class uminus;
> variant Node [bplus, uminus, int];
> struct uminus { Node* arg; };
> struct bplus { Node *left, *right; };

> int eval(Node* n){
>  select(n) {
>   type(int *x) { return *x; }
>   type(uminus *u) { return - eval(u->arg); }
>   type(bplus *p) {
>    return eval(p->left) + eval(p->right);
>   }
>  }
> }

This example can be implemented as an inheritance hierarchy in which
Node is the Base class and bplus, uminus and intNode are derivatives.
Node needs a pure virtual function named eval, and the derivatives
implement this function as you have shown.

So why is the variant needed?   Also, why would I want to separate the
'eval' code from the node type?  Why would I want to create an 'eval'
function which would have to be modified for every new type of node
that I created?  What if I forgot to modify it when I created a new node?


Author: fjh@munta.cs.mu.OZ.AU (Fergus James HENDERSON)
Date: Mon, 17 May 1993 21:56:30 GMT

rmartin@uunet.uu.net!rcm (Robert Martin) writes:

>fjh@munta.cs.mu.OZ.AU (Fergus James HENDERSON) writes:
>
>>When I first read John Skaller's paper on variants last year, I was
>>basically quite skeptical.  [...]  Now that I have experienced the
>>problem first-hand, I'm convinced that it addresses a genuine need.
>
>>[...] I think that some facility for discriminated unions is a
>>fundamental feature that is missing from C++, and until it is fixed
>>people will continue to write ugly, difficult-to-maintain, and
>>sometimes even non-portable hacks to get around the problem.
>
>I remain unconvinced.  As far as I can see, an inheritance hierarchy
>IS a discriminated union.  A pointer to a base class can point to any
>of its derived classes.  The code which you would put into a select
>statement can be placed in virtual functions.

There is a big difference in the organization of the code, though.
I think that there are situations where you don't want to
have to add a new virtual function for every new operation,
you just want to do a type switch. If the code for a single
piece of functionality was scattered throughout 45 tiny little
virtual functions in 45 different files, maintenance would be
hell. There are times when what you want to do is to have all the
code for a particular function together in the one place.

>Also, I fear the concept of the select.  I don't like the notion that
>users could be making decisions about my types without my knowledge.
>I would rather manage my own types.

Presuming we do end up with some sort of variants or discriminated
unions in C++, then the users won't be making a decision as to whether
your type Base is actually a Derived1 or a Derived2, they will be
making decisions as to whether their "variant V [Derived1, Derived2, Base]"
is actually a Derived1, Derived2, or just a Base.
(Actually it's not necessary for the different classes to share
a common base class, so it might just as easily be "variant V [T1, T2, T3]"
where T1, T2, and T3 are unrelated types.)
Since the type dependence is *explicit* in the code, maintenance is
reasonably straightforward.

On the other hand if we end up all using RTTI and downcasting,
then the type dependencies will all be hidden deep in a checked downcast
in the bowels of the implementation of some obscure part of the code :-\

>A worse fear has to do with extending variants.  How do I find all the
>select statements when I want to add a new type to a variant?

That one is easy: just add the new type to your variant declaration,
and recompile. The compiler will tell you exactly where all the select
statements are and if you are lucky will even position your cursor on
the exact line where you need to insert each bit of new code :-)


Author: fjh@munta.cs.mu.OZ.AU (Fergus James HENDERSON)
Date: Mon, 17 May 1993 22:23:46 GMT

rmartin@uunet.uu.net!rcm (Robert Martin) writes:

>maxtal@physics.su.OZ.AU (John Max Skaller) writes:
>
>>tob@world.std.com (Tom O Breton) writes:
>>>
>>>I am wondering where exactly the 'select()' statement would be useful.
>
>> variant Node [bplus, uminus, int];
>
>> int eval(Node* n){
>>  select(n) {
>>   type(int *x) { return *x; }
>>   type(uminus *u) { return - eval(u->arg); }
>>   type(bplus *p) {
>>    return eval(p->left) + eval(p->right);
>>   }
>>  }
>> }
>
>This example can be implemented as an inheritance hierarchy in which
>Node is the Base class and bplus, uminus and intNode are derivatives.
>Node needs a pure virtual function named eval, and the derivatives
>implement this function as you have shown.
>
>So why is the variant needed? Also, why would I want to separate the
>'eval' code from the node type?  Why would I want to create an 'eval'
>function which would have to be modified for every new type of node
>that I created?

In a real compiler, there would be many more operations on nodes
than just 'eval'. One operation might be 'convert to pointer',
another might be 'constant-propagate', and so on.
It might well be easier to document and modify these operations
on nodes if all the code for a particular operation is in the
same place. You might want to avoid having to add a new virtual
function every time you add some new code to produce additional
warnings in the type analysis phase.
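
To make the "one operation, one place" organisation concrete, here is a sketch of the Node example using C++17's std::variant as a stand-in for the proposed variant. Apart from eval, Node, UMinus and BPlus, which echo the quoted example, every name (overloaded, neg, add, count_nodes) is an assumption, and std::shared_ptr is used only to allow the recursive type.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <variant>

struct UMinus;
struct BPlus;
using Node = std::variant<int, std::shared_ptr<UMinus>, std::shared_ptr<BPlus>>;
struct UMinus { Node arg; };
struct BPlus { Node left, right; };

// Common helper idiom: merge several lambdas into one overload set.
template <class... Fs> struct overloaded : Fs... { using Fs::operator()...; };
template <class... Fs> overloaded(Fs...) -> overloaded<Fs...>;

// Hypothetical builders for the two compound node kinds.
Node neg(Node a) { return std::make_shared<UMinus>(UMinus{std::move(a)}); }
Node add(Node a, Node b) {
    return std::make_shared<BPlus>(BPlus{std::move(a), std::move(b)});
}

// All of 'eval' lives here, one handler per node type.
int eval(const Node& n) {
    return std::visit(overloaded{
        [](int x) { return x; },
        [](const std::shared_ptr<UMinus>& u) { return -eval(u->arg); },
        [](const std::shared_ptr<BPlus>& p) { return eval(p->left) + eval(p->right); }
    }, n);
}

// A second operation is added without touching the node types.
int count_nodes(const Node& n) {
    return std::visit(overloaded{
        [](int) { return 1; },
        [](const std::shared_ptr<UMinus>& u) { return 1 + count_nodes(u->arg); },
        [](const std::shared_ptr<BPlus>& p) {
            return 1 + count_nodes(p->left) + count_nodes(p->right);
        }
    }, n);
}
```

Adding a new node type to Node makes both functions fail to compile until a handler is supplied, which is the behaviour described for select.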

>What if I forgot to modify it when I created a new node?

Then the compiler would give you a warning.


Author: alanb@sdl.ug.eds.com (Alan Braggins)
Date: 18 May 93 10:50:01 GMT

>>>>> On Mon, 17 May 1993 21:56:30 GMT, fjh@munta.cs.mu.OZ.AU (Fergus James HENDERSON) said:

>>I remain unconvinced.  As far as I can see, an inheritance hierarchy
>>IS a discriminated union.  A pointer to a base class can point to any
>>of its derived classes.  The code which you would put into a select
>>statement can be placed in virtual functions.

> There is a big difference in the organization of the code, though.
> I think that there are situations where you don't want to
> have to add a new virtual function for every new operation,
> you just want to do a type switch. If the code for a single
> piece of functionality was scattered throughout 45 tiny little
> virtual functions in 45 different files, maintenance would be
> hell. There are times when what you want to do is to have all the
> code for a particular function together in the one place.

So put them all in one file then. It is often convenient to have the
code divided between files on a per class basis, but not compulsory.
You do have to remember to add the new function to the existing file
when you add a new class, but the same would be true with variants.

If you don't have access to the source, you can't add another switch
to the variant at all - with inheritance you split between supplied
classes and ones you have derived yourself.

Of course ideally your browser/development environment hides all
this from you anyway...
--
Alan Braggins  alanb@sdl.ug.eds.com +44-223-371608
Shape Data (a division of EDS-Scicon), 46 Regent St, Cambridge, CB2 1DB, U.K.
    Any technology distinguishable from magic is insufficiently advanced.
Why do people who say "That's just semantics" so rarely know what it means?

Author: maxtal@physics.su.OZ.AU (John Max Skaller)
Date: Tue, 18 May 1993 19:26:36 GMT

In article <1993May17.163017.7055@uunet.uu.net!rcm> rmartin@uunet.uu.net!rcm (Robert Martin) writes:
>fjh@munta.cs.mu.OZ.AU (Fergus James HENDERSON) writes:
>
>>When I first read John Skaller's paper on variants last year, I was
>>basically quite skeptical.  [...]  Now that I have experienced the
>>problem first-hand, I'm convinced that it addresses a genuine need.
>
>>[...] I think that some facility for discriminated unions is a
>>fundamental feature that is missing from C++, and until it is fixed
>>people will continue to write ugly, difficult-to-maintain, and
>>sometimes even non-portable hacks to get around the problem.
>
>I remain unconvinced.  As far as I can see, an inheritance hierarchy
>IS a discriminated union.  A pointer to a base class can point to any
>of its derived classes.  The code which you would put into a select
>statement can be placed in virtual functions.

 An inheritance hierarchy is invasive: you can't take pre-existing
types and create a new base for them as you please. You can do that
for a union: you can make a new union and put any types into it.
(Variants represent sets of types, similar to a union in some ways)

>
>The semantics are not identical obviously.

 Right. On the contrary, they are completely disparate.
Almost every criterion you can think of is treated opposite
by the two facilities. Gee, a variant isn't even a *type*!

>A variant can unify two
>disparate types, whereas a pointer to a base class can only unify its
>derivatives.

 Yes.

>Still, I am not sure there is a big net gain when you
>compare variants to inheritance hierarchies.

 Variants are safe. And, as far as I can tell,
conceptually correct. They represent selection/alternation/unification.
Inheritance hierarchies don't represent unification, since the
classes start out unified: they represent the opposite:
diversification.

 The 'decision' time is the opposite, if you like.
>
>Also, I fear the concept of the select.  I don't like the notion that
>users could be making decisions about my types without my knowledge.
>I would rather manage my own types.

 I don't understand. The users can't do anything to a type
in a variant that they can't *already* do to the type. Variants
do not enable access violations, they are 100% compatible with
the existing access restrictions and are 100% safe (as far as
anything is safe in C++ .. you can still get dangling pointers :-)

>
>A worse fear has to do with extending variants.  How do I find all the
>select statements when I want to add a new type to a variant?

 Compile the program. You'll get an error every time.
Variants are completely statically type safe. If you don't get
an error, the types were not distinct: you've abused variants.
(You can abuse inheritance..it would not surprise me if there
were abuses of variants too :-)

--
        JOHN (MAX) SKALLER,         INTERNET:maxtal@suphys.physics.su.oz.au
 Maxtal Pty Ltd,      CSERVE:10236.1703
        6 MacKay St ASHFIELD,     Mem: SA IT/9/22,SC22/WG21
        NSW 2131, AUSTRALIA

Author: maxtal@physics.su.OZ.AU (John Max Skaller)
Date: Tue, 18 May 1993 19:30:19 GMT Raw View

In article <1993May17.163414.7144@uunet.uu.net!rcm> rmartin@uunet.uu.net!rcm (Robert Martin) writes:
>maxtal@physics.su.OZ.AU (John Max Skaller) writes:
>
>>In article <C6tswt.68t@world.std.com> tob@world.std.com (Tom O Breton) writes:
>>>John (MAX):
>>>
>>>I am wondering where exactly the 'select()' statement would be useful.
>
>> class bplus;
>> class uminus;
>> variant Node [bplus, uminus, int];
>> struct uminus { Node* arg; };
>> struct bplus { Node *left, *right; };
>
>> int eval(Node* n){
>>  select(n) {
>>   type(int *x) { return *x; }
>>   type(uminus *u) { return - eval(u->arg); }
>>   type(bplus *p) {
>>    return eval(p->left) + eval(p->right);
>>   }
>>  }
>> }
>
>This example can be implemented as an inheritance hierarchy in which
>Node is the Base class and bplus, uminus and intNode are derivatives.
>Node needs a pure virtual function named eval, and the derivatives
>implement this function as you have shown.
>
>So why is the variant needed?

 Write postfix and infix and prefix output as well.

>Also, why would I want to separate the
>'eval' code from the node type?

 Your job is to write a parser to build a syntax tree,
mine to generate code or whatever from the tree. You want to
do my job too? By your method, the classes could not be closed
until both jobs were done.


>Why would I want to create an 'eval'
>function which would have to be modified for every new type of node
>that I created?  What if I forgot to modify it when I created a new node?
>

 The program would not compile.
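
 For comparison, here is a minimal sketch of the same evaluator written
with std::variant and std::visit from later C++ (C++17), which did not
exist when this thread was written. It is not the proposed syntax. The
names Node, UMinus, BPlus, Eval and the helper node() are illustrative
assumptions. If a new alternative is added to Node without a matching
operator() in Eval, the std::visit call no longer compiles, which is the
property claimed above for select.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <variant>

struct UMinus;
struct BPlus;

// Closed set of node kinds, roughly: variant Node [bplus, uminus, int];
using Node = std::variant<int, UMinus, BPlus>;

struct UMinus { std::shared_ptr<Node> arg; };
struct BPlus  { std::shared_ptr<Node> left, right; };

// One overload per alternative, playing the role of the type(...) clauses.
struct Eval {
    int operator()(int x) const { return x; }
    int operator()(const UMinus& u) const {
        assert(u.arg);
        return -std::visit(*this, *u.arg);
    }
    int operator()(const BPlus& p) const {
        assert(p.left && p.right);
        return std::visit(*this, *p.left) + std::visit(*this, *p.right);
    }
};

int eval(const Node& n) { return std::visit(Eval{}, n); }

// Hypothetical helper for building subtrees.
std::shared_ptr<Node> node(Node n) {
    return std::make_shared<Node>(std::move(n));
}
```

 With these definitions, eval(UMinus{node(BPlus{node(2), node(3)})})
evaluates -(2 + 3). A postfix or infix printer would be another struct
like Eval, added without touching UMinus or BPlus.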

--
        JOHN (MAX) SKALLER,         INTERNET:maxtal@suphys.physics.su.oz.au
 Maxtal Pty Ltd,      CSERVE:10236.1703
        6 MacKay St ASHFIELD,     Mem: SA IT/9/22,SC22/WG21
        NSW 2131, AUSTRALIA

Author: rmartin@rcmcon.com (Robert Martin)
Date: Tue, 18 May 1993 15:26:41 GMT Raw View

fjh@munta.cs.mu.OZ.AU (Fergus James HENDERSON) writes:

>rmartin@uunet.uu.net!rcm (Robert Martin) writes:

>>I remain unconvinced.  As far as I can see, an inheritance hierarchy
>>IS a discriminated union.  A pointer to a base class can point to any
>>of its derived classes.  The code which you would put into a select
>>statement can be placed in virtual functions.

>I think that there are situations where you don't want to
>have to add a new virtual function for every new operation,
>you just want to do a type switch. If the code for a single
>piece of functionality was scattered throughout 45 tiny little
>virtual functions in 45 different files, maintenance would be
>hell.

I understand what you are saying here.  I don't quite agree, but I
understand. Changing the 45 functions in each of the derived classes
is a pain.  Worse, I must hunt for those derived classes, since the
compiler won't necessarily tell me where they are.

However, in order for a problem like this to occur, we must be
changing something fundamental at the base level, so we are "opening"
a "closed" base.  We are making a very fundamental change to the
architecture of our application.  And I expect pain to be the result
of such fundamental changes.

What worries me about variants is that adding a new type, something
that should be an "open" process, will cause me to alter "closed" code
(the selects and the variants).   Thus, what ought to be a trivial
extension becomes a major undertaking.  Even though nothing
fundamental changed, I must open working code and add cases for new
variant types.

>On the other hand if we end up all using RTTI and downcasting,
>then the type dependencies will all be hidden deep in a checked downcast
>in the bowels of the implementation of some obscure part of the code :-\

Right, and I fear this just as much.  I am not an advocate of general
use of RTTI.

>>A worse fear has to do with extending variants.  How do I find all the
>>select statements when I want to add a new type to a variant?

>That one is easy: just add the new type to your variant declaration,
>and recompile. The compiler will tell you exactly where all the select
>statements are and if you are lucky will even position your cursor on
>the exact line where you need to insert each bit of new code :-)

This presumes of course that a select statement must have a clause for
each of the types in its variant.  If a "default" or "ignore" clause
is introduced, then we will not have solved this problem.

More fundamental, however, is the problem of how you find the variant
declarations.  Having added a new type to the system, which variants
does it belong in?  And where are those variant declarations located?

-----------------------------------------

Aside:  I am playing devil's advocate, not deriding what may be a
valid proposal.  I am expressing my true concerns, but am open to
seeing those concerns addressed.

--
Robert Martin       | Design Consulting   | Training courses offered:
R.C.M. Consulting   | rmartin@rcmcon.com  |   Object Oriented Analysis
2080 Cranbrook Rd.  | Tel: (708) 918-1004 |   Object Oriented Design
Green Oaks IL 60048 | Fax: (708) 918-1023 |   C++

Author: cok@acadia.Kodak.COM (David Cok)
Date: Wed, 19 May 93 01:38:40 GMT Raw View

In article <1993May18.152641.2061@rcmcon.com> rmartin@rcmcon.com (Robert Martin) writes:
>
>What worries me about variants is that adding a new type, something
>that should be an "open" process, will cause me to alter "closed" code
>(the selects and the variants).   Thus, what ought to be a trivial
>extension becomes a major undertaking.  Even though nothing
>fundamental changed, I must open working code and add cases for new
>variant types.
>

Given the cross product of a lot of functions and a lot of classes ...

if you want to add a new class
 it is relatively easy using inheritance, since all the stuff for the
  new class is together
 it is relatively hard if one is using variants, since all the variants
  must be found and new cases added

if you want to add a new function
 it is relatively easy using variants, since all the code for that
  function is in one place
 it is relatively hard if one is using inheritance, since one must find
  all the relevant classes and add a function

It seems clear to me that the choice of inheritance or variants (if one has
the choice) is a design decision relating in part to what the designer sees
as more stable, the set of functions or the set of classes.

Whether adding a new class (with all the functionality) is trivial or a
major revision or whether adding a new function (for all classes) is trivial
or a major revision, will depend on the problem at hand.  Now if we could
only come up with programming language constructs so that both were trivial...
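
The two organizations can be put side by side in a small sketch. It uses
std::variant and std::visit from later C++ (C++17) as a stand-in for the
proposed variants, and every name in it (Shape, Square, Rect, SquareV,
RectV, ShapeV, Area, Perimeter) is made up for illustration.

```cpp
#include <variant>

// Inheritance: one class per shape, each operation is a virtual function.
// Adding a class touches one place; adding an operation touches every class.
struct Shape {
    virtual ~Shape() = default;
    virtual double area() const = 0;
};
struct Square : Shape {
    double side;
    explicit Square(double s) : side(s) {}
    double area() const override { return side * side; }
};
struct Rect : Shape {
    double w, h;
    Rect(double w_, double h_) : w(w_), h(h_) {}
    double area() const override { return w * h; }
};

// Closed set of alternatives: each operation is one visitor.
// Adding an operation touches one place; adding an alternative
// touches every visitor.
struct SquareV { double side; };
struct RectV   { double w, h; };
using ShapeV = std::variant<SquareV, RectV>;

struct Area {
    double operator()(const SquareV& s) const { return s.side * s.side; }
    double operator()(const RectV& r) const { return r.w * r.h; }
};
// A new operation, added without editing SquareV or RectV.
struct Perimeter {
    double operator()(const SquareV& s) const { return 4 * s.side; }
    double operator()(const RectV& r) const { return 2 * (r.w + r.h); }
};

double area(const ShapeV& s) { return std::visit(Area{}, s); }
double perimeter(const ShapeV& s) { return std::visit(Perimeter{}, s); }
```

Adding a Circle to the first half means writing one new class; adding it
to the second half means revisiting Area and Perimeter, and the compiler
rejects each visitor that lacks a Circle overload.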

David Cok

Author: maxtal@physics.su.OZ.AU (John Max Skaller)
Date: Wed, 19 May 1993 10:05:55 GMT Raw View

In article <1993May18.152641.2061@rcmcon.com> rmartin@rcmcon.com (Robert Martin) writes:
>
>>I think that there are situations where you don't want to
>>have to add a new virtual function for every new operation,
>>you just want to do a type switch. If the code for a single
>>piece of functionality was scattered throughout 45 tiny little
>>virtual functions in 45 different files, maintenance would be
>>hell.
>
>I understand what you are saying here.  I don't quite agree, but I
>understand. Changing the 45 functions in each of the derived classes
>is a pain.  Worse, I must hunt for those derived classes, since the
>compiler won't necessarily tell me where they are.

 I try to think of variants as 'anti-classes'.
As such, they have almost opposite properties of classes.
For example, classes can be closed but remain open.
That is the basis of polymorphism, right?

 But now people have this problem: they are trying
to use polymorphism and inheritance to do things in C++
that just can't be done properly.

 I *share* your view that good object oriented
design will often resolve this problem, but there remain
cases where we appear to need type selection.

 The reason for introducing variants is to
acknowledge that these cases are real and demand a solution
*other* than the mutually abhorred use of RTTI for downcasting.
There are *legitimate* uses of that too, in fact there are
three categories:

 1) Subclassing (infinite derivative implementations,
  single interface)
 2) Selection (finite distinct interfaces)
 3) Dynamic typing (arbitrary distinct interfaces)

(1) we can do now.
(2) is what variants are for.
(3) requires RTTI and is least preferred (but sometimes necessary).

Without a specific language extension for (2) we will be forced
to use the inappropriate and unsafe (3) where a redesign
to (1) is not proper.

>
>However, in order for a problem like this to occur, we must be
>changing something fundamental at the base level, so we are "opening"
>a "closed" base.  We are making a very fundamental change to the
>architecture of our application.  And I expect pain to be the result
>of such fundamental changes.

 Variants are not bases, so the issue of opening them
doesn't arise in the same context.

>
>What worries me about variants is that adding a new type, something
>that should be an "open" process, will cause me to alter "closed" code
>(the selects and the variants).   Thus, what ought to be a trivial
>extension becomes a major undertaking.  Even though nothing
>fundamental changed, I must open working code and add cases for new
>variant types.

 Yes, but variants are designed for when this *is* the
circumstance.

 Unlike RTTI/downcasting, at least the compiler will
ensure that you get it right.

>
>>On the other hand if we end up all using RTTI and downcasting,
>>then the type dependencies will all be hidden deep in a checked downcast
>>in the bowels of the implementation of some obscure part of the code :-\
>
>Right, and I fear this just as much.  I am not an advocate of general
>use of RTTI.
>
>>>A worse fear has to do with extending variants.  How do I find all the
>>>select statements when I want to add a new type to a variant?
>
>>That one is easy: just add the new type to your variant declaration,
>>and recompile. The compiler will tell you exactly where all the select
>>statements are and if you are lucky will even position your cursor on
>>the exact line where you need to insert each bit of new code :-)
>
>This presumes of course that a select statement must have a clause for
>each of the types in its variant.  If a "default" or "ignore" clause
>is introduced, then we will not have solved this problem.
>
>More fundamental, however, is the problem of how you find the variant
>declarations.  Having added a new type to the system, which variants
>does it belong in?  And where are those variant declarations located?
>
>-----------------------------------------
>
>Aside:  I am playing devil's advocate, not deriding what may be a
>valid proposal.  I am expressing my true concerns, but am open to
>seeing those concerns addressed.
>
>--
>Robert Martin       | Design Consulting   | Training courses offered:
>R.C.M. Consulting   | rmartin@rcmcon.com  |   Object Oriented Analysis
>2080 Cranbrook Rd.  | Tel: (708) 918-1004 |   Object Oriented Design
>Green Oaks IL 60048 | Fax: (708) 918-1023 |   C++


--
        JOHN (MAX) SKALLER,         INTERNET:maxtal@suphys.physics.su.oz.au
 Maxtal Pty Ltd,      CSERVE:10236.1703
        6 MacKay St ASHFIELD,     Mem: SA IT/9/22,SC22/WG21
        NSW 2131, AUSTRALIA

Author: maxtal@physics.su.OZ.AU (John Max Skaller)
Date: Wed, 19 May 1993 10:39:22 GMT Raw View

In article <1993May19.013840.7910@kodak.kodak.com> cok@acadia.Kodak.COM (David Cok) writes:
>
>Given the cross product of a lot of functions and a lot of classes ...
>
>if you want to add a new class
> it is relatively easy using inheritance, since all the stuff for the
>  new class is together
> it is relatively hard if one is using variants, since all the variants
>  must be found and new cases added
>
>if you want to add a new function
> it is relatively easy using variants, since all the code for that
>  function is in one place
> it is relatively hard if one is using inheritance, since one must find
>  all the relevant classes and add a function
>

 Nice!

 Variants are opposite to classes, in so many respects,
one might almost have thought I deliberately designed them that way :-)

>
>It seems clear to me that the choice of inheritance or variants (if one has
>the choice) is a design decision relating in part to what the designer sees
>as more stable, the set of functions or the set of classes.

 Yes. I'd like to be able to mix the two as I think appropriate.

>
>Whether adding a new class (with all the functionality) is trivial or a
>major revision or whether adding a new function (for all classes) is trivial
>or a major revision, will depend on the problem at hand.  Now if we could
>only come up with programming language constructs so that both were trivial...
>

 Well, we already have classes, right?

 And I *do* have a proposal for variants that is quite
specific and relatively easy to use, understand and implement.

 The syntax is:

 variant Num [long, int];

 select(num)
 {
  type(long l) { ... }
  type(int i) { ... }
 }

where the 'type' clauses 'inline' functions and the selection
is made by ordinary overload resolution.

 It's easy to use variants: just use them where it's obvious
you need them :-)

 Note: a variant is *not* a type, so what is it?

 Well, it's the declaration of a thing for which
you aren't sure of the type at compile time.

 What happens is that the compiler generates
code for *all* the possible types, then at run time,
it just jumps to the appropriate code.
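
 As a rough modern analogue (an assumption for illustration, not the
proposed syntax), std::variant from C++17 stores a tag next to the value,
and std::visit picks the matching overload of the visitor at run time. The
names Num, Describe and describe are made up for this sketch; the two
operator() overloads stand in for the 'type' clauses.

```cpp
#include <string>
#include <variant>

// Roughly: variant Num [long, int];
using Num = std::variant<long, int>;

// Each overload corresponds to one type(...) clause of the select.
struct Describe {
    std::string operator()(long l) const { return "long " + std::to_string(l); }
    std::string operator()(int i) const { return "int " + std::to_string(i); }
};

// Code exists for both alternatives; the stored tag decides which one runs.
std::string describe(const Num& num) {
    return std::visit(Describe{}, num);
}
```

 Here describe(Num{7L}) takes the long overload and describe(Num{7})
takes the int overload.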

--
        JOHN (MAX) SKALLER,         INTERNET:maxtal@suphys.physics.su.oz.au
 Maxtal Pty Ltd,      CSERVE:10236.1703
        6 MacKay St ASHFIELD,     Mem: SA IT/9/22,SC22/WG21
        NSW 2131, AUSTRALIA

Author: rmartin@rcmcon.com (Robert Martin)
Date: Wed, 19 May 1993 17:19:39 GMT Raw View

maxtal@physics.su.OZ.AU (John Max Skaller) writes:

>In article <1993May17.163017.7055@uunet.uu.net!rcm> rmartin@uunet.uu.net!rcm (Robert Martin) writes:
>>I remain unconvinced.  As far as I can see, an inheritance hierarchy
>>IS a discriminated union.  A pointer to a base class can point to any
>>of its derived classes.  The code which you would put into a select
>>statement can be placed in virtual functions.

> An inheritance hierarchy is invasive: you can't take pre-existing
>types and create a new base for them as you please. You can do that
>for a union: you can make a new union and put any types into it.
>(Variants represent sets of types, similar to a union in some ways)

Aha!  I begin to see what your viewpoint is.  A variant is a way of
coalescing a set of types under a single "name".  It's as if you
could give all those types a common base class, except that you don't
supply any interfaces for them.

Let's take one step forward on this.  Let's say that you COULD supply
interfaces for variants.

Variant X [A, B, C]
{
   void f()
   {
       Select
       {
         A: {...}
         B: {...}
         C: {...}
       }
   }
};

This allows you to specify "member functions" or interfaces for
variants.  I would strongly recommend that Select statements can only
be used within the "member functions" of Variants.  This localizes all
the code that deals with the variant to a single place, and mollifies
my fears about hunting down Selects which are scattered through the
code.

>The users cant do anything to a type
>in a variant they cant *already* do to the type. Variants
>do not enable access violations, they are 100% compatible with
>the existing access restrictions and are 100% safe (as far as
>anything is safe in C++ .. you can still get dangling pointers :-)

Yes, I understand this now.  Variants are type safe as long as there
are no "default" or "ignore" clauses in Select statements.  I.e. every
type within the variant must be expressed within every Select
statement.

If Selects can only occur within "member functions" of a variant.
And if Selects must have clauses for all types within the variant,
then I think that Variants could be a safe and useful feature.
Especially when using third party classes, or classes which are
already closed.  It allows new abstractions involving unrelated
classes to be created without invading the already existing class
structures.  It is polymorphism from the outside in, or polymorphism
"after the thought".

Now, however, let me try to express the concept of variants by using
standard C++ inheritance.....

Variant V [A,B]
{
  void f()
  {
    A: {this.X();}
    B: {this.Y();}
  }
};

------------------------------------------

class ABVariant
{
  public:
    virtual ~ABVariant() {}
    virtual void f() const = 0;
};

class ABVariant_A : public ABVariant
{
  public:
    ABVariant_A(A* theA) : itsA(theA) {}
    virtual void f() const {itsA->X();}
  private:
    A* itsA;
};

class ABVariant_B : public ABVariant
{
  public:
    ABVariant_B(B* theB) : itsB(theB) {}
    virtual void f() const {itsB->Y();}
  private:
    B* itsB;
};

So, it looks like we can implement variants, with the constraints that
I mentioned above, by using regular inheritance.....    What have I
missed?
--
Robert Martin       | Design Consulting   | Training courses offered:
R.C.M. Consulting   | rmartin@rcmcon.com  |   Object Oriented Analysis
2080 Cranbrook Rd.  | Tel: (708) 918-1004 |   Object Oriented Design
Green Oaks IL 60048 | Fax: (708) 918-1023 |   C++

Author: rmartin@rcmcon.com (Robert Martin)
Date: Wed, 19 May 1993 17:22:26 GMT Raw View

maxtal@physics.su.OZ.AU (John Max Skaller) writes:

>>>In article <C6tswt.68t@world.std.com> tob@world.std.com (Tom O Breton) writes:

>>Also, why would I want to separate the
>>'eval' code from the node type?

> Your job is to write a parser to build a syntax tree,
>mine to generate code or whatever from the tree. You want to
>do my job too? By your method, the classes could not be closed
>until both jobs were done.

See the related article in c.l.c++ having to do with NodeProcessor
objects which act as agents for the Nodes.  The behaviors like 'eval'
or 'postfix' can be supplied by these agent classes.

--
Robert Martin       | Design Consulting   | Training courses offered:
R.C.M. Consulting   | rmartin@rcmcon.com  |   Object Oriented Analysis
2080 Cranbrook Rd.  | Tel: (708) 918-1004 |   Object Oriented Design
Green Oaks IL 60048 | Fax: (708) 918-1023 |   C++

Author: gregw@minotaur.tansu.com.au (Greg Wilkins)
Date: 20 May 1993 00:08:10 GMT Raw View

In article 4312@ucc.su.OZ.AU, maxtal@physics.su.OZ.AU (John Max Skaller) writes:
>In article <1993May19.013840.7910@kodak.kodak.com> cok@acadia.Kodak.COM (David Cok) writes:
>>
>>Given the cross product of a lot of functions and a lot of classes ...
>>
>>if you want to add a new class
>> it is relatively easy using inheritance, since all the stuff for the
>>  new class is together
>> it is relatively hard if one is using variants, since all the variants
>>  must be found and new cases added
>>
>>if you want to add a new function
>> it is relatively easy using variants, since all the code for that
>>  function is in one place
>> it is relatively hard if one is using inheritance, since one must find
>>  all the relevant classes and add a function
>>
>
> Nice!

Yes, this is a nice way of putting the argument for variants. Together with the
"union is to struct as variant is to class" line, I'm starting to get interested.


> And I *do* have a proposal for variants that is quite
>specific and relatively easy to use, understand and implement.
>

How much detail have you put into your proposal?  How do you propose to handle the
following cases?


class MyClassWithoutCastToIntOp;
variant V1 [long, int];
variant V2 [int, char, MyClassWithoutCastToIntOp];
variant V3 [int, char];

void func1()
{
  V1 v1;
  V2 v2;
  ...
  v1 = v2;       // is this legal??, does it produce a run time error if v2 is not
                 // an int?, if v2 is a char, does it promote it to an int?
  v1 = (V1)v2;   // if not, does this help??
  int i=v2;      // Does this produce a run time error for MyClassWithoutCastToIntOp
  i=(V3)v2;      // if the last line is illegal, does this help?
}

Bool operator==(V1& v1, const V2& v2)
{
    // how do I write the body of this operator?
    // nested selects?
    select(v1)
    {
        type(long)
           select(v2)
           {
              type(int) {...}
              type(char) {...}
              type(MyClassWithoutCastToIntOp) {...}
           }
        type(int)
         ...
     }

     // or multi selects?
     select(v1,v2)
     {
         type(int,MyClassWithoutCastToIntOp) {return v1==0;}
         // do I need to list the trivial type() cases or will the compiler fill
         // them in?

         type(... , MyClassWithoutCastToIntOp) {return FALSE;}
         // can I catch dont care types?
     }
}


void func2()
{
     V2 v2;
     cout << v2;   // I can see how this one would work.
     cin >> v2;    // but it would be really nice if you could solve this one
                   // in a general fashion.
}




-gregw

Author: fjh@munta.cs.mu.OZ.AU (Fergus James HENDERSON)
Date: Thu, 20 May 1993 04:57:48 GMT Raw View

rmartin@rcmcon.com (Robert Martin) writes:

>fjh@munta.cs.mu.OZ.AU (Fergus James HENDERSON) writes:
>
>>I think that there are situations where you don't want to
>>have to add a new virtual function for every new operation,
>>you just want to do a type switch. If the code for a single
>>piece of functionality was scattered throughout 45 tiny little
>>virtual functions in 45 different files, maintenance would be
>>hell.
>
>I understand what you are saying here.  I don't quite agree, but I
>understand. Changing the 45 functions in each of the derived classes
>is a pain.  Worse, I must hunt for those derived classes, since the
>compiler won't necessarily tell me where they are.

Well, the compiler *will* tell you where they are if the new function
you add is a *pure* virtual function. Of course this only works if
you were adding to an abstract base class - another good reason for
using abstract base classes :-)

>However, in order for a problem like this to occur, we must be
>changing something fundamental at the base level, so we are "opening"
>a "closed" base.  We are making a very fundamental change to the
>architecture of our application.  And I expect pain to be the result
>of such fundamental changes.
>
>What worries me about variants is that adding a new type, something
>that should be an "open" process, will cause me to alter "closed" code
>(the selects and the variants).   Thus, what ought to be a trivial
>extension becomes a major undertaking.  Even though nothing
>fundamental changed, I must open working code and add cases for new
>variant types.

But who is to say that adding a new *type* should be an "open" process,
or that adding a new *type* should be a trivial extension?
I would say that it would obviously depend on the application whether
or not adding a new type was a trivial extension.
In some applications adding a new *function* should be an "open" process
and a trivial extension, not adding a new type.

Thus, to continue the analogy/parody:

 What worries me about inheritance is that adding a new
 function, something that should be an "open" process, will
 cause me to alter "closed" code (the derived classes and the
 base class). Thus, what ought to be a trivial extension becomes
 a major undertaking.  Even though nothing fundamental changed,
 I must open working code and add code for the new virtual
 functions.  [[ :-) ]]

>>>A worse fear has to do with extending variants.  How do I find all the
>>>select statements when I want to add a new type to a variant?
>
>>That one is easy: just add the new type to your variant declaration,
>>and recompile. The compiler will tell you exactly where all the select
>>statements are and if you are lucky will even position your cursor on
>>the exact line where you need to insert each bit of new code :-)
>
>This presumes of course that a select statement must have a clause for
>each of the types in its variant.  If a "default" or "ignore" clause
>is introduced, then we will not have solved this problem.

It is true that if your type-switch has a default clause, then you would
not get a warning.
If you use the default clause, then it means that the code must
be appropriate for *all* other types. You have explicitly prevented
the compiler from warning you if another type is added. This should
not be done lightly, but it is appropriate in some cases.

For example, suppose you are writing an interpreter for a dynamically
typed language, where variables may be either strings, numbers or lists.
Here's some example code to "add" two variables together, which
would be executed whenever you need to interpret an addition operator.
(This is an example of using variants for multiple dispatch).
If the types don't match, then you might for example raise an
exception.

 variant Mixed [String, int, double, List];
 variant Numeric [int, double];

 Mixed operator+ (Mixed x, Mixed y) {
   select(x, y) {
     type(String xs, String ys)  { return concatenate(xs,ys); }
     type(Numeric xi, Numeric yi) { return xi + yi; }
     type(List xl, List yl)  { return append(xl,yl); }
     type(..., ...)   { raise TypeError; }
   }
 }

Now suppose that you add a new type, say Object.
By default, adding two Objects will be an error.
The compiler won't warn you that you might need to modify
operator+(Mixed,Mixed), because you explicitly used a
default clause (...) in your type selection. The program
will still work fine, but it might not do what you
want it to: maybe you *did* want to define operator+(Object,Object).

You *can* ensure that the compiler warns you in this example by coding
it slightly differently:

 Mixed operator+ (Mixed x, Mixed y) {
   select(x, y) {
     type(String xs, String ys)  { return concatenate(xs,ys); }
     type(Numeric xi, Numeric yi) { return xi + yi; }
     type(List xl, List yl)  { return append(xl,yl); }
     type(variant [String, List, Numeric], ...) { raise TypeError; }
   }
 }

It's true however that if there is a default clause allowed, then
this is no longer a *guarantee* that the compiler will locate
every type selection statement for you when you add a new type to
a variant, by signalling a warning or error at those type selections.
This lack of a guarantee is a legitimate criticism of variants.
The program would still work, but it might not do what you had intended.
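
The same trade-off shows up with std::variant and std::visit in later C++
(C++17), used here only as a stand-in for the proposed multi-select. In
the sketch below the template overload plays the role of the (..., ...)
clause; the names Mixed, Add, add and is_type_error are assumptions made
for illustration, and List is left out to keep it short.

```cpp
#include <stdexcept>
#include <string>
#include <type_traits>
#include <variant>

using Mixed = std::variant<std::string, int, double>;

struct Add {
    // The (String, String) clause.
    Mixed operator()(const std::string& a, const std::string& b) const {
        return a + b;
    }
    // Every other pair lands here, like a (..., ...) clause.
    template <class A, class B>
    Mixed operator()(const A& a, const B& b) const {
        if constexpr (std::is_arithmetic_v<A> && std::is_arithmetic_v<B>) {
            return a + b;  // the (Numeric, Numeric) clause
        } else {
            throw std::invalid_argument("TypeError");
        }
    }
};

// Dispatches on the run-time alternatives of both arguments.
Mixed add(const Mixed& x, const Mixed& y) { return std::visit(Add{}, x, y); }

// Helper for trying out mismatched pairs.
bool is_type_error(const Mixed& x, const Mixed& y) {
    try {
        add(x, y);
    } catch (const std::invalid_argument&) {
        return true;
    }
    return false;
}
```

Because the template accepts any pair, adding a fourth alternative to
Mixed compiles silently and adding two values of the new type throws at
run time. Writing out the mismatched pairs as explicit overloads instead
makes std::visit fail to compile until the new alternative is handled.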

>More fundamental, however, is the problem of how you find the variant
>declarations.  Having added a new type to the system, which variants
>does it belong in?  And where are those variant declarations located?

Well, presumably you added the new type to the system for a purpose,
to increase the functionality of some part of the system. So you
add the new type to the appropriate variant declaration in that
part of the system. If that implies that you need to add the type
to some other variant declaration, then the compiler will give you
an error message.

Example:
 Variant V1 [X, Y, Z];
 Variant V2 [X, Y, Z, A, B, C];
Suppose you add X2 to V1.
Then if somewhere there is a call
 void f(V2);
 V1 v1;
 f(v1);
then the compiler will give you an error message about there being no
function f(X2). So you will know that you have to add X2 to V2.

>Aside:  I am playing devil's advocate, not deriding what may be a
>valid proposal.  I am expressing my true concerns, but am open to
>seeing those concerns addressed.

The more devil's advocates, the merrier! :-)
That's exactly why we're discussing these sort of things on comp.std.c++.

--
Fergus Henderson                     This .signature virus might be
fjh@munta.cs.mu.OZ.AU                getting old, but you still can't
                                     consistently believe it unless you
Linux: Choice of a GNU Generation    copy it to your own .signature file!

Author: fjh@munta.cs.mu.OZ.AU (Fergus James HENDERSON)
Date: Thu, 20 May 1993 08:45:20 GMT Raw View

gregw@minotaur.tansu.com.au (Greg Wilkins) writes:
>maxtal@physics.su.OZ.AU (John Max Skaller) writes:
>> Nice!
>Yes, this is a nice way of putting the argument for variants. Together
>with the "union is to struct as variant is to class" line, I'm starting
>to get interested.

Actually, I think it sounds better if you say "variant is to union what
class is to struct" :-)

>> And I *do* have a proposal for variants that is quite
>>specific and relatively easy to use, understand and implement.
>
>How much detail have you put into your proposal?  How do you propose to
>handle the following cases?

Well, the paper I have read was roughly 30 pages if I remember rightly.
Let's see if I get all these right.

>class MyClassWithoutCastToIntOp;
>variant V1 [long, int];
>variant V2 [int, char, MyClassWithoutCastToIntOp];
>variant V3 [int, char];

[I moved the trickiest example (the one with assignment) to the end.
 Let's deal with the easy cases first :-) ]

>Bool operator==(V1& v1, const V2& v2)
>{
>    // how do I write the body of this operator?
>    // nested selects?
>    select(v1)
>    {
>        type(long)
>           select(v2)
>           {
>              type(int) {...}
>              type(char) {...}
>              type(MyClassWithoutCastToIntOp) {...}
>           }
>        type(int)
>         ...
>     }

Yes, that would work. You would need to write
       type(int i) { ... i ... }
instead of just
       type(int) { ... }
if you want to actually examine the value.
If you want to modify the value or avoid a copy, you need to use
a reference:
       type(int &i)

>     // or multi selects?

Yes, multi-selects are in the proposal.

>     select(v1,v2)
>     {
>         type(int,MyClassWithoutCastToIntOp) {return v1==0;}
>         // do I need to list the trivial type() cases or will the compiler
>         // fill them in?

I think you need to list them.

>         type(... , MyClassWithoutCastToIntOp) {return FALSE;}
>         // can I catch dont care types?
>     }
>}

Yes, you can catch don't care types using "...".

I'm not sure whether the proposal says anything explicit about using "..."
in *multi-selects*, but it would make sense, so it should be made explicit
if it's not already so.

>void func2()
>{
>     V2 v2;
>     cout << v2;   // I can see how this one would work.
>     cin >> v2;    // but it would be really nice if you could solve this one
>                   // in a general fashion.
>}

Both statements work fine. Any statement containing a variant expands to
an implicit select statement over that variant. So func2() is equivalent to:

 void func2() {
      V2 v2;
      select (v2) {
        type (int &v2_i) { cout << v2_i; }
        type (char &v2_c) { cout << v2_c; }
        type (MyClassWithoutCastToIntOp &v2_m) { cout << v2_m; }
      };
      select (v2) {
        type (int &v2_i) { cin >> v2_i; }
        type (char &v2_c) { cin >> v2_c; }
        type (MyClassWithoutCastToIntOp &v2_m) { cin >> v2_m; }
      };
 }
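
For anyone who wants to try the idea with a present-day compiler, the
implicit expansion above has a rough analogue in C++17's std::variant and
std::visit. This is only a sketch under that assumption: std::variant is a
library type, not the language feature proposed in this thread, and the
names Opaque, V2 and show are invented for illustration.

```cpp
#include <sstream>
#include <string>
#include <variant>

// Hypothetical stand-in for MyClassWithoutCastToIntOp; any streamable class works.
struct Opaque { std::string name; };
std::ostream& operator<<(std::ostream& os, const Opaque& o) { return os << o.name; }

// Rough analogue of: variant V2 [int, char, MyClassWithoutCastToIntOp];
using V2 = std::variant<int, char, Opaque>;

// Analogue of "cout << v2". The generic lambda is instantiated once per
// alternative, so the call fails to compile if any alternative lacks operator<<.
std::string show(const V2& v) {
    std::ostringstream out;
    std::visit([&out](const auto& value) { out << value; }, v);
    return out.str();
}
```

As in the expansion above, every alternative is checked at compile time;
only the choice of which branch runs is made at run time.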

OK, now for the tricky example: assignment.

>variant V1 [long, int];
>variant V2 [int, char, MyClassWithoutCastToIntOp];
>variant V3 [int, char];
>
>void func1()
>{
>  V1 v1;
>  V2 v2;
>  ...
>  v1 = v2;       // is this legal??, does it produce a run time error if v2
>                 // is not an int?, if v2 is a char, does it promote it to
>    // an int?

This is not legal.
It gets expanded to a typeswitch
 select(v1,v2) {
    type(long &v1r, int &v2r) { v1r = v2r; }
    type(long &v1r, char &v2r) { v1r = v2r; }
    type(long &v1r, MyClass &v2r) { v1r = v2r; }  // illegal
    type(int &v1r, int &v2r) { v1r = v2r; }
    type(int &v1r, char &v2r) { v1r = v2r; }
    type(int &v1r, MyClass &v2r) { v1r = v2r; }  // illegal
 }
and is illegal since not all of the cases are legal.
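
The "all cases must be legal" rule can be approximated in today's C++ with
a compile-time check over every (target, source) pair. The sketch below
assumes C++17 and uses std::variant only as a list of types; MyClass and
the helper names assignable_from_all and all_cases_assignable are
hypothetical, not part of the proposal.

```cpp
#include <type_traits>
#include <variant>

struct MyClass {};  // hypothetical: has no conversion to int or long

using V1 = std::variant<long, int>;
using V2 = std::variant<int, char, MyClass>;
using V3 = std::variant<int, char>;

// True when "T = S" is well formed for every source type S.
template <class T, class... Ss>
constexpr bool assignable_from_all = (std::is_assignable_v<T&, const Ss&> && ...);

// Primary template: the arguments are not two variants.
template <class Targets, class Sources>
constexpr bool all_cases_assignable = false;

// True only when every target type accepts every source type.
template <class... Ts, class... Ss>
constexpr bool all_cases_assignable<std::variant<Ts...>, std::variant<Ss...>> =
    (assignable_from_all<Ts, Ss...> && ...);

constexpr bool v1_from_v2 = all_cases_assignable<V1, V2>;  // has MyClass cases
constexpr bool v1_from_v3 = all_cases_assignable<V1, V3>;  // numeric cases only

static_assert(!v1_from_v2, "the MyClass cases make v1 = v2 illegal");
static_assert(v1_from_v3, "every case of v1 = v3 is an ordinary conversion");
```

The check is purely static, which matches the point made above: the
rejection happens at compile time, whatever v2 holds at run time.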

>  v1 = (V1)v2;   // if not, does this help??

No. Variants are sets of types, not types, so you can't cast to a variant.

>  int i=v2;      // Does this produce a run time error for
>    // MyClassWithoutCastToIntOp

No, it produces a compile-time error.

>  i=(V3)v2;      // if the last line is illegal, does this help?

No, you can't cast to a variant.

Essentially, variants don't support assignment. A variant's type tag
is set when it is initialized and it can't be modified afterwards.
Variants are a bit like references in this respect.

BUT you can use a variant to *implement* a discriminated union class
that *does* support assignment :-)
More on this later...

--
Fergus Henderson                     This .signature virus might be
fjh@munta.cs.mu.OZ.AU                getting old, but you still can't
                                     consistently believe it unless you
Linux: Choice of a GNU Generation    copy it to your own .signature file!

Author: rmartin@rcmcon.com (Robert Martin)
Date: Thu, 20 May 1993 16:00:11 GMT

cok@acadia.Kodak.COM (David Cok) writes:
>Given the cross product of a lot of functions and a lot of classes ...

>if you want to add a new class
> it is relatively easy using inheritance, since all the stuff for the
>  new class is together
> it is relatively hard if one is using variants, since all the variants
>  must be found and new cases added

>if you want to add a new function
> it is relatively easy using variants, since all the code for that
>  function is in one place
> it is relatively hard if one is using inheritance, since one must find
>  all the relevant classes and add a function

This is an excellent summary of the issue.  This post, and the many
other patient postings by net members have finally led me to a partial
understanding.  I think I understand now what is motivating John et
al. to propose variants.

My question now becomes: "Can variants be implemented using standard
C++ constructs?"  That is, is it true that the following two constructs
are isomorphic, and is it true that they can both be extended without
limit?

        variant v [A,B];        | class ABVariant
        void f(v)               | {
        {                       |    public:
           select(v)            |      virtual void f() = 0;
           {                    | };
             type (A) {DoA(v);} | class ABVariant_A : public ABVariant
             type (B) {DoB(v);} | {
           }                    |    public:
        }                       |      ABVariant_A(A& a) : itsA(a) {}
                                |      virtual void f() {DoA(itsA);}
                                |    private:
                                |      A& itsA;
                                | };
                                | class ABVariant_B : public ABVariant
                                | {
                                |   public:
                                |     ABVariant_B(B& b) : itsB(b) {}
                                |     virtual void f() {DoB(itsB);}
                                |   private:
                                |     B& itsB;
                                | };


--
Robert Martin       | Design Consulting   | Training courses offered:
R.C.M. Consulting   | rmartin@rcmcon.com  |   Object Oriented Analysis
2080 Cranbrook Rd.  | Tel: (708) 918-1004 |   Object Oriented Design
Green Oaks IL 60048 | Fax: (708) 918-1023 |   C++

Author: maxtal@physics.su.OZ.AU (John Max Skaller)
Date: Thu, 20 May 1993 19:59:06 GMT

In article <1tei5a$lg9@picasso.cssc-syd.tansu.com.au> gregw@minotaur.tansu.com.au writes:
>
>> And I *do* have a proposal for variants that is quite
>>specific and relatively easy to use, understand and implement.
>>
>
>How much detail have you put into your proposal?

 Probably too much: the fact is the proposal is quite
simple, it's the consequences that are far reaching.

>How do you propose to handle the
>following cases?
>
>
>class MyClassWithoutCastToIntOp;
>variant V1 [long, int];
>variant V2 [int, char, MyClassWithoutCastToIntOp];
>variant V3 [int, char];
>
>void func1()
>{
>  V1 v1;
>  V2 v2;

 Note: technically illegal: variants must be initialised.
 Like references. (Because, in some sense, they are)

>  ...
>  v1 = v2;       // is this legal??, does it produce a run time error if v2 is not

 No, it's illegal normally because there is no conversion from
MyClassWithoutCastToIntOp (Whew) to either int or long.

 With variants, *all* cases must be legal: it's statically checked.

>                 // an int?, if v2 is a char, does it promote it to an int?

 A char promotes to an int.

 If v2 is a char and v1 is an int, the assignment of *that* case
is OK.

>  v1 = (V1)v2;   // if not, does this help??

 No. The class with the long name won't convert.

>  int i=v2;      // Does this produce a run time error for MyClassWithoutCastToIntOp

 No. There are no run-time errors, it's a compile time error.
 You must handle all cases somehow.

>  i=(V3)v2;      // if the last line is illegal, does this help?

 No. Do you want to fix it?

 select (v2)
 {
  type(V1 v) { v1 = v; }
  type(...) { cout<<"Can't convert class with long name"; }
 }

>}
>
>Bool operator==(V1& v1, const V2& v2)
>{
>    // how do I write the body of this operator?
>    // nested selects?

 If you like.
>
>     // or multi selects?

 If you like.

>     select(v1,v2)
>     {
>         type(int,MyClassWithoutCastToIntOp) {return v1==0;}
>         // do I need to list the trivial type()
> // cases or will the compiler fill
>         // them in?

 you have to *cover* all cases. But you can use argument
matching to do that: see below.
>
>         type(... , MyClassWithoutCastToIntOp) {return FALSE;}
>         // can I catch dont care types?

 Yes, but only in the first argument position.

 The type clause is a function.

 No, I don't mean it looks like a function: it *is* a function.
(although, it must be considered a *nested* function :-)

 The compiler considers all the cases, one by one,
mechanically listing them to itself, and decides which 'function'
to call (that is, which select).

 It has to get an unambiguous answer in all cases,
or the construction is illegal.

 So there is nothing new here really :-)

>     }
>}

 you can say:

 select(v1, v2)
 {
  type(float a1, float a2) { ... }
  type(...) { .. }
 }

if you want: the 'float, float' catches all the numeric cases.
>
>
>void func2()
>{
>     V2 v2;
>     cout << v2;   // I can see how this one would work.
>     cin >> v2;    // but it would be really nice if you could solve this one
>                   // in a general fashion.
>}

 No problem. It works the same:

 select (v2)
 {
  type(int& i) { cin >> i; }
  type(char& c) { cin >> c; }
  type(LongName& l) { cin >> l; }
 }

Does this allow persistent store? No. No magic. You can't
change the type of a variant. It's forever (or at least until
it is destroyed), just like a reference can't change to refer to another
object.

--
        JOHN (MAX) SKALLER,         INTERNET:maxtal@suphys.physics.su.oz.au
 Maxtal Pty Ltd,      CSERVE:10236.1703
        6 MacKay St ASHFIELD,     Mem: SA IT/9/22,SC22/WG21
        NSW 2131, AUSTRALIA

Author: maxtal@physics.su.OZ.AU (John Max Skaller)
Date: Sun, 9 May 1993 20:33:00 GMT

 In a recent post I said we need some sort of discriminated
union for managing heterogenous aggregates. In the past I have
posted notes on 'variants' which provide such a mechanism and
have some lovely properties.

 Variants, however, suffered from a problem: it is not
possible to do a type changing assignment to a variant.

 A variant is not a type: it can be turned into a type
by encapsulating it in a class (forming a discriminated union type).
Recently, Fergus Henderson showed how this could be used to
implement assignments, solving the one major problem of variants.

 So I want to review the notion of variants, in preparation
for writing a proposal. I'll try to cover the easy bit first:
the easy bit is the semantics. The hard bit is the convincing
argument that variants are essential.

 A variant is not a type. However, you can write:

 variant V [int, long, X, Y];
 V v=2L; // initialises a 'long'

 select(v)
 {
  type(int i) {  cout << i; }
  type(long l) { cout<<l; }
  type(X x) { cout<<x; }
  type(Y y) { cout <<y; }
 }

The 'select' statement and the rules for initialising variants
are the only semantic rules that are really needed for variants.

The select statement is equivalent to a set of anonymous overloaded
function calls, as is initialisation.

My current rules for 'select' require no ambiguity and completeness.

The above select shows a regular pattern in type clauses. It is
permitted to write the abbreviated notation:

 cout<<v;

for this select: this is called automatic expansion.

It can be shown that a pointer to a variant is a variant pointer,
and we say "variants commute with pointers". This means that

 variant [X,Y] *z=something;
 variant [X*, Y*] z=something;

are equivalent. Taking the address of a variant object yields
a variant pointer, dereferencing a variant pointer yields
a variant reference.

It makes sense to have variant functions:

 void f(V v) { cout<<v; }

and these are equivalent to a family of functions:

 void f(int i) { cout<<i; }
 void f(long l) { cout <<l; }
 ...

or to a template function

 template<class V> void f(V v) {cout <<v; }

which is constrained to the variant components (the types long, int, X, Y)
It is also equivalent to an actual constrained generic function

 void f(V v) {
  select(v)
  {
   type(int i) {cout << i; }
   ...
  }
 }

and thus we can say that "variants commute with templates",
or perhaps "variants commute with overloading".
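
The "template constrained to the variant components" reading can be
sketched in C++17 with an enable_if constraint. This is an illustration
under assumptions, not the proposed feature: X here is a made-up streamable
class, is_one_of is a hypothetical helper, and f returns a string rather
than writing to cout so that the result can be inspected.

```cpp
#include <sstream>
#include <string>
#include <type_traits>

// Hypothetical component type with its own output operator.
struct X { int id; };
std::ostream& operator<<(std::ostream& os, const X& x) { return os << "X#" << x.id; }

// True when T is exactly one of the listed component types.
template <class T, class... Components>
constexpr bool is_one_of = (std::is_same_v<T, Components> || ...);

// A template function constrained to the components int, long and X.
// Calling f(1.5) does not compile, because double is not a component.
template <class T, std::enable_if_t<is_one_of<T, int, long, X>, int> = 0>
std::string f(T v) {
    std::ostringstream out;
    out << v;
    return out.str();
}
```

Each call selects one member of the family at compile time; what the
variant function adds is choosing among the same bodies by a run-time tag.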

Well, finally, "variants commute with inheritance". You can derive
from a variant: the result is a variant.

You can view 'variants' as a combination of compiler iteration
of alternatives. In the case of declarations, multiple
'ordinary' C++ declarations are generated, whereas
for statements, multiple code sections are generated with
a run time switch based on the type tag.

You can picture a variant object this way: it is like a
union of the types with an associated tag.

Now some techniques. The most important technique is converting
a variant into a type:

 struct dunion {
  V v;
  dunion(V x) : v(x) {}
 };

Now we have a discriminated union. A pointer to a dunion is
an ordinary pointer.

Implementation. It's done the obvious way: the variant object
is a union with an enumerated type tag. The compiler
generates code for each type and uses an indexing technique
to select the appropriate code, like:

 // cout << v
 goto jumptable[v.tag];
 jumptable[]={lint,llong,lX,lY};
 lint:
  cout<< v.int;
  goto endoff
 llong:
  cout <<v.long;
  goto endoff
 lX: ...
  ...
 endoff:

So the run-time support is minimal (an indexed jump).
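
The same picture can be written out by hand in ordinary C++. The sketch
below is an assumption about what such generated code might resemble, not a
description of any actual compiler: a union with a tag, one code section per
component type, and a table indexed by the tag. The names IntOrLong,
print_int, print_long and print are invented, and the sections are
functions rather than labels because standard C++ has no computed goto.

```cpp
#include <sstream>
#include <string>

// Hand-written picture of a variant over [int, long]: a union plus a tag.
struct IntOrLong {
    enum Tag { kInt, kLong } tag;
    union { int i; long l; };
    explicit IntOrLong(int x) : tag(kInt), i(x) {}
    explicit IntOrLong(long x) : tag(kLong), l(x) {}
};

// One code section per component type.
void print_int(const IntOrLong& v, std::ostream& out) { out << "int " << v.i; }
void print_long(const IntOrLong& v, std::ostream& out) { out << "long " << v.l; }

// The tag indexes a table of code sections; the table order must match Tag.
std::string print(const IntOrLong& v) {
    using Section = void (*)(const IntOrLong&, std::ostream&);
    static const Section table[] = { print_int, print_long };
    std::ostringstream out;
    table[v.tag](v, out);
    return out.str();
}
```

The only run-time work beyond the chosen section is the table lookup, which
is the "indexed jump" referred to above.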

Comments? Questions?

--
        JOHN (MAX) SKALLER,         INTERNET:maxtal@suphys.physics.su.oz.au
 Maxtal Pty Ltd,      CSERVE:10236.1703
        6 MacKay St ASHFIELD,     Mem: SA IT/9/22,SC22/WG21
        NSW 2131, AUSTRALIA

Author: maxtal@physics.su.OZ.AU (John Max Skaller)
Date: Sun, 9 May 1993 20:58:56 GMT

Here is some more on variants.

The 'select' statement is not limited to one
argument: like any other function call you can write

 variant V [int,long];
 V a=something, b=something;
 select(a,b)
 {
  type(int x, int y) { .. }
  type(int x, long y) { ..}
  type(long x, long y) { .. }
 }

Notice that the third type clause catches both signatures

 (long,int) and (long,long)

This illustrates another point about the select statement:
it does not have to provide exact matches on the types.

Variants commute with variants too: the declarations of W below
are equivalent:

 variant V [int, long];
 variant W [V, float];
 variant W [int,float,long];
 variant W [[int, long], float];

which shows again that variants are not types (they are like
a 'collection of types')

A variant object must be initialised. The type of
the initialiser is used to determine what type the variant
object represents: again, overload resolution is used
by the compiler to determine a unique type from the
variant component types.

The object so constructed is an ordinary object of size
of the actual type being initialised. However, the compiler
must keep track of the type, and may need to generate
a tag associated with the variant object.

It is implementation defined where that tag is stored.
If you take the address of a variant, you get the address
of the actual object, not the address of the tag field.

When a variant is initialised by another, the compiler needs
to reserve enough store for the largest component type.

--
        JOHN (MAX) SKALLER,         INTERNET:maxtal@suphys.physics.su.oz.au
 Maxtal Pty Ltd,      CSERVE:10236.1703
        6 MacKay St ASHFIELD,     Mem: SA IT/9/22,SC22/WG21
        NSW 2131, AUSTRALIA

Author: fjh@munta.cs.mu.OZ.AU (Fergus James HENDERSON)
Date: Mon, 10 May 1993 08:46:03 GMT

maxtal@physics.su.OZ.AU (John Max Skaller) writes:

> In a recent post I said we need some sort of discriminated
>union for managing heterogenous aggregates. In the past I have
>posted notes on 'variants' which provide such a mechanism and
>have some lovely properties.
[...]
> So I want to review the notion of variants, in preparation
>for writing a proposal. I'll try to cover the easy bit first:
>the easy bit is the semantics. The hard bit is the convincing
>argument that variants are essential.

When I first read John Skaller's paper on variants last year,
I was basically quite skeptical. It's all very well to say that
there are these things called variants and they have all these
wonderful properties, but what use are they? Isn't C++ complicated
enough already?  Then a couple of weeks ago I started writing some real
C++ code for a change (as opposed to playing language lawyer on
comp.std.c++ :-). Now that I have experienced the problem first-hand,
I'm convinced that it addresses a genuine need.

The program I'm writing is an interpreter for a dynamically typed
language. It's not really important what the language was, the same
issues would arise whether it was LISP, AWK, or LPC. The key problem was
that the language allowed variables to be used without specifying their
type. Thus the interpreter needs to represent the values of those
variables as some sort of union of all the different possible types.
For simplicity I'll assume that there are only two types used, Number
and String.

Initially I started writing it in C, but I soon got fed up with that
and translated what I had written into C++.
Here's how it looked immediately after I had translated from C:

 class String { ... }; // basic string class
 class Number { ... }; // integer class that checks for overflow

 class Mixed {
  enum { UNDEF, NUMBER, STRING } type;
  union { Number n; String s; } value;
 public:
  ...
 };

Now in C, having a union of structured types is fine.
But in C++, I got a compile error!
Why? Because of course I had defined constructors and destructors
for the String and Number classes. But in C++, you can't have a union
of classes with constructors or destructors.

At present, C++ simply does not provide a good way of writing this.

I could use a union of pointers, but this forces me to use dynamic
allocation, which would be prohibitively costly. Remember that
'Mixed' is used to represent every data value in the program I
am executing. The efficiency of Mixed directly affects the efficiency
of the interpreter. (And in fact it's a parallel language, and I
want to place the variables in shared memory, so I would have
to use handles instead of pointers anyway - and my Handle class
has a constructor).

I won't describe the ugly hack I used to solve the problem, except
to say that it's an ugly hack [and that it doesn't even work due
to a bug in my compiler :-(, although hopefully the bug will be fixed
in the next release].

I *will* say that I think that some facility for discriminated
unions is a fundamental feature that is missing from C++,
and until it is fixed people will continue to write ugly,
difficult-to-maintain, and sometimes even non-portable hacks
to get around the problem.
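
For comparison, later revisions of the language (C++11 onwards) relaxed the
restriction described above: a union may now have members with constructors
and destructors, provided the enclosing class manages their lifetimes by
hand. The sketch below assumes such a compiler. Number is a stand-in for
the overflow-checking class, std::string stands in for String, and the
accessor names are invented.

```cpp
#include <new>
#include <string>
#include <utility>

struct Number { long value; };  // stand-in for the overflow-checking class

class Mixed {
public:
    enum Kind { NUMBER, STRING };

    Mixed(Number n) : kind_(NUMBER) { new (&n_) Number(n); }
    Mixed(std::string s) : kind_(STRING) { new (&s_) std::string(std::move(s)); }
    Mixed(const Mixed& other) : kind_(other.kind_) { copy_from(other); }
    // Type-changing assignment: destroy the old member, then build the new one.
    // Not exception-safe; a real class would need more care here.
    Mixed& operator=(const Mixed& other) {
        if (this != &other) { destroy(); kind_ = other.kind_; copy_from(other); }
        return *this;
    }
    ~Mixed() { destroy(); }

    Kind kind() const { return kind_; }
    const Number& number() const { return n_; }    // caller must check kind()
    const std::string& text() const { return s_; } // caller must check kind()

private:
    void copy_from(const Mixed& o) {
        if (kind_ == NUMBER) new (&n_) Number(o.n_);
        else new (&s_) std::string(o.s_);
    }
    void destroy() {
        using String = std::string;
        if (kind_ == STRING) s_.~String();  // Number needs no cleanup
    }

    Kind kind_;
    union { Number n_; std::string s_; };  // accepted since C++11
};
```

Note how much bookkeeping is still left to the programmer, and that the
accessors rely on the caller checking kind() first, which is the safety
hole a compiler-supported facility is meant to close.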

--
Fergus Henderson                     This .signature virus might be
fjh@munta.cs.mu.OZ.AU                getting old, but you still can't
                                     consistently believe it unless you
Linux: Choice of a GNU Generation    copy it to your own .signature file!

Author: maxtal@physics.su.OZ.AU (John Max Skaller)
Date: Mon, 10 May 1993 18:34:48 GMT

In article <9313018.6437@mulga.cs.mu.OZ.AU> fjh@munta.cs.mu.OZ.AU (Fergus James HENDERSON) writes:
>maxtal@physics.su.OZ.AU (John Max Skaller) writes:
>
>> In a recent post I said we need some sort of discriminated
>>union for managing heterogenous aggregates. In the past I have
>>posted notes on 'variants' which provide such a mechanism and
>>have some lovely properties.
>
>When I first read John Skaller's paper on variants last year,
>I was basically quite skeptical.

 I, John Skaller, remain skeptical, which is a good way to
be. I need the support of the readers of this newsgroup to
analyse and refine the proposal both technically and in terms
of perceiving its desirability.

>It's all very well to say that
>there are these things called variants and they have all these
>wonderful properties, but what use are they? Isn't C++ complicated
>enough already?  Then a couple of weeks ago I started writing some real
>C++ code for a change (as opposed to playing language lawyer on
>comp.std.c++ :-). Now that I have experienced the problem first-hand,
>I'm convinced that it addresses a genuine need.
>
>Initially I started writing it in C, but I soon got fed up with that
>and translated what I had written into C++.
>Here's how it looked immediately after I had translated from C:
>
> class String { ... }; // basic string class
> class Number { ... }; // integer class that checks for overflow
>
> class Mixed {
>  enum { UNDEF, NUMBER, STRING } type;
>  union { Number n; String s; } value;
> public:
>  ...
> };

 Note this is the idiom that I have been advocating as an
alternative to downcasting on comp.lang.c++.
>
>Now in C, having a union of structured types is fine.
>But in C++, I got a compile error!
>Why? Because of course I had defined constructors and destructors
>for the String and Number classes. But in C++, you can't have a union
>of classes with constructors or destructors.

 Which makes me very embarrassed and makes the search for a
compiler supported facility all the more important.

>
>At present, C++ simply does not provide a good way of writing this.
>
>I could use a union of pointers, but this forces me to use dynamic
>allocation, which would be prohibitively costly.

 Even if pointers provided a workable solution, you have to
hand code the 'discriminated union idiom', which is quite
complex and long winded, for every such union.

 Worse, it is hard to make the idiom safe to use: one
ends up relying on remembering to use it correctly.

 I think:

 variant Mixed [String, Number];

is much simpler, don't you?

>I *will* say that I think that some facility for discriminated
>unions is a fundamental feature that is missing from C++,
>and until it is fixed people will continue to write ugly,
>difficult-to-maintain, and sometimes even non-portable hacks
>to get around the problem.

 There is a worse problem, IMHO: X3J16/WG21 has mandated
a new feature that enables a simpler (but unsound) idiom to be
used, namely checked downcasting via RTTI.
It fails to work for all classes, and is invasive in that it
requires derivation from a common base with a virtual function,
and it is unconstrained.

 But the idiom is still so much easier to use than the
discriminated union idiom --- despite attempts to make RTTI
unpalatable for general use --- that I believe it will be used
for general programming of heterogenous containers *unless*
an alternative is supplied.

 The language has extended the notion of 'struct' to 'class'.
It's time to extend the notion of 'union' to 'variant'.

 Structs and unions are both fundamental. A union
is not just a way of saving storage. Structs and unions
represent type aggregation and type unification,
that is, they are the 'AND' and 'OR' operators of type logic.

 We need them both.

--
        JOHN (MAX) SKALLER,         INTERNET:maxtal@suphys.physics.su.oz.au
 Maxtal Pty Ltd,      CSERVE:10236.1703
        6 MacKay St ASHFIELD,     Mem: SA IT/9/22,SC22/WG21
        NSW 2131, AUSTRALIA

Author: tob@world.std.com (Tom O Breton)
Date: Mon, 10 May 1993 19:31:40 GMT

John (MAX):

I am wondering where exactly the 'select()' statement would be useful.

Essentially what it does is extract the type-information, bringing it
from the inside to the outside. Why not use some sort of selected member
function? Thereby keeping the type-information inside.

If the type-information needs to be used, why would the programmer use a
union (or variant) to hide it? I may be wrong here; I'm not sure that
it's a genuine contradiction, but I can't think of any good examples of
it making sense.

        Tom

--
The Tom spreads its huge, scaly wings and soars into the sky...
(tob@world.std.com, TomBreton@delphi.com)

Author: maxtal@physics.su.OZ.AU (John Max Skaller)
Date: Tue, 11 May 1993 10:13:13 GMT

In article <C6tswt.68t@world.std.com> tob@world.std.com (Tom O Breton) writes:
>John (MAX):
>
>I am wondering where exactly the 'select()' statement would be useful.

 class bplus;
 class uminus;
 variant Node [bplus, uminus, int];
 struct uminus { Node* arg; };
 struct bplus { Node *left, *right; };

 int eval(Node* n){
  select(n) {
   type(int *x) { return *x; }
   type(uminus *u) { return - eval(u->arg); }
   type(bplus *p) {
    return eval(p->left) + eval(p->right);
   }
  }
 }

>
>Essentially what it does is extract the type-information, bringing it
>from the inside to the outside. Why not use some sort of selected member
>function? Thereby keeping the type-information inside.

 Because the operations on the variant are distributed
all over the place. The above example shows evaluation of
a simple parse tree. Now I want to print it in reverse
polish:

 void rp(Node *n) { // I hope i get this right :-)
  select(n) {
   type(int *x) { cout<<*x; }
   type(uminus *u) { rp(u->arg); cout<<" neg";}
   type(bplus *p) {
    rp(p->left);
    rp(p->right);
    cout<<" add";
   }
  }
 }

I don't *want* to encapsulate the functions because I don't know what
they are: I want to encapsulate the relationships of a parse
tree, which I already have done by the class/variant definitions.
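
The two traversals above can be approximated with C++17's std::variant and
std::visit. This is a sketch under that assumption, not the proposed
syntax. The Overloaded helper, the capitalised struct names and the use of
std::shared_ptr for the recursive links are all choices made for this
illustration, and rp returns a space-separated string instead of printing.

```cpp
#include <memory>
#include <string>
#include <variant>

struct UMinus;
struct BPlus;
// Analogue of: variant Node [bplus, uminus, int];
using Node = std::variant<int, std::shared_ptr<UMinus>, std::shared_ptr<BPlus>>;
struct UMinus { Node arg; };
struct BPlus { Node left, right; };

// Combines several lambdas into one overload set, one per "type clause".
template <class... Fs> struct Overloaded : Fs... { using Fs::operator()...; };
template <class... Fs> Overloaded(Fs...) -> Overloaded<Fs...>;

// Evaluate the tree; omitting a clause is a compile-time error.
int eval(const Node& n) {
    return std::visit(Overloaded{
        [](int x) { return x; },
        [](const std::shared_ptr<UMinus>& u) { return -eval(u->arg); },
        [](const std::shared_ptr<BPlus>& p) { return eval(p->left) + eval(p->right); }
    }, n);
}

// Render the tree in reverse Polish notation.
std::string rp(const Node& n) {
    return std::visit(Overloaded{
        [](int x) { return std::to_string(x); },
        [](const std::shared_ptr<UMinus>& u) { return rp(u->arg) + " neg"; },
        [](const std::shared_ptr<BPlus>& p) {
            return rp(p->left) + " " + rp(p->right) + " add";
        }
    }, n);
}
```

Both functions live outside the node types, so a third traversal can be
added without touching UMinus or BPlus, which is the point being argued.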
>
>If the type-information needs to be used, why would the programmer use a
>union (or variant) to hide it?

 The reason that the type information cannot be accessed
independently of the component of that type is to provide
absolute type safety.

>I may be wrong here; I'm not sure that
>it's a genuine contradiction, but I can't think of any good examples of
>it making sense.

 I have trouble with using casts, checked or not, for the
opposite reason:

 X* x=dynamic_cast<X*>(y);
 x->function();

is unsafe, because the cast may happen to give back a 0 in this
case. The select statement *integrates* access to the
run-time type with the use of the type so that you can't
access the wrong type: any attempt to do so results in
a compile time error.

--
        JOHN (MAX) SKALLER,         INTERNET:maxtal@suphys.physics.su.oz.au
 Maxtal Pty Ltd,      CSERVE:10236.1703
        6 MacKay St ASHFIELD,     Mem: SA IT/9/22,SC22/WG21
        NSW 2131, AUSTRALIA

Author: fjh@munta.cs.mu.OZ.AU (Fergus James HENDERSON)
Date: Wed, 12 May 1993 02:26:03 GMT

For more on C++'s lack of support for discriminated unions, see
the thread "should I use a switch statement?" in comp.lang.c++.

--
Fergus Henderson                     This .signature virus might be
fjh@munta.cs.mu.OZ.AU                getting old, but you still can't
                                     consistently believe it unless you
Linux: Choice of a GNU Generation    copy it to your own .signature file!

Author: maxtal@extro.ucc.su.OZ.AU (John MAX Skaller)
Date: Sun, 3 Jan 1993 04:47:02 GMT

Variants, Part II.
------------------

The 'type' clauses of a select statement specify an implicit
variant:

 variant Num [float, complex, int] x;

 select(x)
 {
  type(double d) { ... }
  type(long l)   { ... }
  type(complex c) { ...}
 }

In this case the implicit variant is

 variant Implicit [double,long,complex];

This clearly indicates how variants interconvert,

 variant X [X1, X2] x;
 variant Y [Y1, Y2] y;

 Y y=x;

The conversion is well defined if each of X1 and X2 unambiguously
converts to either Y1 or Y2, where 'unambiguously converts' is
used in the same sense as the normal overload matching rules.

The select statement can have multiple arguments:

 variant V [V1, V2] v;
 variant W [W1, W2] w;

 select(v,w)
 {
  type(V1 *v1, W1 *w1) { ... }
  type(V2 *v2, W1 *w1) { ... }
  type(V1 *v1, W2 *w2) { ... }
  type(V2 *v2, W2 *w2) { ... }
 }

The select statement can have a 'default' (none-of-the-above)

 select(v)
 {
  type(V1 v1) { ... }
  type(...) { cout << "Type Not Handled"; }
 }

Where variants share a common base, it would be more usual
to do:

 class B;
 class D1 : B;
 class D2 : B;

 variant D [B,D1,D2] *d;

 select(d)
 {
  type(D1 *d1) { ...}  // special case
  type(B *b)   { ... } // default to base class
 }

Where dynamic downcasting is required, it can be constrained
to a one off function:

 D* downcast(B* b)
 {
  D *d;
  if (d=dynamic_cast<D1*>(b));
  else if (d=dynamic_cast<D2*>(b));
  else d=b;
  return d;
 }

allowing the downcast to be performed exactly once in one lexically
enclosed place, and thereafter guaranteeing secure access to
the required exact types.

(Note: this example requires variant assignment)
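
A present-day approximation of the one-off downcast function, assuming
C++17 and std::variant rather than the proposed feature. Here D is a
library variant of pointers, B is given a virtual destructor so that
dynamic_cast is permitted, and all names are illustrative.

```cpp
#include <variant>

struct B { virtual ~B() = default; };  // polymorphic, so dynamic_cast is allowed
struct D1 : B {};
struct D2 : B {};

// Analogue of: variant D [B,D1,D2] *d;
using D = std::variant<B*, D1*, D2*>;

// The only place in the program that performs a dynamic downcast.
D downcast(B* b) {
    if (auto* d1 = dynamic_cast<D1*>(b)) return d1;  // stored as D1*
    if (auto* d2 = dynamic_cast<D2*>(b)) return d2;  // stored as D2*
    return b;                                        // falls back to B*
}
```

After this call the exact type is carried by the variant, so later code can
dispatch on it without repeating the cast or testing for a null result.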

(Note 2: If you are a bit worried about the way the * symbol
got shifted about:

 variant V [V1, V2] *vp;
 variant pV [V1*, V2*] vp; // same as above??

show it makes sense and deduce a variant is not a type,
or wait for part III)

--
;----------------------------------------------------------------------
        JOHN (MAX) SKALLER,         maxtal@extro.ucc.su.oz.au
 Maxtal Pty Ltd, 6 MacKay St ASHFIELD, NSW 2131, AUSTRALIA
;--------------- SCIENTIFIC AND ENGINEERING SOFTWARE ------------------

Author: maxtal@extro.ucc.su.OZ.AU (John MAX Skaller)
Date: Fri, 1 Jan 1993 17:41:33 GMT

I am considering the problem of heterogenous aggregates,
in which for example we have an array of pointers to objects
that may be either a V1, V2 or V3 object. These objects need
not share a common base, and even if they did, we do not
wish to use dynamic downcasting for this purpose.

Furthermore, the classes for V1, V2 and V3 may already be
closed and not available for modification, and they
may or may not contain virtual functions.

The 'correct' solution is basically as follows:

 struct V {
  enum {t1,t2,t3} tag;
  union {
   V1 * v1;
   V2 * v2;
   V3 * v3;
  };
  V(V1* x1) : tag(t1), v1(x1) {}
  V(V2* x2) : tag(t2), v2(x2) {}
  V(V3* x3) : tag(t3), v3(x3) {}
 } v;

 switch(v.tag) {
  case t1: ... v.v1 ... break;
  case t2: ... v.v2 ... break;
  case t3: ... v.v3 ... break;
 };

This is better than using a void* and a cast I think, because it constrains
the user to accessing one of the declared types V1, V2 or V3.

However, it is clearly insecure in that the tag need not agree with the
actual type pointed to, and the switch might have the same problem
(to say nothing of forgetting the break).

To fix this problem I invent a secure form of the union called a variant,
it's not quite as above but that is the initial picture.

 variant V [V1*, V2*, V3*] v;
 select(v)
 {
  type(V1* v1) { ... v1 ... }
  type(V2* v2) { ... v2 ... }
  type(V3* v3) { ... v3 ... }
 }

By not naming the union components or providing explicit access to the type
tag the only way to access the components is via a select statement.
Provided the variant is properly initialised this is statically
secure.

 The code

 v->print();

is also allowed if all the Vi have a print function, it expands to the
obvious select statement without the overhead of replicated source.
In general, any statement containing the name of a variant
(except initialisation)
is nominally expanded into multiple statements in the obvious way.
I say 'nominally' because often we can restructure things to optimise
them, by expanding only up to the smallest unifying subexpression.

Variants are initialised in the obvious way, the static type of the
initialiser is used to set the tag field.

 V v=V1(); // v is now a V1

The usual function overloading/matching rules can be applied
in most places you expect.

***Problem 1**** assignment causes some problems. If interpreted
as user defined assignment, the type cannot be changed, if
the type changes like an initialisation then user defined assignments
will not perform as expected. Mm. BTW: this is the only big problem
I can think of.

It is clearly implied that one can have variant functions:

 f(V) { ... }

which of course accept a variant OR any of its components.

Now to the interesting bits. Naturally one must ask,
what happens if one specialises or instantiates a template
function with a variant or variant object? Do we generate
a single variant function or a family of ordinary functions?

The answer is .. it is implementation defined. It doesn't matter.
Variants commute with templates.

There is more ... but this will do for a start.

Comments? Any help with the assignment problem?


--
;----------------------------------------------------------------------
        JOHN (MAX) SKALLER,         maxtal@extro.ucc.su.oz.au
 Maxtal Pty Ltd, 6 MacKay St ASHFIELD, NSW 2131, AUSTRALIA
;--------------- SCIENTIFIC AND ENGINEERING SOFTWARE ------------------