Topic: overloading vs. virtual functions


Author: djones@megatest.UUCP (Dave Jones)
Date: 4 Mar 91 21:25:49 GMT


Author: craig@gpu.utcs.utoronto.ca (Craig Hubley)
Date: 8 Mar 91 07:10:56 GMT
In article <15488@prometheus.megatest.UUCP> djones@megatest.UUCP (Dave Jones) writes:
>From article <1991Feb25.201923.14554@gpu.utcs.utoronto.ca>, by craig@gpu.utcs.utoronto.ca (Craig Hubley):
>)))The fact that C++ implements the two mechanisms [virtual functions and
>)))overloaded functions] so as to create (in my opinion) counter-intuitive
>)))effects is (in my opinion) a flaw in C++.  That is, the programmer must
>)))be very much aware of the "seam" between compile-time and run-time ...
>
>Anybody who can not handle this "seam" should be programming in LOGO.

Yeah, and anyone who can't handle raw bit-shifting should use Pascal.
While we're at it, who needs databases?  I can open files and dig through
'em.  And who needs CASE?  I got a gaw-damm disassemblah heah, boah!

I can "handle it".  I have programmed in LOGO, and LISP, and FORTH, and
there are definitely advantages to languages that have interpreters AND
compilers that can make the code act the same.  I have also programmed
in 6502 assembler, 6809 assembler, 68000 assembler, WSL, weird BASICs, C,
and other languages that gave access to the raw bit-level of the box.
Portable code doing this kind of manipulation is far harder to get right,
because the fact that it works on the first 10 boxes you tried it on doesn't
tell you much about the 11th.

Accordingly, modern programming languages don't force you to do much
bit-shifting, allowing you to rely on higher-level primitives most of the
time and only descend into bits when you really need to do so.  The same
is true of compile-time vs. run-time binding, as others have already quite
adequately demonstrated by comparison to other languages.

Nobody is saying it can always be hidden completely, particularly in real-
time applications.  But most of us are saying that it isn't necessary for
language syntax to force programmers to hard-code the difference into
every scrap of code they write.  Consider, what if you write a piece of
nice general source code to deal with three different kinds of objects -
and then call it in a context where, by the language definition, the
type of that argument is KNOWN, unambiguously.

I can't ask my compiler to take advantage of this potential optimization,
if I have also asked it to guarantee me the difference between runtime
and compile-time binding.  To deal with this situation in C++ as it stands,
I have to write three different functions and call THOSE when I know what
the type is.  I can never let the compiler figure it out for me.  If I
want it to do so, I gotta suffer the virtual lookup overhead every time.

The much-prized machine-level efficiency you bit-bangers are so enamored of
comes much more cheaply when it can be generated by a compiler and not a
human programmer.  In the dark ages of C, if you wanted to avoid unnecessary
function call overhead you had to rewrite your function as a macro.  Often
you had a function version and a macro version side-by-side for use in
different circumstances.  In the more enlightened C++, we use "inline" and
let the compiler do it.  Of course we are AWARE of the difference, but it
doesn't cause us anywhere near the inconvenience, doesn't make us maintain
two versions, and if "inline" doesn't always guarantee savings, then that's
all right.  We have to profile the code anyway.

Of course, there is always the possibility that you will not understand
the rules that the compiler uses to make these decisions.  If the only
way a programmer can tell what his code will do is to hard-code all of
its runtime behavior himself, then I suggest you go back to assembler,
where every little choice was up to you.  And where you had to hack up
your code to make the tiniest little optimization.


--
  Craig Hubley   "...get rid of a man as soon as he thinks himself an expert."
  Craig Hubley & Associates------------------------------------Henry Ford Sr.
  craig@gpu.utcs.Utoronto.CA   UUNET!utai!utgpu!craig   craig@utorgpu.BITNET
  craig@gpu.utcs.toronto.EDU   {allegra,bnr-vpa,decvax}!utcsri!utgpu!craig




Author: chip@tct.uucp (Chip Salzenberg)
Date: 20 Feb 91 19:57:44 GMT
According to craig@gpu.utcs.utoronto.ca (Craig Hubley):
>The fact that C++ implements the two mechanisms [virtual functions and
>overloaded functions] so as to create (in my opinion) counter-intuitive
>effects is (in my opinion) a flaw in C++.  That is, the programmer must
>be very much aware of the "seam" between compile-time and run-time ...

This fact is a *necessary* consequence of one of C++'s design goals,
namely, that a programmer never pay more in efficiency than necessary.
That means that it is the programmer who decides when to use run-time
binding (virtual) and when to use compile-time binding (overloading).
If you remove the distinction, you remove the control -- a tradeoff
that is, to me, unacceptable.

>Consider how static initialization works in C++.  The initialization
>syntax means that some functions consist entirely of initialization
>and calls to these can (in principle) be resolved entirely at compile-
>time if they are called with static arguments.

Not so!  Those initializations cannot, even in principle, be resolved
at compile time if the actual types of their reference or pointer
parameters are unknown at compile time -- which is exactly the
situation for which virtual functions were created.  So your scenario
is only true in a "(C++)--" language without virtual functions.
That's one way to merge two features: get rid of one of them.  :-)

>>a base class B and a derived class D:
>>
>>    extern void foo(B&);
>>    extern void foo(D&);
>>    D d; B& b = d; foo(b);
>>
>>It is the "foo(B&)" function that will be called ...
>>
>>    D d; B& b = d; b.foo();
>>
>>It is B::foo() that is called ...
>
>In neither of the cases above can the system be expected to read the
>programmer's mind and "decide" to call the function associated with
>the "real" rather than the "formal" type.  In fact, C++ explicitly uses
>the -> vs. . notation so that the programmer can make that decision
>him/herself.

The notation is irrelevant.  Given a pointer "a" with member "b",
"a->b" and "(*a).b" are exactly equivalent in all cases.  I suppose
you knew that; so what did you really mean?

>There is no "cast to the dynamic type" operator in C++ that would
>delay this decision until run-time.  You can't even find out what
>the type is.

That's a necessary result of the most basic type rules.  And there's
nothing special about overloaded functions in this context: normal
functions and non-virtual member functions are in exactly the same
boat.  If dynamic type loss is unendurable to you, then you're using
the wrong language, because it's not going away.

>You have a good analogy there, but what you are proving is that overloading
>is solving the same problem as member functions, but without the other half
>of its brain:  runtime resolution.  When you mix virtuals and overloading,
>things get scary ...

"Well, don't do that, then."

I would contend that, once virtual functions have been introduced into
a class hierarchy, the additional use of overloaded functions is the
design error to be corrected, not some presumed deficiency in C++.
The difference between these techniques is a tool, not a flaw.

As a practical implementation issue, it is obvious that non-friend
non-member functions are indefinite in number.  Presuming a vanilla
implementation of virtual functions, how can a newly created object
have a virtual function table containing the addresses of overloaded
functions that haven't even been compiled yet?

>But if "orthogonal" in this sense is supposed to mean "never interact
>at all" you are wrong.

But that's not what I meant, because that's not what the word means!
Compare the Jargon File 2.6.3:

orthogonal: [from mathematics] adj. Mutually independent;
   well separated; sometimes, irrelevant to.  Used in a generalization
   of its mathematical meaning to describe sets of primitives or
   capabilities which, like a vector basis in geometry, span the
   entire `capability space' of the system and are in some sense
   non-overlapping or mutually independent.  ...

A language may have templates or not, and it may have inheritance or
not; so the features are orthogonal.  But of course they interact;
it's all one language.

>As a programmer, I couldn't care less at what point in the
>compile/link/load/run process these things are resolved, except
>where the language forces me to be aware of it.

Well, then, you're using the wrong language, or else you need to
change your thinking.  (No smiley here.)  If your mental view of the
type system does not include the distinction between compile time and
run time, then C++ is simply a bad match for the way you think.  As I
mentioned up front, the *design goals* of C++ simply cannot allow all
type resolution to be (conceptually) delayed until run time.  Such a
language wouldn't be C++ any more.

>Obviously you have to be aware of these subtle differences sometimes but
>inventing wholly different syntax for each situation is IMHO a mistake.

Virtual and non-virtual member functions use the same syntax; the only
difference is one keyword.  However, once you decide: "But I don't
want to use a member function," then you have stepped outside the
bounds of "same things," so you shouldn't expect "same syntax."
--
Chip Salzenberg at Teltronics/TCT      <chip@tct.uucp>, <uunet!pdn!tct!chip>
"It's not a security hole, it's a SECURITY ABYSS." -- Christoph Splittgerber
   (with reference to the upage bug in Interactive UNIX and Everex ESIX)




Author: dsouza@optima.cad.mcc.com (Desmond Dsouza)
Date: 24 Feb 91 23:46:48 GMT
In article <27C2D4B8.3AD3@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:


 >   >But if "orthogonal" in this sense is supposed to mean "never interact
 >   >at all" you are wrong.
 >
 >   But that's not what I meant, because that's not what the word means!
 >   Compare the Jargon File 2.6.3:
 >
 >   orthogonal: [from mathematics] adj. Mutually independent;
 >      well separated; sometimes, irrelevant to.  Used in a generalization
 >      of its mathematical meaning to describe sets of primitives or
 >      capabilities which, like a vector basis in geometry, span the
 >      entire `capability space' of the system and are in some sense
 >      non-overlapping or mutually independent.  ...
 >
 >   A language may have templates or not, and it may have inheritance or
 >   not; so the features are orthogonal.  But of course they interact;
 >   it's all one language.

In discussing programming language features, "orthogonal" has a more
specific meaning. Here is one (from M. Jazayeri, "Programming Language
Concepts", p.15)  :

   The principle of 'orthogonality': language features can be
   composed in a free and uniform manner with predictable effects
   and without limitations.

i.e. any meaningful composition of language constructs should be allowed.

For example, here are some reasons why Pascal is not orthogonal:
1. Files cannot be passed by value.
2. Functions can only return values of some restricted types.


Desmond.
--

-------------------------------------------------------------------------------
 Desmond D'Souza, MCC CAD Program | ARPA: dsouza@mcc.com | Phone: [512] 338-3324
 Box 200195, Austin, TX 78720 | UUCP: {uunet,harvard,gatech,pyramid}!cs.utexas.edu!milano!cadillac!dsouza




Author: craig@gpu.utcs.utoronto.ca (Craig Hubley)
Date: 25 Feb 91 20:19:23 GMT
In article <27C2D4B8.3AD3@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:
>According to craig@gpu.utcs.utoronto.ca (Craig Hubley):
>>The fact that C++ implements the two mechanisms [virtual functions and
>>overloaded functions] so as to create (in my opinion) counter-intuitive
>>effects is (in my opinion) a flaw in C++.  That is, the programmer must
>>be very much aware of the "seam" between compile-time and run-time ...
>
>This fact is a *necessary* consequence of one of C++'s design goals,
>namely, that a programmer never pay more in efficiency than necessary.

Sometimes you can, and must, be aware of the seam.  However, it is possible
for the differences to be minimized and made consistent.  Again, nobody is
suggesting that a programmer or library that does not use a mechanism
should pay the efficiency price for that mechanism.  No such compromise
is required, and I think it would be unacceptable in C++.

>That means that it is the programmer who decides when to use run-time
>binding (virtual) and when to use compile-time binding (overloading).
>If you remove the distinction, you remove the control -- a tradeoff
>that is, to me, unacceptable.

You can remove the distinction in many cases without removing the control.
Take overloading:  if I provide a function that accepts both Date("Feb.25/91")
and Date(91, 2, 25) the user needn't be concerned with the fact that I have
actually defined two separate functions and one or the other is called, so
long as I have done my homework, made my functions consistent, and documented
what to expect in each case.  I still have the "control" of allowing Date to
take another argument list, or making its functions virtual, etc..  I have
simply removed a detail from the user's sight, UNLESS he/she chooses to
investigate it, in which case he/she will discover, as you state, that two
functions (which may be totally independent of each other) exist.

In other words, I could build one function Date(year, month, day) and provide
a variety of interfaces that do nothing but find/rearrange the arguments and
call it, or I could build several independent functions.  All of these choices
have runtime efficiency implications (e.g. Date(year, month, day) will be
guaranteed faster than Date(char*)) but this is not explicit in the syntax
the programmer uses.  It is an implementation concern.

>>Consider how static initialization works in C++.  The initialization
>>syntax means that some functions consist entirely of initialization
>>and calls to these can (in principle) be resolved entirely at compile-
>>time if they are called with static arguments.
>                         ^^^^^^^^^^^^^^^^^^^^^
>Not so!  Those initializations cannot, even in principle, be resolved
>at compile time if the actual types of their reference or pointer
>parameters are unknown at compile time -- which is exactly the
>situation for which virtual functions were created.  So your scenario

Then they wait until runtime.  But the types of many values *are* known
at compile time.  I guess I should rephrase the underlined portion as
"with static arguments that have no (derived types AND virtual functions)".
You are right, in these cases the functions to be called are ambiguous
and no such compile-time resolution could occur.

>>programmer's mind and "decide" to call the function associated with
>>the "real" rather than the "formal" type.  In fact, C++ explicitly uses
>>the -> vs. . notation so that the programmer can make that decision
>>him/herself.
>
>The notation is irrelevant.  Given a pointer "a" with member "b",
>"a->b" and "(*a).b" are exactly equivalent in all cases.  I suppose
>you knew that; so what did you really mean?

Yes, sorry.  The notation is irrelevant.  However, the choice of calling the
"real" (Type *x; x->member; Type &y; y.member) or "formal" (Type x; x.member)
(type-associated) function still belongs to the consuming programmer.
Regardless of the version of the "real" notation used,  he/she still has
the option of calling the "formal" version.  In fact, this is still the
default way to access members (no * or & or -> required!).  So this choice
is STILL up to the consuming programmer, even though the producing
programmer can decide to provide a more dynamically resolved function.

Therefore, it is false that providing a more powerful and subtle mechanism
to the producing programmer must necessarily impose either an inefficiency
or a clumsy syntax on the consuming programmer.  Unless you consider ->
clumsy.

>>There is no "cast to the dynamic type" operator in C++ that would
>>delay this decision until run-time.  You can't even find out what
>>the type is.
>
>That's a necessary result of the most basic type rules.  And there's

Then why are Bjarne/Andy in favor of it?  At least so I've heard...
although apparently they want failures to raise an exception.

Again, do not believe that a language that supports C's original types
and conversions, plus its own, plus the base/derived conversions, has
ANY "most basic type rules".  If anything, it has several conflicting
sets of rules and an oft-quoted "double standard" between builtin and
user-defined types.

>nothing special about overloaded functions in this context: normal
>functions and non-virtual member functions are in exactly the same
>boat.  If dynamic type loss is unendurable to you, then you're using
>the wrong language, because it's not going away.

Dynamic type loss is OK if the type is not useful at runtime.  However,
it can be useful in many more ways than deciding which version of its
predefined (and often unchangeable) functions to call at runtime.  These
other uses can be supported without requiring every object to carry a
type tag through to runtime.

>>You have a good analogy there, but what you are proving is that overloading
>>is solving the same problem as member functions, but without the other half
>>of its brain:  runtime resolution.  When you mix virtuals and overloading,
>>things get scary ...
>
>"Well, don't do that, then."

Then C++ is a far less powerful language.  Others have already pointed
out some of the problems created by the counter-intuitive/restrictive rules
for virtuals/overloads.  I have already pointed out that every other "O-O"
language supports the necessary changes.  There are two ways for C++ to
become another Ada (i.e. a failure):
 - try to incorporate *every* way of doing something
 - fail to incorporate *at least one* way of doing something

As it stands, you cannot add any type- and context-specific functionality
to an object without access to its source code.  This greatly restricts
reusability, to a level between that of say, Eiffel, and that of Ada.

I do not consider this sufficient to support a components industry, as it
leaves too many ways for producing programmers to cut off options for those
building on their code.  Therefore, most consuming programmers won't build
on others' code.  In your applications, this may be fine.  In mine, it
isn't.  There are other issues too, like overall source code size and
a minimal number/namespace of functions.

>I would contend that, once virtual functions have been introduced into
>a class hierarchy, the additional use of overloaded functions is the
>design error to be corrected, not some presumed deficiency in C++.

So you are saying *never* to overload virtuals ?  If so, then that is the
strongest admission I can think of, that the present rules for doing so
are useless.

>The difference between these techniques is a tool, not a flaw.

It is a flaw, not a tool.

>As a practical implementation issue, it is obvious that non-friend
>non-member functions are indefinite in number.  Presuming a vanilla
>implementation of virtual functions, how can a newly created object
>have a virtual function table containing the addresses of overloaded
>functions that haven't even been compiled yet?

It can't.  Which is why the type tag is necessary to sort out type-
dependent processing.  You are continuing to prove that defining a new
virtual in the base type, or allowing it to be "patched in", is a very
bad solution.  What do you mean by "vanilla implementation"?

>orthogonal: [from mathematics] adj. Mutually independent;
>   well separated; sometimes, irrelevant to.  Used in a generalization
>   of its mathematical meaning to describe sets of primitives or
>   capabilities which, like a vector basis in geometry, span the
>   entire `capability space' of the system and are in some sense
>   non-overlapping or mutually independent.  ...
>
>A language may have templates or not, and it may have inheritance or
>not; so the features are orthogonal.  But of course they interact;
>it's all one language.

Then they are mutually independent but not necessarily "well separated"
(they both impact the same source code) and certainly not "irrelevant to".
I would suggest it is still a poor choice of word.  It emphasizes the
PoV of a mediocre compiler designer who can't figure out how to deal
with their interaction, rather than a programmer using the compiler who
must write source code that works predictably using both mechanisms.

Latitude and longitude are "orthogonal" in this sense too, as no doubt
observed by the Captain of the Exxon Valdez... who was probably staring
at the pretty lights on his console as his ship went up on the rocks.

>>As a programmer, I couldn't care less at what point in the
>>compile/link/load/run process these things are resolved, except
>>where the language forces me to be aware of it.
>
>Well, then, you're using the wrong language, or else you need to
>change your thinking.  (No smiley here.)  If your mental view of the

I reject both of your options.  I don't mind if the language forces me
to be aware of the distinction where necessary for efficiency.  But
the distinction we are discussing is *not* necessary for efficiency, but
is something falsely called "strong typing".

>type system does not include the distinction between compile time and
>run time, then C++ is simply a bad match for the way you think.  As I
>mentioned up front, the *design goals* of C++ simply cannot allow all
>type resolution to be (conceptually) delayed until run time.  Such a
>language wouldn't be C++ any more.

I agree with the design goals.  However, conceptually/potentially delaying
type resolution until runtime for all objects (i.e. supporting a typeof(x) or
x.has_member(diameter), etc.) is being implemented in some C++ compilers
and is under consideration by ANSI in a variety of forms.

The authors of those compilers (and ANSI!) would be surprised to learn that
 - C++ programmers don't want it
 - C++ doesn't need it
 - They are no longer writing C++ compilers.

>>Obviously you have to be aware of these subtle differences sometimes but
>>inventing wholly different syntax for each situation is IMHO a mistake.
>
>Virtual and non-virtual member functions use the same syntax; the only
>difference is one keyword.  However, once you decide: "But I don't
>want to use a member function," then you have stepped outside the
>bounds of "same things," so you shouldn't expect "same syntax."

Wrong.  Once I decide "I don't want an attribute that is associated with
the object" then I have stepped outside the bounds of "same things".  An
object is clearly associated (very strongly) with its type.


--
  Craig Hubley   "...get rid of a man as soon as he thinks himself an expert."
  Craig Hubley & Associates------------------------------------Henry Ford Sr.
  craig@gpu.utcs.Utoronto.CA   UUNET!utai!utgpu!craig   craig@utorgpu.BITNET
  craig@gpu.utcs.toronto.EDU   {allegra,bnr-vpa,decvax}!utcsri!utgpu!craig




Author: chip@tct.uucp (Chip Salzenberg)
Date: 27 Feb 91 22:34:03 GMT
According to dsouza@optima.cad.mcc.com (Desmond Dsouza):
>In discussing programming language features, "orthogonal" has a more
>specific meaning. Here is one (from M. Jazayeri, "Programming Language
>Concepts", p.15)  :
>
>   The principle of 'orthogonality': language features can be
>   composed in a free and uniform manner with predictable effects
>   and without limitations.
>
>i.e. any meaningful composition of language constructs should be allowed.

I cannot accept this definition.  It is too broad for useful dialogue.
Unless, of course, the statement ``No currently existing computer
language is orthogonal'' is useful dialogue.  :-)
--
Chip Salzenberg at Teltronics/TCT      <chip@tct.uucp>, <uunet!pdn!tct!chip>
"It's not a security hole, it's a SECURITY ABYSS." -- Christoph Splittgerber
   (with reference to the upage bug in Interactive UNIX and Everex ESIX)