Thread

Topic: Use of nested functions

Author: chased@rbbb.Eng.Sun.COM (David Chase)
Date: 22 Jan 1993 18:58:21 GMT Raw View

In article <1993Jan22.081555.12027@us-es.sel.de> reindorf@us-es.sel.de (Charles Reindorf) writes:
>On the subject of "trampolenes" in the implementation of nested functions, this
>presumably means the following appraoch:

> [code generation into stack frame, which works.]
> ... The main objection I can see is that on some operating systems
>you are not allowed to do any run-time code generation.

One workaround is to keep a cache of trampolines, which in this case
are pairs of code+data, where each piece of code is compiled to
reference a particular piece of data.  Some run-time code generation
must still occur.  In the limit case (the least flexible OSes) a
different calling convention must be used for all function calls.

>Other people talk about "bound" functions which presumably means that
>given an object a of class A with member function f, then "a.f" can be
>regarded as a new function (whose pointer can be taken) with a non
>member-function signature.  The only way I can see of using a
>trampolene appraoch for this situation is to place the trampolene for
>the function a.f into the object a itself.

Actually, you could "new" them.  Just allocate the functions on the
"heap", and free them when you are done.  Or, we could take a flying
leap at the slippery slope and add garbage collection.

>This surely makes member function closures an impracticable and
>unworkable addition to the language?

I think the answer is another question: "compared to what?"  I'm
really looking forward to run-time instantiation of templates.

David Chase
Sun

Author: maxtal@extro.ucc.su.OZ.AU (John MAX Skaller)
Date: 26 Jan 93 22:41:07 GMT Raw View

In article <1993Jan22.081555.12027@us-es.sel.de> reindorf@us-es.sel.de (Charles Reindorf) writes:
>On the subject of "trampolenes" in the implementation of nested functions,
>this >presumably means the following appraoch:
>
>Some people say that they "don't like" trampolenes, but don't seem to give any
>reason for it.

 Separation of code and data is an important machine
principle for safety. Some machines (all the good ones :-) do not
allow executing things in the data area.

 Trampolines feel a bit like the Cobol ALTER statement:
dangerous hacks.

>The main objection I can see is that on some operating systems
>you are not allowed to do any run-time code generation.

 Some OS may not grant the user the priviledges required
to generate and execute code on the fly. This is because
it is close to 'self-modifying' code, which is not
generally considerd very safe.

>But there again, in
>these architectures there do exist other,
>addmittedly less efficient workarounds
>to this problem, surely.

 Such as? I would LOVE to hear of one that did not require
the standard C/C++ function pointers to double in size.

>
>Other people talk about "bound" functions which presumably means that given an
>object a of class A with member function f, then "a.f" can be regarded as a new
>function (whose pointer can be taken) with a non member-function signature. The
>only way I can see of using a trampolene appraoch for this situation is to place
>the trampolene for the function a.f into the object a itself. This increases
>the memory overhead for function a. In fact, the compiler would have to do
>this by default for every member function of a, because it won't know in
>advance which will be bound (without ghastly extensions to the language). This
>surely makes member function closures an impracticable and unworkable addition
>to the language?
>

 No, one can just leave it up to the user to delete the
trampoline---as for other dynamically allocated objects. It is not
as 'safe' as puting the trampoline in the object (or a pointer thereto),
but even that isnt safe (the trampoline could still be called
after it was deleted --- very nasty crash following, perhaps sometime
later).

--
;----------------------------------------------------------------------
        JOHN (MAX) SKALLER,         maxtal@extro.ucc.su.oz.au
 Maxtal Pty Ltd, 6 MacKay St ASHFIELD, NSW 2131, AUSTRALIA
;------ SCIENTIFIC AND ENGINEERING SOFTWARE ---ph:  2 799 8223 --------

Author: reindorf@us-es.sel.de (Charles Reindorf)
Date: Wed, 27 Jan 93 09:40:29 GMT Raw View

In article <1993Jan26.224107.9187@ucc.su.OZ.AU>, maxtal@extro.ucc.su.OZ.AU (John MAX Skaller) writes:
|> In article <1993Jan22.081555.12027@us-es.sel.de> reindorf@us-es.sel.de (Charles Reindorf) writes:
|> >On the subject of "trampolenes" in the implementation of nested functions,
|> >this >presumably means the following appraoch:
|> >
|> >Some people say that they "don't like" trampolenes, but don't seem to give any
|> >reason for it.
|>
|>  Separation of code and data is an important machine
|> principle for safety. Some machines (all the good ones :-) do not
|> allow executing things in the data area.
|>
|>  Trampolines feel a bit like the Cobol ALTER statement:
|> dangerous hacks.
|>
|> >The main objection I can see is that on some operating systems
|> >you are not allowed to do any run-time code generation.
|>
|>  Some OS may not grant the user the priviledges required
|> to generate and execute code on the fly. This is because
|> it is close to 'self-modifying' code, which is not
|> generally considerd very safe.
|>
|> >But there again, in
|> >these architectures there do exist other,
|> >addmittedly less efficient workarounds
|> >to this problem, surely.
|>
|>  Such as? I would LOVE to hear of one that did not require
|> the standard C/C++ function pointers to double in size.

The whole issue is surely a matter dealt with heavily by PASCAL implementations. I suspect
that some implementations (e.g. MS-DOS ones) use trampolenes whereas others use double-size
function pointers. The question is, are C++ implementors ready to tackle the matter. My
feeling is that it adds a great deal of extra power to the language in return for the
penalty. However, I also understand that there is already a greaty deal on the plate for
the C++ standardisation committee. That's my 2 cents worth.

|>
|> >
|> >Other people talk about "bound" functions which presumably means that given an
|> >object a of class A with member function f, then "a.f" can be regarded as a new
|> >function (whose pointer can be taken) with a non member-function signature. The
|> >only way I can see of using a trampolene appraoch for this situation is to place
|> >the trampolene for the function a.f into the object a itself. This increases
|> >the memory overhead for function a. In fact, the compiler would have to do
|> >this by default for every member function of a, because it won't know in
|> >advance which will be bound (without ghastly extensions to the language). This
|> >surely makes member function closures an impracticable and unworkable addition
|> >to the language?
|> >
|>
|>  No, one can just leave it up to the user to delete the
|> trampoline---as for other dynamically allocated objects. It is not
|> as 'safe' as puting the trampoline in the object (or a pointer thereto),
|> but even that isnt safe (the trampoline could still be called
|> after it was deleted --- very nasty crash following, perhaps sometime
|> later).

This idea sounds very messy to me (to say the least). Can you describe the mechanism
for this in the language? My contention here would be : it's really not worth the
trouble. This sort of problem can be solved in other ways: If you have nested functions,
they can sometimes be used to forward the callback and otherwise you are likely best off
using a pointer to an abstract base class in preference to a pointer to a bound function.
There may be examples around where neither of these two approaches are suitable. Does
anybody know of any?

|>
|>
|> --
|> ;----------------------------------------------------------------------
|>         JOHN (MAX) SKALLER,         maxtal@extro.ucc.su.oz.au
|>  Maxtal Pty Ltd, 6 MacKay St ASHFIELD, NSW 2131, AUSTRALIA
|> ;------ SCIENTIFIC AND ENGINEERING SOFTWARE ---ph:  2 799 8223 --------


--- Charles Reindorf

Author: fjh@munta.cs.mu.OZ.AU (Fergus James HENDERSON)
Date: Thu, 28 Jan 1993 02:25:53 GMT Raw View

maxtal@extro.ucc.su.OZ.AU (John MAX Skaller) writes:

>reindorf@us-es.sel.de (Charles Reindorf) writes:
>>Some people say that they "don't like" trampolenes, but don't seem to give any
>>reason for it.
>
> Separation of code and data is an important machine
>principle for safety. Some machines (all the good ones :-) do not
>allow executing things in the data area.

That in itself is not a problem, so long as you are allowed to modify
things in the code area ;-)

Seriously, trampolines are not in the least bit unsafe.
Being able to modify the code area can in general be "unsafe", but then there
are many other things that are unsafe: pointers, array indexing,
dynamic allocation, unions, etc. etc.
Do these same hand-holding operating systems only allow you to program in
Ada? (Just kidding :-)

> Trampolines feel a bit like the Cobol ALTER statement:
>dangerous hacks.

So do these same operating systems prevent efficient implementations of the
COBOL ALTER statement?

>>The main objection I can see is that on some operating systems
>>you are not allowed to do any run-time code generation.
>
> Some OS may not grant the user the priviledges required
>to generate and execute code on the fly. This is because
>it is close to 'self-modifying' code, which is not
>generally considered very safe.

Again, why is this less safe than, say, using function pointers?

>>But there again, in
>>these architectures there do exist other,
>>addmittedly less efficient workarounds
>>to this problem, surely.
>
> Such as? I would LOVE to hear of one that did not require
>the standard C/C++ function pointers to double in size.

An approach that doesn't require the ability
to write to code space at runtime is as follows:
 At compile time, the programmer must specify the maximum depth
 of the trampoline stack.
 The compiler allocates space for N pairs of data+code pointers and
 writes out code for N trampolines each of which refer indirectly to the
 corresponding pair of pointers.
 Since these pointers are in the data area, they can be written
 to at runtime with no problem. The trampoline code itself does not
 have to be modified at runtime.

The only problem with this scheme is the requirement that you specify the
trampoline stack depth in advance.  Many systems, eg. DOS, have had this
requirement for the normal data stack, so people using inflexible
operating systems should be able to put up with it for the
trampoline stack.

If you can make the code for all the trampolines the same (eg. by reading the
IP or using IP-relative addressing to determine the location of the
corresponding pair of pointers in the data area) and you also have
the ability to map a single code page to successive linearly
addressable locations, then you could dynamically extend the trampoline
stack, thus avoiding the need to specify the maximum size in advance.

>>... This
>>surely makes member function closures an impracticable and unworkable addition
>>to the language?
>
> No, one can just leave it up to the user to delete the
>trampoline---as for other dynamically allocated objects.

This would require an extension to the language, whereas so far what we
have just been discussing only requires relaxing some restrictions.
What syntax would you use for allocating and deallocating member function
closures?

--
Fergus Henderson             fjh@munta.cs.mu.OZ.AU
This .signature virus is a self-referential statement that is true - but
you will only be able to consistently believe it if you copy it to your own
.signature file!

Author: ark@alice.att.com (Andrew Koenig)
Date: 28 Jan 93 18:30:42 GMT Raw View

In article <dak.728225605@cip-s03> dak@cip-s03.informatik.rwth-aachen.de (David Kastrup) writes:
> One problem with trampolines and self-modifying code is that there
> is no longer a separate, read only code space.

Sure there is; is just doesn't contain the trampolines.  Moreover, I don't
think anyone is talking about self-modifying code -- we're talking
about code that generates other code and then later executes it.
Once generated, the code is never modified.

My main problem with the technique, though, is that sometimes the
read-only code space is the *only* code space, especially in
secure applications.
--
    --Andrew Koenig
      ark@europa.att.com

Author: dak@cip-s03.informatik.rwth-aachen.de (David Kastrup)
Date: 28 Jan 93 12:53:25 GMT Raw View

One problem with trampolines and self-modifying code is that there
is no longer a separate, read only code space. This can complicate
code sharing by demand paging (self-mod code), and be influenced by
the processor's caches and prefetch queues, if self-modifying code is
expected to work under all circumstances.
 I know for a fact that some CPU testers find out about perfectly
object-code compatible CPUs by finding out, under which circumstances
in self-modifying code is the old code executed instead of the
modified one.

The real problem is dealing with separate code space cache controllers,
which are likely not to be designed for that purpose.

Author: chased@rbbb.Eng.Sun.COM (David Chase)
Date: 28 Jan 93 19:40:55 GMT Raw View

In article <dak.728225605@cip-s03> dak@cip-s03.informatik.rwth-aachen.de (David Kastrup) writes:
>One problem with trampolines and self-modifying code is that there
>is no longer a separate, read only code space. This can complicate
>code sharing by demand paging (self-mod code), and be influenced by
>the processor's caches and prefetch queues, if self-modifying code is
>expected to work under all circumstances.

Four points:

First, you are assuming that the compiler-writers are dolts.  We're
not.

Second, use of trampolines is not "self-modifying" code.  It is
run-time code generation.

Third, use of trampolines has the advantage that if you stick to the
language that is now legal (that is, no nested functions) you get no
trampolines, and no run-time code generation.  Hence, everywhere that
you can use the current (no nested functions) language, you can also
use the extended (with nested functions) language.  An "old" program
compiled by a compiler suporting nested functions will generate the
same code that it did under an "old" compiler.  Thus, support for
nested functions in the language does not prevent you from writing C++
on some particular weird platform -- you can write the same programs
that you could before.

Fourth, if you're using a workstation, chances are good that you're
already using a system that makes heavy use of self-modifying code
(and NOT run-time code generation).  Dynamic linking under SunOS
modifies code on the fly, in a signal-safe and thread-safe fashion.
(Try adb'ing through the (first) call to printf in "hello world".)

David Chase
Sun

Author: maxtal@extro.ucc.su.OZ.AU (John MAX Skaller)
Date: Mon, 1 Feb 1993 19:34:04 GMT Raw View

In article <9302813.18482@mulga.cs.mu.OZ.AU> fjh@munta.cs.mu.OZ.AU (Fergus James HENDERSON) writes:
>maxtal@extro.ucc.su.OZ.AU (John MAX Skaller) writes:
>
>>
>> Separation of code and data is an important machine
>>principle for safety. Some machines (all the good ones :-) do not
>>allow executing things in the data area.
>
>Seriously, trampolines are not in the least bit unsafe.

 :-)

>Being able to modify the code area can in general be "unsafe", but then there
>are many other things that are unsafe: pointers, array indexing,
>dynamic allocation, unions, etc. etc.

 True.

>>
>> Such as? I would LOVE to hear of one that did not require
>>the standard C/C++ function pointers to double in size.
>
>An approach that doesn't require the ability
>to write to code space at runtime is as follows:
> At compile time, the programmer must specify the maximum depth
> of the trampoline stack.
> The compiler allocates space for N pairs of data+code pointers and
> writes out code for N trampolines each of which refer indirectly to the
> corresponding pair of pointers.
> Since these pointers are in the data area, they can be written
> to at runtime with no problem. The trampoline code itself does not
> have to be modified at runtime.
>
>The only problem with this scheme is the requirement that you specify the
>trampoline stack depth in advance.

 Yes. I'll have to check it out, maybe later even
ask for some example code, but the cache seems to get around
the separate instruction/data banks problem nicely. I think
this will make it possible to do on all architechtures,
so there's no reason now for not going ahead with a proposal.

>Many systems, eg. DOS, have had this
>requirement for the normal data stack, so people using inflexible
>operating systems should be able to put up with it for the
>trampoline stack.

 Yes --- but why fling off at DOS when it is constrained
by the 8086 processor? Any Unix running on an 8086 would have
the same problem.
>
>If you can make the code for all the trampolines the same (eg. by reading the
>IP or using IP-relative addressing to determine the location of the
>corresponding pair of pointers in the data area) and you also have
>the ability to map a single code page to successive linearly
>addressable locations, then you could dynamically extend the trampoline
>stack, thus avoiding the need to specify the maximum size in advance.

 HEY! What a nice trick. How to generate code without
generating code :-)

>>
>> No, one can just leave it up to the user to delete the
>>trampoline---as for other dynamically allocated objects.
>
>This would require an extension to the language, whereas so far what we
>have just been discussing only requires relaxing some restrictions.

 No, the committee would consider any removal of restriction
like this an extension :-)

>What syntax would you use for allocating and deallocating member function
>closures?
>
 I think Scott Turner suggested

 (some cast)x = new object.f;
 delete x;


--
;----------------------------------------------------------------------
        JOHN (MAX) SKALLER,         maxtal@extro.ucc.su.oz.au
 Maxtal Pty Ltd, 6 MacKay St ASHFIELD, NSW 2131, AUSTRALIA
;------ SCIENTIFIC AND ENGINEERING SOFTWARE ---ph:  2 799 8223 --------

Author: fjh@munta.cs.mu.OZ.AU (Fergus James HENDERSON)
Date: Tue, 19 Jan 1993 15:27:44 GMT Raw View

maxtal@extro.ucc.su.OZ.AU (John MAX Skaller) writes:

> I must admit I hate the 'trampoline' code generating
>solution to nested function/bound member pointers.
>
> Do nested functions and/or bound members HAVE to
>have the same format as ordinary functions?

Yes, otherwise instead of a simple and elegant change to the language
that makes it more orthogonal, you have a complex proposal which adds a new
type of pointer to the language yet provides less functionality.

> Is the cost of using two addresses so high, just
>to allow fast access to global functions?

There are several problems with using two addresses.
First, it means that users pay the cost regardless of whether or not they
use the feature.  Second, using trampolines makes the much more common case
(non-nested function pointers) efficient and the rare case (nested function
pointers) not too bad, whereas using two addresses makes the common case
considerably less efficient.  Third, using two addresses means losing
object-compatibility with all existing object code which is an unacceptable
cost.

--
Fergus Henderson             fjh@munta.cs.mu.OZ.AU
This .signature virus is a self-referential statement that is true - but
you will only be able to consistently believe it if you copy it to your own
.signature file!

Author: reindorf@us-es.sel.de (Charles Reindorf)
Date: Fri, 22 Jan 93 08:15:55 GMT Raw View

On the subject of "trampolenes" in the implementation of nested functions, this
presumably means the following appraoch:

Suppose a function g is nested inside a function f; Any pointer to the function
"g" is simply a pointer to a small "trampolene" compiled by function f into it's
own stack frame. The "trampolene" code adds an extra parameter to the parameters
for function "g" and then calls the real body of function "g" which is already
compiled into the code section. The extra parameter point's to f's stack frame
so that g can access "f"s instance variables.

Some people say that they "don't like" trampolenes, but don't seem to give any
reason for it. The main objection I can see is that on some operating systems
you are not allowed to do any run-time code generation. But there again, in
these architectures there do exist other, addmittedly less efficient workarounds
to this problem, surely.

Other people talk about "bound" functions which presumably means that given an
object a of class A with member function f, then "a.f" can be regarded as a new
function (whose pointer can be taken) with a non member-function signature. The
only way I can see of using a trampolene appraoch for this situation is to place
the trampolene for the function a.f into the object a itself. This increases
the memory overhead for function a. In fact, the compiler would have to do
this by default for every member function of a, because it won't know in
advance which will be bound (without ghastly extensions to the language). This
surely makes member function closures an impracticable and unworkable addition
to the language?

-- Charles Reindorf

Author: chased@rbbb.Eng.Sun.COM (David Chase)
Date: 12 Jan 1993 23:02:09 GMT Raw View

In article <1993Jan11.215111.11291@qualcomm.com> greg@harvey.qualcomm.com (Greg Noel) writes:
>Recursion implies that the context of a pointer must be kept as part of
>the pointer, which leads to either expanding the size of a function pointer
>or generating trampoline code dynamically.  Expanding function pointers
>would be a cost even when nested functions are not used and there are
>some architectures where code cannot be generated on the fly.

I know that code can be generated on the fly on Sparc, 680x0, and
80286, 80386, 80486, PDP-11, VAX, 6502, and IBM 370.  What
architecture were you thinking of?

Given such an architecture, are you sure that its problems cannot be
treated by keeping a cache of generated code fragments (you keep a
per-thread cache of code+data pairs to avoid the cost of page
remapping and cache flushing).  Remember, most machines ARE able to
load programs and run them, so they must have some rudimentary ability
to "generate" code at run-time.

In truth, I don't high hopes for nested functions appearing in C++,
but arguments against them based on their "cost" or "difficulty of
implementation" are not very strong.

David Chase
Sun

Author: maxtal@extro.ucc.su.OZ.AU (John MAX Skaller)
Date: Wed, 13 Jan 1993 17:40:51 GMT Raw View

In article <ll6jfhINN6n3@exodus.Eng.Sun.COM> chased@rbbb.Eng.Sun.COM (David Chase) writes:
>
>In truth, I don't high hopes for nested functions appearing in C++,
>but arguments against them based on their "cost" or "difficulty of
>implementation" are not very strong.
>
 Well, David: you could be right. But I intend to
propose their introduction anyhow.

 I know of one prominent and well respected committee
member who supports their introduction:
anyone else care to indicate their position?

 Anyone who wants to help please email me.
Anyone opposed too.

 Might as well throw in closure of a function over a member
at the same time. (Can I call these 'Bound-Members'?)

 I must admit I hate the 'trampoline' code generating
solution to nested function/bound member pointers.

 Do nested functions and/or bound members HAVE to
have the same format as ordinary functions?

 Is the cost of using two addresses so high, just
to allow fast access to global functions?

 (On the PC, there are enough memory models already
that you could have two models for functions pointers :-(

--
;----------------------------------------------------------------------
        JOHN (MAX) SKALLER,         maxtal@extro.ucc.su.oz.au
 Maxtal Pty Ltd, 6 MacKay St ASHFIELD, NSW 2131, AUSTRALIA
;--------------- SCIENTIFIC AND ENGINEERING SOFTWARE ------------------

Author: erc@netcom.com (Eric Smith)
Date: Wed, 13 Jan 1993 21:03:19 GMT Raw View

In article <1993Jan13.174051.21288@ucc.su.OZ.AU> maxtal@extro.ucc.su.OZ.AU (John MAX Skaller) writes:
>In article <ll6jfhINN6n3@exodus.Eng.Sun.COM> chased@rbbb.Eng.Sun.COM (David Chase) writes:
>>
>>In truth, I don't high hopes for nested functions appearing in C++,
>>but arguments against them based on their "cost" or "difficulty of
>>implementation" are not very strong.
>>
> Well, David: you could be right. But I intend to
>propose their introduction anyhow.
>
> I know of one prominent and well respected committee
>member who supports their introduction:
>anyone else care to indicate their position?

C++ compiler vendors will add nested functions as proprietary extensions
to differentiate their compilers from their competitors'.  A few years
later, when lots of people will have started using such extensions, and
have problems porting their code from one extended C++ compiler to another,
there will be lots of requests for updating the standard.

I hope the committee members don't think they are going to finish the
standard and go home.  This is a permanent job, and they might as well
get used to it.

But it would be nice if they would make the standard official periodically.
I.e., instead of always working on the one and only C++ standard, they
should publish C++ 93 now, and then in 1995 C++ 95, etc.

That way one compiler vendor can't claim to meet the standard more strictly
than another without mentioning which version of the standard.  At least
one major compiler vendor seems to be taking advantage of the confusion for
marketing purposes.  That is, implying that they follow "the standard" more
strictly than their competitors, without mentioning what standard they are
talking about.  From looking at the features they support, I get the
impression they might be referring to the C standard, especially since the
C++ standard is not official yet.

Author: ark@alice.att.com (Andrew Koenig)
Date: 13 Jan 93 17:56:23 GMT Raw View

In article <ll6jfhINN6n3@exodus.Eng.Sun.COM> chased@rbbb.Eng.Sun.COM (David Chase) writes:

> I know that code can be generated on the fly on Sparc, 680x0, and
> 80286, 80386, 80486, PDP-11, VAX, 6502, and IBM 370.  What
> architecture were you thinking of?

For example, if you're using a PDP-11 in `separate I and D space'
mode, you can't generate code on the fly because there's no way
to write (or even read) instruction space; you can only execute it.

More generally, I seem to recall that some secure applications impose
a requirement that instructions may be executed only from read-only
memory that is initialized once from a file at startup time.
--
    --Andrew Koenig
      ark@europa.att.com