Topic: Template efficiency


Author: James.Kanze@dresdner-bank.com
Date: 1999/07/14
In article <199907092240.RAA19353@ares.flash.net>,
  blargg@flash.net wrote:
>
> In article <3.0.5.32.19990709061427.00a2c540@mail.wyssware.com>, Craig
> Wyss <cwyss@wyssware.com> wrote:
>
> > > .. James.Kanze@dresdner-bank.com wrote:
> > >
> > > >   ncm@nospam.cantrip.org (Nathan Myers) wrote:
> > >[snip]
> > > >>>>> One compiler implementation trick, when generating code for
> > > >>>>> a specialization, is to note whether a template parameter
> > > >>>>> actually affected the resulting code.  If not, the
> > > >>>>> definition can be named with a dummy placeholder mangled in,
> > > >>>>> instead of the actual template argument, and just aliased by
> > > >>>>> the real mangled name.  Then, if another specialization is
> > > >>>>> generated the same way, the same definition satisfies both,
> > > >>>>> and the linker can toss one copy.

> > > >>>> A more brute-force solution would be to just have the linker
> > > >>>> commonize functions that contain the same sequence of bytes.

> > > [snip]
> > > > Of course, this simply means that when the compiler merges like
> > > > template instantiations, it must keep a separate entry point
> > > > (one jump instruction) for each function.  A clever compiler
> > > > could even avoid this if the address of the function were never
> > > > taken.

> > > Yes. This is what I had in mind. So if such a function were called
> > > through a pointer-to-function, it would have a little extra
> > > overhead, unless the runtime model already uses some type of
> > > thunk, in which case there would be separate thunks that all point
> > > to the merged function (for example, the PowerPC model uses them
> > > to allow switching between the global context of functions, called
> > > the Table of Contents, or TOC).

> > My experience indicates that the MSVC++ V5.0 Linker will silently do
> > this collapse without thunking iff debug information is not included
> > in the linked output.

> You know, you'd think, if a compiler is being this smart (merging
> duplicate generated code), they'd realize that they have to go "all
> the way" and make it work correctly, by being just a little bit
> smarter. If you're going to break the program "optimizing it", you
> might as well just generate "int main() { }". It's much simpler to
> implement. But we're talking Microsoft here, so all bets are off.

I don't know how things are organized at Microsoft, but it is the sort
of error which doesn't surprise me. To "make it work correctly", you
still have to realize all of the implications of "correctly".  And the use
of separate addresses to preserve identity is not an immediately obvious
requirement.  Especially as the optimization will occur in the
back-end/linker, which is typically written by a completely different
team than the front-end.  The people writing the front-end have no idea
that the code may be merged, and the people writing the optimization
have no idea that the generated code might require this functionality.
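
To make the identity requirement concrete, here is a minimal hand-written
sketch of the separate-entry-point idea discussed above. It only simulates
at source level what a merging back-end would have to arrange; the names
shared_body, entry_for_int, entry_for_double and call_through are
hypothetical, and a real optimizer may well inline the shared body rather
than emit a single jump.

```cpp
#include <cassert>

// Hypothetical illustration only: this is not the output of any real linker.

// The single body that both "instantiations" would collapse to.
int shared_body(int x) { return x + 1; }

// One entry point per original function. Each one only forwards to the
// shared body, so each function keeps an address of its own.
int entry_for_int(int x)    { return shared_body(x); }
int entry_for_double(int x) { return shared_body(x); }

using Fn = int (*)(int);

// Calling through a pointer pays for the extra forwarding step.
int call_through(Fn f, int x) {
    assert(f != nullptr);
    return f(x);
}

// True when the two entry points can be told apart by address,
// which is what a conforming implementation has to guarantee.
bool entries_are_distinct() {
    Fn a = &entry_for_int;
    Fn b = &entry_for_double;
    return a != b;
}
```

On a conforming implementation entries_are_distinct() should return true,
while call_through gives the same result through either pointer.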

It's also the sort of error that rarely causes problems.  In over 25
years of programming, I can't remember ever having written code which
depended on the separate identity of separate functions.

> Perhaps you could run the simple code I posted through MSVC++ without
> debug information, and see if it fails. MSVC++ may not support explicit
> function template arguments, so I'll wrap it in a struct this time:
>
>     #include <cassert>
>
>     template<typename T>
>     struct X {
>         static void f() { }
>     };
>
>     int main() {
>         assert( &X<int>::f != &X<double>::f ); // the & is optional

According to the standard, the & is *not* optional, but some compilers
allow omitting it as an extension.

>         return 0; // MSVC++ probably requires this too, heh
>     }

> > Two of us spent days tracking down a bug caused when the linker
> > assigned the same pointer value to two distinct functions (these
> > pointers were being used as "op codes" in a tokenized interpreter).
> > Since the problem only occurred when debugging information was
> > stripped from the executable file, the bug had to be tracked at the
> > assembly code level.

> > I still cannot find this collapse of identical functions
> > documented anywhere, and it was certainly an unpleasant way to
> > discover what would normally be a nice link optimization.
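
The interpreter described above is not shown, so the following is only an
assumed shape of the problem: function pointers used as op codes, with two
ops whose bodies happen to be identical. All names (Op, op_begin, op_end,
op_push_one, name_of, final_depth) are hypothetical. If a linker gave
op_begin and op_end the same address, every "end" would be decoded as a
"begin".

```cpp
#include <cassert>
#include <string>
#include <vector>

using Stack = std::vector<int>;
using Op = void (*)(Stack&);

// Two marker ops with identical (empty) bodies: exactly the kind of
// functions a byte-comparing linker would be tempted to merge.
void op_begin(Stack&) {}
void op_end(Stack&) {}
void op_push_one(Stack& s) { s.push_back(1); }

// Decodes an op code purely by comparing function addresses.
std::string name_of(Op op) {
    assert(op != nullptr);
    if (op == &op_begin)    return "begin";
    if (op == &op_end)      return "end";
    if (op == &op_push_one) return "push1";
    return "unknown";
}

// Tracks block nesting; returns the depth left open after the program.
int final_depth(const std::vector<Op>& program) {
    int depth = 0;
    for (Op op : program) {
        if (op == &op_begin)    ++depth;
        else if (op == &op_end) --depth;
    }
    return depth;
}
```

With distinct addresses, a balanced program such as { &op_begin,
&op_push_one, &op_end } leaves final_depth at zero; with merged addresses
the same program would appear to open two blocks and close none.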

> I am surprised a compiler actually does this, and that it was a
> problem in actual code.

> That sounds like a pain to find.

Compiler bugs generally are:-).

> I have a set of tests that I run on the compiler I use. I have seen,
> time and time again, that they don't seem to do even the most basic
> testing of the thing.

This is regrettably true.  Microsoft has a pretty bad reputation when it
comes to quality control, but I've seen embarrassing errors in a number
of different compilers -- bugs you'd expect the simplest QA to find.

> I consider the areas of the implementation that
> can be tricky (such as this), and write test cases (asserts, of
> course) that ensure the behavior is correct, or at least known. Not
> that I use anything on a Wintel box :-)

--
James Kanze                         mailto:James.Kanze@dresdner-bank.com
Conseils en informatique orientée objet/
                        Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany  Tel. +49 (069) 63 19 86 27


---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]





Author: blargg@flash.net
Date: 1999/07/10
In article <3.0.5.32.19990709061427.00a2c540@mail.wyssware.com>, Craig
Wyss <cwyss@wyssware.com> wrote:

> > .. James.Kanze@dresdner-bank.com wrote:
> >
> > >   ncm@nospam.cantrip.org (Nathan Myers) wrote:
> >[snip]
> > >>>>> One compiler implementation trick, when generating code for a
> > >>>>> specialization, is to note whether a template parameter actually
> > >>>>> affected the resulting code.  If not, the definition can be named
> > >>>>> with a dummy placeholder mangled in, instead of the actual template
> > >>>>> argument, and just aliased by the real mangled name.  Then, if
> > >>>>> another specialization is generated the same way, the same
> > >>>>> definition satisfies both, and the linker can toss one copy.
> > >>>>
> > >>>> A more brute-force solution would be to just have the linker
> > >>>> commonize functions that contain the same sequence of bytes.
> > [snip]
> > > Of course, this simply means that when the compiler merges like template
> > > instantiations, it must keep a separate entry point (one jump
> > > instruction) for each function.  A clever compiler could even avoid this
> > > if the address of the function were never taken.
> >
> > Yes. This is what I had in mind. So if such a function were called through
> > a pointer-to-function, it would have a little extra overhead, unless the
> > runtime model already uses some type of thunk, in which case there would
> > be separate thunks that all point to the merged function (for example, the
> > PowerPC model uses them to allow switching between the global context of
> > functions, called the Table of Contents, or TOC).
>
> My experience indicates that the MSVC++ V5.0 Linker will silently
> do this collapse without thunking iff debug information is not included
> in the linked output.

You know, you'd think, if a compiler is being this smart (merging
duplicate generated code), they'd realize that they have to go "all the
way" and make it work correctly, by being just a little bit smarter. If
you're going to break the program "optimizing it", you might as well just
generate "int main() { }". It's much simpler to implement. But we're
talking Microsoft here, so all bets are off.

Perhaps you could run the simple code I posted through MSVC++ without
debug information, and see if it fails. MSVC++ may not support explicit
function template arguments, so I'll wrap it in a struct this time:

    #include <cassert>

    template<typename T>
    struct X {
        static void f() { }
    };

    int main() {
        // distinct specializations must have distinct addresses
        assert( &X<int>::f != &X<double>::f ); // the & is optional
        return 0; // MSVC++ probably requires this too, heh
    }

> Two of us spent days tracking down a bug caused when the linker
> assigned the same pointer value to two distinct functions (these
> pointers were being used as "op codes" in a tokenized interpreter).
> Since the problem only occurred when debugging information was
> stripped from the executable file, the bug had to be tracked at the
> assembly code level.
>
> I still cannot find this collapse of identical functions
> documented anywhere, and it was certainly an unpleasant way to
> discover what would normally be a nice link optimization.

I am surprised a compiler actually does this, and that it was a problem in
actual code.

That sounds like a pain to find.

I have a set of tests that I run on the compiler I use. I have seen, time
and time again, that vendors don't seem to do even the most basic testing of
the thing. I consider the areas of the implementation that can be tricky
(such as this), and write test cases (asserts, of course) that ensure the
behavior is correct, or at least known. Not that I use anything on a
Wintel box :-)
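
A test of that kind might look roughly like the sketch below. It is not
the actual test suite mentioned above; Probe, addresses_differ and
same_specialization_is_stable are made-up names. The checks return bool
rather than asserting directly, so the result can be logged when the goal
is only to make the behavior "known".

```cpp
#include <cassert>

// Hypothetical probe template: the bodies do not depend on T, so the
// generated code for every specialization is likely to be identical.
template <typename T>
struct Probe {
    static void f() {}
    static int g(int x) { return x; }
};

// True when the corresponding members of two different specializations
// have different addresses, as the language requires.
template <typename A, typename B>
bool addresses_differ() {
    void (*fa)() = &Probe<A>::f;
    void (*fb)() = &Probe<B>::f;
    int (*ga)(int) = &Probe<A>::g;
    int (*gb)(int) = &Probe<B>::g;
    return fa != fb && ga != gb;
}

// Sanity check in the other direction: taking the address of the same
// function twice must give equal pointers.
template <typename A>
bool same_specialization_is_stable() {
    void (*first)() = &Probe<A>::f;
    void (*second)() = &Probe<A>::f;
    return first == second;
}
```

Such checks are only meaningful when built with the same optimization
and debug settings as the shipped program, since the merging reported
above appeared only without debug information.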

