Topic: C++ Name Mangling Standard


Author: Greg Jaxon <gpjaxon@home.com>
Date: Mon, 18 Dec 2000 14:10:39 GMT
Raw View
> > > http://reality.sgi.com/dehnert_engr/cxx/abi.html

>    That's a sad spec.  They went to all the trouble to define a
> new object format requiring a new linker, specifically for C++,
> yet didn't solve any of the problems left over from C++ implementations
> on dumb legacy linkers.
>
>    Specifically:
>
>         If a new linkage mechanism is being defined, why is name
>         mangling still being used at all?  There should be a binary
>         representation for type data, and the user should see
>         diagnostics that look like C++ declarations.

Probably because there is no meaningful difference between having
a "binary" representation vs having a "character" representation.
Expressing type signatures as mangled names gives non-C++ languages
a shot at interoperability.  Your environmental software should
be able to present this data from both viewpoints.
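
As a rough illustration of presenting the same data "from both viewpoints", here is a minimal sketch that turns a mangled link name back into a C++ declaration. It assumes a GCC or Clang toolchain, whose runtime exposes `abi::__cxa_demangle` in `<cxxabi.h>`; the helper name `demangle` is made up for this example, and other compilers (Microsoft's, for instance) use different names and different tools.

```cpp
#include <cassert>
#include <cstdlib>    // std::free
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang runtimes only)
#include <string>

// Returns the readable form of a mangled name, or the input unchanged
// when the runtime cannot demangle it.
std::string demangle(const char* mangled) {
    assert(mangled != nullptr);
    int status = 0;
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return mangled;
    }
    std::string result(text);
    std::free(text);   // the buffer was allocated with malloc() by the runtime
    return result;
}
```

A linker or debugger can apply the same conversion before printing a diagnostic, so the "character" representation does not have to be what the user sees.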

>         This spec still calls for generating vtables at compile
>         time, which leads to pulling in unreachable member functions.
>         A linker designed for C++ should generate vtables at link
>         time, with entries only for reachable functions.

How can you define "reachable" when vtable contents might change
during dynamic linking?  Does the ABI really rule out deferred loading?
I think that trying to prune the loaded image this way is going to
complicate the construction of static objects that may (or may not)
be brought in depending on when the implementation realizes that they're
reachable.

>         There's no support for folding duplicate instantiations of
>         template functions.  Duplicate functions (defined as
>         byte-for-byte object code duplicates) should be merged,
>         and sufficient information should be preserved that
>         debuggers are not confused by this.

IBM holds a patent on the approach you mention.  Compaq, HP, and KAI each
have alternative approaches that eliminate duplicates at various stages
of the linkage process.   It isn't clear that the ABI has to do anything
particular to support unique instantiations.

>    There's an opportunity here to fix some of the linker-related
> dirty laundry in C++, but it's being missed.
>
>                                         John Nagle
>                                         Animats

I personally think that the SGI ABI proposal should have been developed under
the auspices of an approved standards committee to avoid the appearance of a
restriction of trade.  But it should definitely be developed!  Primarily so
that dynamic libraries from separate software houses can interoperate even
if they were built by different compilers, and secondarily so that several
compiler vendors can compete on each platform.

Greg Jaxon

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]





Author: "Tony" <tony@my.isp.net>
Date: Mon, 18 Dec 2000 17:47:16 GMT
Raw View
It does for C-like compatibility, though, right? I was concerned about
what is lost when coming from the C world into C++.  I hear ya: mapping
of objects in memory, vtables, etc. are another ball game. I don't think
I'm really concerned about those things, though. If I can (1) export
functions from DLLs and (2) export class interfaces, I'm a happy camper.
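
The two wishes above can be sketched as follows. This is only an illustration under stated assumptions: `Shape`, `add_ints` and `create_square` are hypothetical names, and the class-interface half still depends on both compilers agreeing on vtable layout and calling convention, which is exactly the part that name mangling alone does not settle.

```cpp
#include <cassert>

// Abstract interface: clients see only virtual functions, never data members.
class Shape {
public:
    virtual double area() const = 0;
    virtual void destroy() = 0;   // the object is freed by the module that made it
protected:
    ~Shape() = default;           // stops clients from calling delete themselves
};

namespace {
class Square final : public Shape {
public:
    explicit Square(double side) : side_(side) {}
    double area() const override { return side_ * side_; }
    void destroy() override { delete this; }
private:
    double side_;
};
}  // namespace

// (1) A plain function with C linkage: its link name is not mangled.
extern "C" int add_ints(int a, int b) { return a + b; }

// (2) A factory with C linkage that hands out the class interface.
extern "C" Shape* create_square(double side) {
    assert(side >= 0.0);
    return new Square(side);
}
```

Routing destruction through `destroy()` keeps allocation and deallocation inside one module, so the two sides never have to share a heap.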

Tony

"Greg Comeau" <comeau@panix.com> wrote in message
news:91c17c$e2v$1@panix3.panix.com...
> In article <yN9_5.2134$pi2.147536@bgtnsc07-news.ops.worldnet.att.net>,
> Tony <tony@my.isp.net> wrote:
> >The issue I was alluding to is that all vendors do it differently,
> >hence there is no mangling standard, which means no object
> >code level compatibility.
>
> But object code compatibility != mangling alone
>
> - Greg







Author: "Ken Hagan" <K.Hagan@thermoteknix.co.uk>
Date: Wed, 20 Dec 2000 16:15:43 GMT
Raw View
"Greg Comeau" <comeau@panix.com> wrote in message
>
> But object code compatibility != mangling alone

"Tony" <tony@my.isp.net> wrote...
> It does for C-like compatibility, though, right?

'fraid not. The following examples relate to the Win32/x86 platform.

Function call sequences.
    What order are arguments pushed?
    Are they always passed on the stack?
    Does the caller or callee clean up afterwards?
    Which registers (if any) have to be preserved by functions?

Structure Alignment.
    Compilers may align structure members on 1-, 2-, 4-, 8- or 16-byte
    boundaries. There is a natural alignment that the CPU favours
    (which depends on the types within the structure), but you might
    want to choose a different alignment that also suits other
    kinds of CPU, in the name of portability. (Once upon a time,
    Win32 ran on Alphas, MIPS, PowerPC and Intel 860. We're down
    to just x86 now, but MS promise Itanium "real soon now".)
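
To make the alignment point concrete, here is a small sketch comparing the compiler's default layout with a 1-byte packed layout. `#pragma pack` is a widely supported extension (MSVC, GCC, Clang) rather than standard C++, and the struct names are invented for the example.

```cpp
#include <cstddef>   // offsetof, std::size_t
#include <cstdint>

// Default layout: the compiler usually pads after `tag` so that `value`
// lands on its natural boundary.
struct Natural {
    char          tag;
    std::uint32_t value;
};

// Same members with 1-byte packing requested.
#pragma pack(push, 1)
struct Packed {
    char          tag;
    std::uint32_t value;
};
#pragma pack(pop)

// Number of padding bytes the compiler added to the default layout.
constexpr std::size_t natural_padding() {
    return sizeof(Natural) - sizeof(Packed);
}
```

Two object files that disagree about which of these layouts a shared header describes will link without complaint and then read `value` from different offsets.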

Types.
    I am assured that Borland supports "long double" as an 80-bit
    type, which is natural for the CPU, but I know that Microsoft
    make it an alias of plain 64-bit "double". Again, this is done
    in the name of portability to other CPUs that don't offer the
    wider type.
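
A quick way to see which choice a given compiler made is to compare the mantissa widths it reports. This is a sketch only; the names are illustrative. Even when the two types share a representation, `long double` is still a distinct type to the language, so it also mangles differently.

```cpp
#include <limits>

// Mantissa bits of each floating type as this compiler defines them.
constexpr int double_digits      = std::numeric_limits<double>::digits;
constexpr int long_double_digits = std::numeric_limits<long double>::digits;

// True when long double is genuinely wider than double (for example the
// 80-bit x87 format); false when both use the same representation.
constexpr bool long_double_is_wider() {
    return long_double_digits > double_digits;
}
```

A function taking `long double` compiled by one vendor cannot safely be called from code compiled by another if the two disagree on this answer.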

On other platforms, with less reason to follow the standards laid
down by a large OS, vendors might differ on the width of integral
types. Functions returning bool, for example, might return the
value in the carry flag. (This was the standard practice in the days
when "real" x86 programs were written in assembly language.)

You also need to provide a single implementation of the standard
library so that all object modules use the same heap, the same
setjmp/longjmp and for the benefit of those library functions that
use internal variables with static duration.






Author: remove.haberg@matematik.su.se (Hans Aberg)
Date: Tue, 12 Dec 2000 21:00:52 GMT
Raw View
In article <B4sZ5.7$_L2.253@burlma1-snr2>, Barry Margolin
<barmar@genuity.net> wrote:
>In article <05d701c0625c$096178b0$0500a8c0@dragonsys.com>,
>David Abrahams <abrahams@mediaone.net> wrote:
>>> No, a C++ ABI  specification was completed earlier this year, see
>>>
>>> http://reality.sgi.com/dehnert_engr/cxx/abi.html
...
>>My impression was that this ABI would only be used for a specific processor
>>architecture, namely the Intel chip formerly known as "Merced". Was I
>>mistaken? I hope so!
>
>By definition, an ABI is specific to a particular processor, since it
>specifies how the object code should be generated.  Since it makes no sense
>to link together object files generated for different processors, there's
>no need for an ABI to be cross-platform.  Usually an ABI is specific to a
>particular combination of processor and OS, since it also specifies how the
>program makes use of OS services (e.g. how to trap into the kernel).

So one should probably look for an ABCI, "application byte-code
interface", then. Or an ABI restricted to the interfacing between the
functions, so that one can link together object code for different
processors, if one only makes sure each code fragment is run on a
processor of the right type.

  Hans Aberg      * Anti-spam: remove "remove." from email address.
                  * Email: Hans Aberg <remove.haberg@member.ams.org>
                  * Home Page: <http://www.matematik.su.se/~haberg/>
                  * AMS member listing: <http://www.ams.org/cml/>






Author: kuehl@ramsen.informatik.uni-konstanz.de (Dietmar Kuehl)
Date: Tue, 12 Dec 2000 21:03:33 GMT
Raw View
Hi,
Barry Margolin (barmar@genuity.net) wrote:
: In article <05d701c0625c$096178b0$0500a8c0@dragonsys.com>,
: David Abrahams <abrahams@mediaone.net> wrote:
: >My impression was that this ABI would only be used for a specific processor
: >architecture, namely the Intel chip formerly known as "Merced". Was I
: >mistaken? I hope so!

Basically the same conventions for name mangling, exception handling,
object layout, calling conventions, you name it, can be used on other
platforms, too.  There are some details which have to be fixed for
other platforms but basically the ABI can be carried over to other
platforms, too.

: By definition, an ABI is specific to a particular processor, since it
: specifies how the object code should be generated.  Since it makes no sense
: to link together object files generated for different processors, there's
: no need for an ABI to be cross-platform.  Usually an ABI is specific to a
: particular combination of processor and OS, since it also specifies how the
: program makes use of OS services (e.g. how to trap into the kernel).

Sure, the object files and certain details of the ABI will be platform
specific but the basic approach does not have to be changed. I think
the idea of the SGI ABI is to be in some sense portable across
different platforms very much like ELF is available on different
platforms. Potentially the ABI applies only to POSIX systems and does
not carry over to non-POSIX systems. Who cares? :-)
--
<mailto:dietmar_kuehl@yahoo.de> <http://www.dietmar-kuehl.de/~kuehl/>
Phaidros eaSE - Easy Software Engineering: <http://www.phaidros.com/>






Author: John Nagle <nagle@animats.com>
Date: Wed, 13 Dec 2000 14:22:19 GMT
Raw View
David Abrahams wrote:
>
> "Martin von Loewis" <loewis@informatik.hu-berlin.de> wrote in message
> news:p6qk899am56.fsf@informatik.hu-berlin.de...
>
> > No, a C++ ABI  specification was completed earlier this year, see
> >
> > http://reality.sgi.com/dehnert_engr/cxx/abi.html
> >
> > specifically section 5.1. That specification will be implemented in
> > g++ 3, and likely in compilers of other companies that participated in
> > drafting the specification (e.g. EDG, HP, SGI).

   That's a sad spec.  They went to all the trouble to define a
new object format requiring a new linker, specifically for C++,
yet didn't solve any of the problems left over from C++ implementations
on dumb legacy linkers.

   Specifically:

 If a new linkage mechanism is being defined, why is name
 mangling still being used at all?  There should be a binary
 representation for type data, and the user should see
 diagnostics that look like C++ declarations.

 This spec still calls for generating vtables at compile
 time, which leads to pulling in unreachable member functions.
 A linker designed for C++ should generate vtables at link
 time, with entries only for reachable functions.

 There's no support for folding duplicate instantiations of
 template functions.  Duplicate functions (defined as
 byte-for-byte object code duplicates) should be merged,
 and sufficient information should be preserved that
 debuggers are not confused by this.

   There's an opportunity here to fix some of the linker-related
dirty laundry in C++, but it's being missed.

     John Nagle
     Animats






Author: James.Kanze@dresdner-bank.com
Date: Wed, 13 Dec 2000 16:20:55 GMT
Raw View
In article <90u7vn$h0q$4@news.BelWue.DE>,
  dietmar_kuehl@yahoo.com wrote:

> Compiler vendors decided that they first want a C++ standard and
> some implementation experience before going into the trouble of
> defining a C++ ABI. Now that there is a C++ standard, companies have
> at least started to work on ABIs: SGI defined an ABI for their
> systems (I think I stumbled on this when searching around on the
> SGI/STL site) and basically the resulting ABI is apparently adopted
> with some platform specific changes by other systems, too. I don't
> know, however, whether it is really used in any compilers...

Sun has mentioned several times here that they want to define a
common ABI for all Sparc Solaris compilers.  Intel et al. are defining
a common ABI for all IA-64 based Unix, e.g. SGI's, HP-UX, AIX and
Linux.  Microsoft has patented parts of its ABI in order to make a
common ABI illegal under Windows.

That covers most of the big players, except the Compaq Alpha and the
IBM S390, which I don't know about.

For the most part, such ABI only concern the future, however.  There
seems to be very little interest in fixing a common ABI on existing
systems, possibly because of problems of backward compatibility.

--
James Kanze                               mailto:kanze@gabi-soft.de
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627








Author: "Tony" <tony@my.isp.net>
Date: Thu, 14 Dec 2000 20:12:25 GMT
Raw View
"Mike Dimmick" <mike@dimmick.demon.co.uk> wrote in message
news:976316259.26113.2.nnrp-14.d4e5bde1@news.demon.co.uk...
>
> "Tony" <tony@my.isp.net> wrote in message
> news:Zr8Y5.10280$2P3.737629@bgtnsc06-news.ops.worldnet.att.net...
> > It appears that C++ without the 'extern "C"' is severely crippled for
> > making cross-tool object compatible modules. Is the trend away from ANY
> > kind of compatibility beyond the source code in favor for higher level
> > mechanisms like COM and CORBA? Can someone here put the issues in
> > proper perspective?
>
> See for example Gnat ADA's 'pragma C++' directive.  So long as other
> compilers can understand the name mangling technique used by a C++ they
> are compatible with, such as in this case g++, there is no problem.

You seem to have missed my thought completely. The issue I was alluding
to is that all vendors do it differently, hence there is no mangling
standard, which means no object code level compatibility.

Tony







Author: remove.haberg@matematik.su.se (Hans Aberg)
Date: Thu, 14 Dec 2000 20:12:59 GMT
Raw View
In article <dillmtlpmcl.fsf@isolde.research.att.com>, Matthew Austern
<austern@research.att.com> wrote:
>There's one other way in which it's possible to be a little bit
>more ambitious than what Barry is suggesting: it's possible to
>have an ABI specification in which some parts are processor-
>specific but other parts are processor independent.  In some sense,
>what this means is that the ABI specification wouldn't define an
>ABI, but rather a family of ABIs, one for each processor that it
>covers.  You still won't get link compatibility between object
>files compiled for different architectures, but the layout will
>be the same (or, at least, will differ in well defined ways).

The C++ problem with not being able to know at all what the underlying
binary structures are shows up in implementing various dynamic structures:

For example, one can implement a polymorphic variable as a data class
holding a pointer to an object of a class in a class hierarchy; this is
a "boxed element" because of the pointer. To speed things up a little,
one can let the data class hold a union in which elements of small
classes are stored in place. The problem is that this cannot be used in
connection with multiple inheritance, because C++ does not specify where
the pointer to an object actually points: it may point somewhere in the
middle. Thus, if one has a fixed memory location, destroys one element,
and wants to write another over it, the write may in reality land
outside the allowed box.
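
The "pointer into the middle" effect can be observed directly. The sketch below uses made-up types `Left`, `Right` and `Both`. The actual offsets are chosen by the compiler and are not specified by the language, which is the complaint being made here.

```cpp
#include <cassert>
#include <cstddef>

struct Left  { int l = 1; };
struct Right { int r = 2; };
struct Both : Left, Right { int b = 3; };

// Byte offset of the Base subobject inside a given Both object.
template <class Base>
std::ptrdiff_t offset_in(const Both& obj) {
    const char* whole = reinterpret_cast<const char*>(&obj);
    const char* part  =
        reinterpret_cast<const char*>(static_cast<const Base*>(&obj));
    return part - whole;
}

// static_cast undoes the adjustment; copying the raw address
// (for example through void*) would not.
inline Both* recover(Right* r) {
    assert(r != nullptr);
    return static_cast<Both*>(r);
}
```

Since both bases occupy storage, at least one of them must sit at a nonzero offset, so a `Right*` or a `Left*` taken from a `Both` does not mark the start of the storage that a fixed-size box would have to cover.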

Another similar problem is that C++ does not give the programmer any
control over how bitfields are implemented. So it is not possible to
use bitfields to produce binary structures that are exported.
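
A common workaround, sketched here with a hypothetical 16-bit header format, is to skip bitfields for exported data and place the bits with explicit shifts and masks, so that the layout within the integer is fixed by the source code rather than by the compiler.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical format: bits 15..12 hold `kind`, bits 11..0 hold `length`.
inline std::uint16_t pack_header(unsigned kind, unsigned length) {
    assert(kind <= 0xFu && length <= 0xFFFu);   // values must fit their fields
    return static_cast<std::uint16_t>((kind << 12) | length);
}

inline unsigned header_kind(std::uint16_t h)   { return (h >> 12) & 0xFu; }
inline unsigned header_length(std::uint16_t h) { return h & 0xFFFu; }
```

This fixes the bit positions but not the byte order of the integer in memory, which still has to be agreed separately when the structure leaves the process.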

So I think there are things that should somehow be added to C++ that
allow one to have control over these binary structures -- be it a
cross-platform ABI or additions to C++ itself.

  Hans Aberg      * Anti-spam: remove "remove." from email address.
                  * Email: Hans Aberg <remove.haberg@member.ams.org>
                  * Home Page: <http://www.matematik.su.se/~haberg/>
                  * AMS member listing: <http://www.ams.org/cml/>






Author: Martin von Loewis <loewis@informatik.hu-berlin.de>
Date: Fri, 15 Dec 2000 14:08:01 GMT
Raw View
kuehl@ramsen.informatik.uni-konstanz.de (Dietmar Kuehl) writes:

> Basically the same conventions for name mangling, exception handling,
> object layout, calling conventions, you name it, can be used on other
> platforms, too.

That's true for name mangling and object layout. For calling
conventions (and sizes of primitive types), the C++ ABI defers to the
C ABI (as it does for the object file format).

> There are some details which have to be fixed for other platforms
> but basically the ABI can be carried over to other platforms, too.

Almost, by virtue of using the C ABI for the really processor-specific
parts.

Unfortunately, there is one aspect that is really processor-specific
but not covered in most C ABIs, which is exception handling. The
tricky part here is the unwinding process, which needs to operate on
well-established data structures.

> Sure, the object files and certain details of the ABI will be
> platform specific but the basic approach does not have to be
> changed.

The C++ ABI requires a number of features from the object files, like
support for initializer lists, and .hidden symbols. If the platform's
object file format does not support those features, then adopting the
ABI may become non-trivial. On an ELF system, things look pretty good,
though.

Regards,
Martin






Author: Martin von Loewis <loewis@informatik.hu-berlin.de>
Date: Fri, 15 Dec 2000 14:08:15 GMT
Raw View
John Nagle <nagle@animats.com> writes:

>    That's a sad spec.  They went to all the trouble to define a new
> object format requiring a new linker, specifically for C++, yet
> didn't solve any of the problems left over from C++ implementations
> on dumb legacy linkers.

Why do you think so? The object file format is standard ELF, as
defined by the SysV Generic ABI (gABI).

Regards,
Martin






Author: comeau@panix.com (Greg Comeau)
Date: Fri, 15 Dec 2000 15:33:23 GMT
Raw View
In article <yN9_5.2134$pi2.147536@bgtnsc07-news.ops.worldnet.att.net>,
Tony <tony@my.isp.net> wrote:
>The issue I was alluding to is that all vendors do it differently,
>hence there is no mangling standard, which means no object
>code level compatibility.

But object code compatibility != mangling alone

- Greg
--
Comeau Computing / Comeau C/C++ "so close" 4.2.44 betas NOW AVAILABLE
TRY Comeau C++ ONLINE at http://www.comeaucomputing.com/tryitout
Email: comeau@comeaucomputing.com / WEB: http://www.comeaucomputing.com






Author: kuehl@ramsen.informatik.uni-konstanz.de (Dietmar Kuehl)
Date: Tue, 12 Dec 2000 14:37:19 GMT
Raw View

Hi,
Tony (tony@my.isp.net) wrote:
: It appears that C++ without the 'extern "C"' is severely crippled for making
: cross-tool
: object compatible modules. Is the trend away from ANY kind of compatibility
: beyond
: the source code in favor for higher level mechanisms like COM and CORBA? Can
: someone here put the issues in proper perspective?

Isn't this in the FAQ? Anyway: The standardization committee decided
that the issue of binary compatibility between compilers (or their
versions) is subject to platform rules. Thus, it was deliberately left
open, with name mangling being encouraged to use compiler-specific
names to prevent accidental link successes: The problems of
binary compatibility go far beyond the mere naming of the functions.
Things like the "virtual function table" layout, if there is such a
thing at all, dealing with exceptions and RTTI, variations in library
implementations, class layout, etc. are much more important issues.

Compiler vendors decided that they first want a C++ standard and some
implementation experience before going into the trouble of defining a
C++ ABI. Now that there is a C++ standard, companies have at least
started to work on ABIs: SGI defined an ABI for their systems (I
think I stumbled on this when searching around on the SGI/STL site) and
basically the resulting ABI is apparently adopted with some platform
specific changes by other systems, too. I don't know, however, whether
it is really used in any compilers...
--
<mailto:dietmar_kuehl@yahoo.de> <http://www.dietmar-kuehl.de/~kuehl/>
Phaidros eaSE - Easy Software Engineering: <http://www.phaidros.com/>






Author: "David Abrahams" <abrahams@mediaone.net>
Date: Tue, 12 Dec 2000 14:38:11 GMT
Raw View
"Martin von Loewis" <loewis@informatik.hu-berlin.de> wrote in message
news:p6qk899am56.fsf@informatik.hu-berlin.de...

> No, a C++ ABI  specification was completed earlier this year, see
>
> http://reality.sgi.com/dehnert_engr/cxx/abi.html
>
> specifically section 5.1. That specification will be implemented in
> g++ 3, and likely in compilers of other companies that participated in
> drafting the specification (e.g. EDG, HP, SGI).

My impression was that this ABI would only be used for a specific processor
architecture, namely the Intel chip formerly known as "Merced". Was I
mistaken? I hope so!

-Dave







Author: James.Kanze@dresdner-bank.com
Date: Tue, 12 Dec 2000 14:40:01 GMT
Raw View
In article <976316259.26113.2.nnrp-14.d4e5bde1@news.demon.co.uk>,
  "Mike Dimmick" <mike@dimmick.demon.co.uk> wrote:

> "Tony" <tony@my.isp.net> wrote in message
> news:Zr8Y5.10280$2P3.737629@bgtnsc06-news.ops.worldnet.att.net...
> > It appears that C++ without the 'extern "C"' is severely crippled
> > for making cross-tool object compatible modules. Is the trend away
> > from ANY kind of compatibility beyond the source code in favor for
> > higher level mechanisms like COM and CORBA? Can someone here put
> > the issues in proper perspective?

> See for example Gnat ADA's 'pragma C++' directive. So long as other
> compilers can understand the name mangling technique used by a C++
> they are compatible with, such as in this case g++, there is no
> problem. The mechanism for C / Fortran (often the C linkage
> convention was made the same as the Fortran convention on a
> particular platform, mainly because Fortran was implemented first)
> happens to be used a lot because it's simple. The linkage convention
> required by C++ generally looks odd because

> a) it has to be compatible with C / Fortran linkers

> b) it has to encode all the type information of the function,
> including classes, namespaces, and templates.

> An example:

> ??0?$basic_filebuf@DU?$char_traits@D@std@@@std@@QAE@W4_Uninitialized@1@@Z
> is Microsoft's link name for

> std::basic_filebuf<char,struct std::char_traits<char>
> >::basic_filebuf<char,struct std::char_traits<char> >(enum
> basic_filebuf<char,struct std::char_traits<char> >::_Uninitialized)

> (I chose that at random from MSVCP60.DLL, one of the files which
> implements the standard version of iostreams). Annoyingly, for tools
> vendors, Microsoft chooses not to document their C++ name mangling
> convention.

> I believe that g++ generates names of the form used originally by
> Stroustrup as described in the ARM, where the function name comes
> first, followed by '__', then an F, then the types.

It's similar, but there are small differences.  In general, early
compilers intentionally name mangled differently.  Since the layouts
weren't compatible, it was felt preferable to get a link error than
to have subtle semantic errors in the program due to calling the wrong
function, etc.
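
The ARM-style scheme quoted above (function name, then '__', then an F, then the types) can be sketched as a toy encoder. The single-letter type codes and the 'v' for an empty parameter list are illustrative assumptions; real compilers encode far more (classes, namespaces, qualifiers, templates), and each one varies the details, which is the point being made here.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Param { Int, Char, Double };

// Illustrative one-letter code for each parameter type.
inline char code_for(Param p) {
    switch (p) {
        case Param::Int:    return 'i';
        case Param::Char:   return 'c';
        case Param::Double: return 'd';
    }
    return '?';  // not reached for the enumerators above
}

// Builds <name> "__F" <type codes>; an empty list is written as 'v'.
std::string toy_mangle(const std::string& name,
                       const std::vector<Param>& params) {
    assert(!name.empty());
    std::string out = name + "__F";
    if (params.empty()) {
        out += 'v';
    }
    for (Param p : params) {
        out += code_for(p);
    }
    return out;
}
```

Two compilers that differ in even one of these codes produce names that fail to link against each other, which is the deliberate safety net described above.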

> I won't go into
> the scheme fully. Many other compilers that were originally derived
> from AT&T's Cfront use the same technique. I do note, however, that
> it uses legal identifier names, so there is a chance that the user's
> chosen names will inadvertently clash with the link names of some
> library with which they wish to link. Not a great one, admittedly,
> but the chance is there.

They don't use legal user names.  All of the mangled names with CFront
or g++ contain two consecutive underscores; no user symbol may contain
two consecutive underscores.  And it is up to the implementer to ensure
that there can be no clashes with the implementation library.

> Constraint a) was chosen so that a port of C++ to a new platform
> would be easy; it could be done without requiring a new linker to be
> written. Constraint b) is required in the face of overloading.

> The problem is, of course, that C is not an object-oriented language
> and therefore doesn't understand the existence of, specifically,
> member functions. Nor does it have type-checked linkage, because it
> doesn't have function overloading. Nor does it have namespaces.
> Ditto Fortran.[3]

> Standard C++ does not restrict the form of linkage conventions.  Nor
> the number.  Microsoft's C++ compiler (for x86) supports four
> calling conventions, each of which is decorated in its link
> information in a different way:

> __cdecl    For functions which use '...' in a parameter [1].
>            In this convention, link names are prefixed with a _

> __stdcall  The 'standard' calling convention.  Link names prefixed
>            with _ and suffixed with an @ and the size of the arguments.

> __fastcall First two arguments passed in registers ECX, EDX
>            respectively.  Link names prefixed with @ and suffixed as
>            per __stdcall.

> __thiscall Used for class member functions.  Full C++ decoration.
>            Allows use of '...'  'this' is passed in ECX.
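
The decoration rules listed above for the first three conventions can be written down as a small function. This is a sketch of the rules as stated in the post for 32-bit x86, not a reimplementation of any tool; `decorate` and `Convention` are hypothetical names, and `__thiscall` is left out because it uses the full C++ decoration instead.

```cpp
#include <cassert>
#include <string>

enum class Convention { Cdecl, Stdcall, Fastcall };

// Builds the link name for `name`; `arg_bytes` is the total size of the
// arguments in bytes and is ignored for Cdecl.
std::string decorate(const std::string& name, Convention cc,
                     unsigned arg_bytes) {
    assert(!name.empty());
    switch (cc) {
        case Convention::Cdecl:
            return "_" + name;
        case Convention::Stdcall:
            return "_" + name + "@" + std::to_string(arg_bytes);
        case Convention::Fastcall:
            return "@" + name + "@" + std::to_string(arg_bytes);
    }
    return name;  // not reached for the enumerators above
}
```

Because the convention is visible in the link name, a caller and callee that disagree about it fail at link time rather than corrupting the stack at run time.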

> The name exported can also be adjusted using the EXPORTS file with
> the linker on the Win32 platforms.

> I would prefer it had these link/call conversions been specified as
> 'extern "cdecl"' etc, but that would have required a change in the
> method used by Microsoft's C compiler.  Because header files are
> shared, the link conventions must be the same. [2]

> The only reasonable solution to this problem is to break out of the
> old smart compiler, dumb linker cycle (that is
> compile-to-object-code, link-object-code).  This will probably come
> as people try to implement template export, a feature which
> practically mandates either re-running the compiler, or leaving
> object code generation to a post-link stage.

Rerunning the compiler used to be the standard procedure for
instantiating templates anyway.

> For example, the
> traditional method is:

> [source code] -> compiler -> [object code] -> linker -> [executable]
>     -> execute

> (there may be dynamic link stages post-execution on the user's
> computer, but that is ignored)

> In the main, though, compilers are written two-pass - once to build
> a syntax tree, once to produce the object code.  This facilitates
> portability of the compiler itself; to port to a new platform, the
> code generator must be rewritten but the remainder of the compiler
> can stay intact.

Most compilers will also optionally insert a global optimization pass
between the two passes, and a peephole optimizer after the code
generation phase.  At least one compiler adds a final optimization
phase after link (and after profiling data is available) in order to
inline across module boundaries, etc.

> One method of template export compilation, with minor changes to
> existing tools, would be:

> [source code] -> compiler -> [object code with template source embedded]
>     -> linker -> [template source] -> compiler -> linker -> [executable]
>     -> execute

> It would probably, however, be more sensible to give the linker far
> more smarts, or defer code generation to link time, that is:

> [source code] -> compiler -> [intermediate representation] -> linker ->
> [linked intermediate rep] -> code generator -> [executable] -> execute.

> Decoupling the code generator from the compiler front end is
> essentially much of what Java and .NET are all about.

True.  That certainly accounts for why Java is so slow to start up,
and thus unusable for short jobs like typical Unix filter programs.

Just systematically shifting final code generation off to the linker
will result in unacceptable link times -- each time I link, I have to
regenerate all of the code, even though I've only modified one line in
one function.  Some sort of real smarts will be needed, to cache the
generated code, and only regenerate the parts that have changed.

An additional problem is that templates require the instantiation
context, which must also be transmitted to the linker somehow.  And
the template must be reinstantiated any time anything changes in the
template definition OR the instantiation context.
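As a rough illustration of the caching idea, here is a minimal sketch in which generated code is keyed on both the template definition and the instantiation context, so that a change to either forces regeneration while an unchanged pair is reused.  The class name InstantiationCache, its members, and the stand-in "generated code" string are all hypothetical; a real linker would store object code and a far richer description of the context.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical cache of generated code.  The key covers everything the
// instantiation depends on: the template definition AND its context.
class InstantiationCache {
public:
    // Returns the code for this (definition, context) pair, running the
    // (pretend) code generator only when the pair has not been seen.
    const std::string& get(const std::string& definition,
                           const std::string& context) {
        const std::string key = definition + '\0' + context;
        auto it = cache_.find(key);
        if (it == cache_.end()) {
            ++regenerations_;  // stand-in for an expensive code-generation step
            it = cache_.emplace(key, "<code for " + context + ">").first;
        }
        return it->second;
    }

    // Number of times code actually had to be regenerated.
    int regenerations() const { return regenerations_; }

private:
    std::unordered_map<std::string, std::string> cache_;
    int regenerations_ = 0;
};
```

Asking twice for the same definition and context regenerates once; changing only the context string regenerates again, which is the behaviour described above.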

--
James Kanze                               mailto:kanze@gabi-soft.de
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627



---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]
[ Note that the FAQ URL has changed!  Please update your bookmarks.     ]





Author: Barry Margolin <barmar@genuity.net>
Date: Tue, 12 Dec 2000 16:37:05 GMT
Raw View
In article <05d701c0625c$096178b0$0500a8c0@dragonsys.com>,
David Abrahams <abrahams@mediaone.net> wrote:
>"Martin von Loewis" <loewis@informatik.hu-berlin.de> wrote in message
>news:p6qk899am56.fsf@informatik.hu-berlin.de...
>
>> No, a C++ ABI  specification was completed earlier this year, see
>>
>> http://reality.sgi.com/dehnert_engr/cxx/abi.html
>>
>> specifically section 5.1. That specification will be implemented in
>> g++ 3, and likely in compilers of other companies that participated in
>> drafting the specification (e.g. EDG, HP, SGI).
>
>My impression was that this ABI would only be used for a specific processor
>architecture, namely the Intel chip formerly known as "Merced". Was I
>mistaken? I hope so!

By definition, an ABI is specific to a particular processor, since it
specifies how the object code should be generated.  Since it makes no sense
to link together object files generated for different processors, there's
no need for an ABI to be cross-platform.  Usually an ABI is specific to a
particular combination of processor and OS, since it also specifies how the
program makes use of OS services (e.g. how to trap into the kernel).

--
Barry Margolin, barmar@genuity.net
Genuity, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.






Author: "Tony" <tony@my.isp.net>
Date: Fri, 8 Dec 2000 17:06:27 GMT
Raw View
It appears that C++ without the 'extern "C"' is severely crippled for making
cross-tool object compatible modules. Is the trend away from ANY kind of
compatibility beyond the source code in favor of higher level mechanisms
like COM and CORBA? Can someone here put the issues in proper perspective?

Tony






Author: "Mike Dimmick" <mike@dimmick.demon.co.uk>
Date: Sun, 10 Dec 2000 00:10:51 GMT
Raw View
"Tony" <tony@my.isp.net> wrote in message
news:Zr8Y5.10280$2P3.737629@bgtnsc06-news.ops.worldnet.att.net...
> It appears that C++ without the 'extern "C"' is severely crippled for
> making cross-tool object compatible modules. Is the trend away from ANY
> kind of compatibility beyond the source code in favor of higher level
> mechanisms like COM and CORBA? Can someone here put the issues in proper
> perspective?

See for example GNAT Ada's 'pragma C++' directive.  So long as other
compilers can understand the name mangling technique used by the C++
compiler they are compatible with, such as in this case g++, there is no
problem.  The
mechanism for C / Fortran (often the C linkage convention was made the same
as the Fortran convention on a particular platform, mainly because Fortran
was implemented first) happens to be used a lot because it's simple.  The
linkage convention required by C++ generally looks odd because

a) it has to be compatible with C / Fortran linkers

b) it has to encode all the type information of the function, including
classes, namespaces, and templates.

An example:
??0?$basic_filebuf@DU?$char_traits@D@std@@@std@@QAE@W4_Uninitialized@1@@Z
is Microsoft's link name for

std::basic_filebuf<char,struct std::char_traits<char> >
    ::basic_filebuf<char,struct std::char_traits<char> >(
        enum basic_filebuf<char,struct std::char_traits<char> >::_Uninitialized)

(I chose that at random from MSVCP60.DLL, one of the files which implements
the standard version of iostreams).  Annoyingly, for tools vendors,
Microsoft chooses not to document their C++ name mangling convention.

I believe that g++ generates names of the form used originally by Stroustrup
as described in the ARM, where the function name comes first, followed by
'__', then an F, then the types.  I won't go into the scheme fully.  Many
other compilers that were originally derived from AT&T's Cfront use the same
technique.  I do note, however, that it uses legal identifier names, so
there is a chance that the user's chosen names will inadvertently clash with
the link names of some library with which they wish to link.  Not a great
one, admittedly, but the chance is there.
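To make the shape of such names concrete, here is a minimal sketch of an ARM-style encoder for global functions taking only built-in types.  The helper name mangle_arm is hypothetical, the letter codes are the ones I recall from the ARM, and everything else in the real scheme (classes, pointers, const, templates) is deliberately left out.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: encodes a global function in the ARM style,
// i.e. name + "__F" + one code letter per parameter type.
// Only a handful of built-in types are handled in this sketch.
std::string mangle_arm(const std::string& name,
                       const std::vector<std::string>& params) {
    static const std::map<std::string, char> codes = {
        {"void", 'v'}, {"char", 'c'}, {"short", 's'}, {"int", 'i'},
        {"long", 'l'}, {"float", 'f'}, {"double", 'd'},
    };
    std::string out = name + "__F";
    if (params.empty()) {
        out += 'v';  // an empty parameter list is encoded as void
        return out;
    }
    for (const std::string& p : params) {
        auto it = codes.find(p);
        if (it == codes.end())
            throw std::invalid_argument("unsupported type: " + p);
        out += it->second;
    }
    return out;
}
```

Under these assumptions f(int, char) becomes f__Fic.  Since f__Fic is itself a legal identifier, a user could declare an 'extern "C"' function with exactly that name, which is the clash mentioned above.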

Constraint a) was chosen so that a port of C++ to a new platform would be
easy; it could be done without requiring a new linker to be written.
Constraint b) is required in the face of overloading.

The problem is, of course, that C is not an object-oriented language and
therefore doesn't understand the existence of, specifically, member
functions.  Nor does it have type-checked linkage, because it doesn't have
function overloading.  Nor does it have namespaces.  Ditto Fortran.[3]

Standard C++ does not restrict the form of linkage conventions.  Nor the
number.  Microsoft's C++ compiler (for x86) supports four calling
conventions, each of which is decorated differently in its link
information:

__cdecl    For functions which use '...' in a parameter [1].
           In this convention, link names are prefixed with a _

__stdcall  The 'standard' calling convention.  Link names prefixed with _
           and suffixed with an @ and the size of the arguments.

__fastcall First two arguments passed in registers ECX, EDX respectively.
           Link names prefixed with @ and suffixed as per __stdcall.

__thiscall Used for class member functions.  Full C++ decoration.
           Allows use of '...'  'this' is passed in ECX.

The name exported can also be adjusted using the EXPORTS file with the
linker on the Win32 platforms.
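The decoration rules for the first three conventions can be sketched as a small function.  This is only an illustration for C-linkage names on 32-bit x86; the names CallConv and decorate are hypothetical, and __thiscall is omitted because it uses the full (undocumented) C++ decoration.

```cpp
#include <cstddef>
#include <string>

// The three conventions whose decoration is described above.
enum class CallConv { Cdecl, Stdcall, Fastcall };

// Hypothetical helper: builds the link name for a C-linkage function from
// its source name and the total size of its arguments in bytes.
std::string decorate(CallConv cc, const std::string& name,
                     std::size_t arg_bytes) {
    switch (cc) {
    case CallConv::Cdecl:
        return "_" + name;                                    // _name
    case CallConv::Stdcall:
        return "_" + name + "@" + std::to_string(arg_bytes);  // _name@N
    case CallConv::Fastcall:
        return "@" + name + "@" + std::to_string(arg_bytes);  // @name@N
    }
    return name;  // not reached for valid enumerators
}
```

For a function foo taking two 4-byte arguments, this gives _foo, _foo@8 and @foo@8 respectively.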

I would prefer it had these link/call conventions been specified as 'extern
"cdecl"' etc, but that would have required a change in the method used by
Microsoft's C compiler.  Because header files are shared, the link
conventions must be the same. [2]

The only reasonable solution to this problem is to break out of the old
smart compiler, dumb linker cycle (that is compile-to-object-code,
link-object-code).  This will probably come as people try to implement
template export, a feature which practically mandates either re-running the
compiler, or leaving object code generation to a post-link stage.  For
example, the traditional method is:

[source code] -> compiler -> [object code] -> linker -> [executable] ->
execute

(there may be dynamic link stages post-execution on the user's computer, but
that is ignored)

In the main, though, compilers are written two-pass - once to build a syntax
tree, once to produce the object code.  This facilitates portability of the
compiler itself; to port to a new platform, the code generator must be
rewritten but the remainder of the compiler can stay intact.

One method of template export compilation, with minor changes to existing
tools, would be:

[source code] -> compiler -> [object code with template source embedded]
    -> linker -> [template source] -> compiler -> linker -> [executable]
    -> execute

It would probably, however, be more sensible to give the linker far more
smarts, or defer code generation to link time, that is:

[source code] -> compiler -> [intermediate representation] -> linker ->
[linked intermediate rep] -> code generator -> [executable] -> execute.

Decoupling the code generator from the compiler front end is essentially
much of what Java and .NET are all about.

More discussion on this topic is probably more appropriate for
comp.compilers.

--
Mike Dimmick

[1] I know this is a generalisation; the compiler defaults to __cdecl.
However, I tend to use __stdcall because it produces smaller code.  Smaller
code = fewer pages = fewer page faults, in general.  It's only required to use
__cdecl when the called function cannot know how big its argument stack is.
That only happens in C++ when you use '...' in an argument list.

[2] The link conventions must be specified because the user may change the
default, as I often do (see [1]).  If the user does change the default, and
the headers for any pre-compiled library did not have the link conventions
specified, the link would fail.  It's not totally true that they must be the
same; the standard library's headers already include a 'namespace std {
extern "C" {' wrapper around the implementation for the C library headers.

[3] The latest versions of these languages may have these - I don't follow
Fortran, and the C99 standard (ISO/IEC 9899:1999) has only just been
released and is $18 for an electronic copy.  But I'm speaking of the
versions that most people mean when they refer to these languages, that is,
C89 (ANSI X3.159-1989, adopted by ISO as ISO/IEC 9899:1990) and Fortran 77.






Author: Martin von Loewis <loewis@informatik.hu-berlin.de>
Date: Sun, 10 Dec 2000 00:11:12 GMT
Raw View
"Tony" <tony@my.isp.net> writes:

> It appears that C++ without the 'extern "C"' is severely crippled
> for making cross-tool object compatible modules. Is the trend away
> from ANY kind of compatibility beyond the source code in favor of
> higher level mechanisms like COM and CORBA?

No, a C++ ABI  specification was completed earlier this year, see

http://reality.sgi.com/dehnert_engr/cxx/abi.html

specifically section 5.1. That specification will be implemented in
g++ 3, and likely in compilers of other companies that participated in
drafting the specification (e.g. EDG, HP, SGI).
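For a toolchain whose runtime follows this ABI, mangled names can be turned back into declarations programmatically.  The sketch below assumes a GCC- or Clang-style runtime that ships the non-standard <cxxabi.h> header with abi::__cxa_demangle; the wrapper name demangle is hypothetical, and it simply returns its input unchanged when demangling fails.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (GCC/Clang runtime, not standard C++)
#include <cstdlib>
#include <string>

// Hypothetical wrapper: returns the readable form of a mangled name,
// or the input unchanged if it cannot be demangled.
std::string demangle(const std::string& mangled) {
    int status = 0;
    // With null buffer arguments the result is malloc()ed by the runtime.
    char* text = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        std::free(text);  // free(nullptr) is a no-op
        return mangled;
    }
    std::string result(text);
    std::free(text);
    return result;
}
```

Under this scheme a global function foo(int, char) is encoded as _Z3fooic, so demangle("_Z3fooic") should give back "foo(int, char)" on such a toolchain.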

Regards,
Martin
