Topic: overhead of virtual functions (was C++ vs Java, Efficientcy and Concept ?)
Author: "Martijn Lievaart" <nobody@orion.nl>
Date: 1998/12/15 Raw View
[This gets a bit of-topic for this ng. let's either try to end this thread
quickly or redirect it to another newsgroup. It *is* interresting, that's
why I continue it now, even if it is a bit off-topic]
Larry Brasfield wrote in message ...
>
>Martijn Lievaart wrote in message <752pj2$mn4@news3.euro.net>...
>>
>>AllanW@my-dejanews.com wrote in message
<74p1ec$mkj$1@nnrp1.dejanews.com>...
>>>
>[snip]
>>>Quite a long time ago, Dr. Stroustroup and others published one
>>>method to keep the overhead from virtual functions very slight.
>>>On many computers the overhead is as little as one memory fetch.
>>>As far as I know, the technique he suggested is used in all C++
>>>compilers everywhere, with minor variations.
>>>
>>
>>I would say two fetches. One to get the vtable pointer from the object and
>>one to get the address of the virtual function.
>
>When a non-virtual function is called, the function address is
>usually inline in the instruction stream and must be fetched
>to be used. So the the "overhead", or "additional burden"
>of the virtual call relative to a plain call is just fetching the
>v-table pointer from the object, when counted in fetches.
>
I don't agree. For the vtable fetches, the offset in the vtable is embedded
in the instruction stream, this offsets the address fetch in your example.
It is true that this offset is mostly small and many architectures can
handle this more efficently, but in general the overhead is two fetches I
think.
But if we get down to this level, there is also instruction fetches. These
are handled very efficiently from cache, but virtual functions take more
instructions to execute, therefore even more fetches. And these fetches are
exactly as efficient as the addres of the direct jump fetch, or the vtable
offset fetch. Besides, the code gets bigger, so you get slightly worse cache
behaviour.
Let's take a look at some pseudo assembler. This will not resemble *any*
real machine, it is just an example.
jump xxxxxxxx
--> two fetches
versus:
; assume object pointer stored in r7, vptrs at offset 0 of object and not
yet loaded
load r1, [r7]
add r1, offset
load r1, [r1]
jump r1
--> Something like 7 fetches
(CISC may be more efficient here).
So there is a definate price to pay. But would *you* notice the overhead? Is
it a problem for the program *you* are working on *now*? Only profiling and
experimenting will tell, but most of the time it isn't.
>>A function pointer stored in a struct would only be one fetch, but this is
>>very space inefficient in general, so the compromise of one fetch extra
>>versus enormous space savings was chosen for all implementations I'm aware
>>off.
>>
>>Also, I heard that indirect function calls are slow on some architectures.
>
>I expect that is true of most modern CPU's. Instruction
>prefetch and speculative execution would have to be
>hindered when the instruction addresses appear to
>be computed. I think it is obvious that an inline call
>has but one, thoroughly predictable effect on what
>instructions will execute, while indirection thru two
>pointers is harder to predict, in general. (I suppose
>the v-table case could be optimized, so as to allow
>the usual prefetch and speculative execution.)
>
Look at the code above. The jump r1 is dependent on the instruction above
it. This naive implementation will create a horrible pipeline stall. Real
compilers will probably do better.
>>If this is true then the cost of making something virtual
>>that doesn't have to be can be even higher.
>
>Another case in support of measurement for
>performance questions.
>
Absolutely.
Martijn
--
My email address is intentionally invalid,
reply to mlievaar at orion in nl
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html ]