Topic: F90 vs C++/C, optimization question?
Author: wclodius@lanl.gov (William B. Clodius)
Date: 1995/10/26 Raw View
A few comments on comments on this question, mostly derived from David
Chase's remarks.
1. Legal Fortran code is allowed to have arguments aliased with one
another and global (common and module) variables. Legal code cannot have
the aliased variables be modified.
2. It should be noted that Fortran is not the only language where
optimizers are explicitly allowed to assume no aliasing. Ada, which is
generally considered to have some of the safest semantics of any commonly
used language, has the same semantics.
3. Having the semantics of the code well defined as written, does not mean
that the semantics are well defined for the user of the code. As a silly
example, a C routine written to reverse the order and interchange two
arrays of the same size may have a well defined result to the writter of
the code, but if a hypothetical (foolish) user attempts to use this code to
reverse the order of an array by using pointers to the array as both
arguments to the routine, the user on completion may find that the array is
not reversed, or the first half of the array is replaced by the second half
of the array and the second half is unchanged, or ..., depending on the
details of the algorithm. In this case information hiding, the goal of
most modern languages, bites the foolish user. Similar problems can and
will occur for more realistic examples. Every user of code, in any
language, should be extremely careful in using aliased arguments that can
be modified.
William B. Clodius Phone (505) 665-9370
Los Alamos Natl. Lab. NIS-1 FAX (505) 665-7395
PO Box 1663, MS-D466 Group Office (505) 667-2701
Los Alamos, NM 87545 email wclodius@lanl.gov
NIS-1: Nonproliferation and International Security Div./
Space and Atmospheric Physics Group
[ comp.std.c++ is moderated. Submission address: std-c++@ncar.ucar.edu.
Contact address: std-c++-request@ncar.ucar.edu. The moderation policy
is summarized in http://dogbert.lbl.gov/~matt/std-c++/policy.html. ]
Author: bill@amber.ssd.hcsc.com (Bill Leonard)
Date: 1995/10/27 Raw View
In article <46kuem$j45@manuel.anu.edu.au>, keagchon@mehta.anu.edu.au (Geoff Keating 9205156) writes:
> Hmmm. Suppose you claim that X[20][20] and Y[20] are not aliases of
> each other. What's wrong with:
>
> assert(&Y[20] <= &X[0][0] && X[20][20] >= Y[0]);
>
> (apart from the obvious fact that it's not portable)?
The fact that it's not portable. :-)
Seriously, this check is far too simplistic for most vectorizing and
parallelizing compilers. Most vectorizing compilers can easily deal with
arrays that overlap, as long as the compiler *knows* what the overlap
characteristics are. For example, if X[0] maps to Y[4], X[1] to Y[5], and
so on, then the compiler may be able to perform parallel operations in
4-element groups with no problem.
> In fact, couldn't a compiler take some loop:
>
> for (...)
> for (...)
> X[...] = Y[..] * X[...];
>
> and turn it into:
>
> if (X and Y are aliased)
> for (...)
> for (...)
> X[...] = Y[..] * X[...];
> else
> some_neat_vectorop(X,Y);
The whole problem is how to implement that if statement.
This discussion seems to me somewhat moot anyway. The main reason that
Fortran 90 added lots of array manipulation syntax and intrinsic functions
is to substantially reduce the necessity for compilers to figure out what
array operation is being done by a loop. Unless and until C or C++ add
such higher-level array operations, Fortran 90 is probably always going to
give better performance than C/C++ in the realm of array manipulation,
regardless of the aliasing issues.
--
Bill Leonard
Harris Computer Systems Corporation
2101 W. Cypress Creek Road
Fort Lauderdale, FL 33309
Bill.Leonard@mail.hcsc.com
These opinions and statements are my own and do not necessarily reflect the
opinions or positions of Harris Computer Systems Corporation.
------------------------------------------------------------------------------
I feel like I've been sent for and couldn't go, and weren't needed after I
got there.
------------------------------------------------------------------------------
---
[ comp.std.c++ is moderated. Submission address: std-c++@ncar.ucar.edu.
Contact address: std-c++-request@ncar.ucar.edu. The moderation policy
is summarized in http://dogbert.lbl.gov/~matt/std-c++/policy.html. ]
Author: wclodius@lanl.gov (William B. Clodius)
Date: 1995/10/30 Raw View
bill@amber.ssd.hcsc.com (Bill Leonard) said:
> This discussion seems to me somewhat moot anyway. The main reason that
> Fortran 90 added lots of array manipulation syntax and intrinsic functions
> is to substantially reduce the necessity for compilers to figure out what
> array operation is being done by a loop. Unless and until C or C++ add
> such higher-level array operations, Fortran 90 is probably always going to
> give better performance than C/C++ in the realm of array manipulation,
> regardless of the aliasing issues.
This was one of two reasons for adding the syntax and functions, the other
was to enable users to write more legible and maintainable code. It should
be noted, however, that while these additions on parallel machines are
often translated to efficient code, the type of analysis required to
provide efficient code is not typical of the optimization analysis commonly
performed on scalar or pipelined processors. As a result most Fortran 90
compilers have followed the following path
1. Version 1 implement what you think is correct, with virtually all F77
optimizations, but only the simplest translation of F90 features.
2. Version 1.1 (six months to 1 year later) correct all obvious incorrect
interpretations of the standard and corrections to the standard by the ANSI
committee. Optimize the instrinsic functions. Perhaps iterate on versions
1.2, etc.
3. Version 2.0 (one and a half to three years after Version 1) implement
what you think are correct optimizations to the F90 syntactal additions to
the standard.
Because most F90 compilers came out after late 93, very few F90 compilers
properly optimize array syntax, and some other F90 additions, e.g.,
recursion and POINTERs. Indeed, the earliest F90 compiler, because it
presently compiles with C as its intermediate language, still cannot fully
optimize many aspects of F90. The array syntax has been more efficient in
achieving its second goal then its first goal.
Relative performance (F90 vs. C and C++) at this time is typically
dominated by the aliasing issue. It is very difficult for a compiler to
know the overlap characteristics in the presence of aliasing.
I am under the impression that the C++ draft standard incorporates a
limited, 1-D only, version of array syntax. Has that been implemented in
any compliers?
William B. Clodius Phone (505) 665-9370
Los Alamos Natl. Lab. NIS-1 FAX (505) 665-7395
PO Box 1663, MS-D466 Group Office (505) 667-2701
Los Alamos, NM 87545 email wclodius@lanl.gov
NIS-1: Nonproliferation and International Security Div./
Space and Atmospheric Physics Group
[ comp.std.c++ is moderated. Submission address: std-c++@ncar.ucar.edu.
Contact address: std-c++-request@ncar.ucar.edu. The moderation policy
is summarized in http://dogbert.lbl.gov/~matt/std-c++/policy.html. ]
Author: chase@centerline.com (David Chase)
Date: 1995/10/19 Raw View
> In article <45o2n2$o48@gabi.gabi-soft.fr>, kanze@gabi-soft.fr (J. Kanze) writes:
> > There is a proposed extention to C which would do it, a sort of son of
> > noalias. I don't know any of the details, but I'm sure that a question
> > in comp.std.c would reach people who did, or who could point you in the
> > right direction.
In article dkd@hawk.hcsc.com, bill@amber.ssd.hcsc.com (Bill Leonard) writes:
> I continue to think such extensions are a bad idea. They basically force
> the compiler to trust the programmer, and historically that is not a
> reliable thing to do.
I'm not sure that I agree with this. I haven't been following the
proposals, but if the annotations appear in the function declaration
(like const) then I think it ok. The reason for this is that it
proclaims, loud and clear, to clients:
these arrays had better not overlap.
and
future versions of this code (i.e., recompilations) reserve the right
to access these array elements in a different order than they do now.
or (rephrased)
the order in which the elements of the arrays are referenced is an
implementation detail; clients should not rely on this.
After all, *I* am perfectly capable of rewriting matrix multiplication
in a pipelined, register-blocked, cache-blocked style, and any client
who relied on weird aliasing properties of my original subroutine will
find his code just as broken as if the compiler had rewritten it.
> ... The original programmer may have been careful to
> avoid aliasing, but what about the 50 other programmers who came later and
> made modifications to the program? Worse still, you may have a bug lurking
> there for decades until you happen to compile with a compiler that takes
> advantage of the noalias declaration *and* does it in such a way that the
> illegal aliasing changes the behavior of the program. Such bugs are
> extremely difficult to track down.
True, but why should that matter to users of C or C++ :-). If I wanted
reliable code, I'd be using Modula-3, or Ada, or something like that.
I have a hard time (as do some compilers, apparently) just making sense
of my syntax errors in C++. From my point of view, they've messed up
the language this much in the pursuit of efficiency, why stop now?
Note, too, that when this one wins, it wins big. Automatic
vectorization and parallelization and cache-access optimizations can
yield (I have observed, so have customers of former employers -- in one
case, a factor of 18) integer-factor performance increases. Ask
yourself what other current feature in C++ enables such a speed boost,
under any circumstances? Of the things that were left out (e.g.,
garbage collection) how did the perceived (better yet, measured) costs
compare with what is being given up here? What's the very worst
slowdown anyone using the Boehm-Weiser collector has ever observed in a real
application? I'll bet it's not even a factor of two.
> IMHO, the best answer is improved interprocedural analysis by compilers,
> because that is the safest and also the most optimal -- it detects aliasing
> when it is present, and detects lack of aliasing when it is absent (even if
> the programmer didn't know whether aliasing was present or not).
Virtual functions? Decent results (in pointerful languages like C) are just
now emerging from research land. C++, what with its virtual functions and
virtual base classes, is an additional can of worms. I'd love to be working
on this problem (I've worked on it in the past) but I wouldn't bet the farm
on it.
speaking for myself,
David Chase
---
[ comp.std.c++ is moderated. Submission address: std-c++@ncar.ucar.edu.
Contact address: std-c++-request@ncar.ucar.edu. The moderation policy
is summarized in http://dogbert.lbl.gov/~matt/std-c++/policy.html. ]
Author: vandevod@cs.rpi.edu (David Vandevoorde)
Date: 1995/10/21 Raw View
>>>>> "DC" == David Chase <chase@centerline.com> writes:
[ This is about the `restrict' keyword; see somewhere around:
http://www.lysator.liu.se/c/index.html
for details ]
DC> In article dkd@hawk.hcsc.com, bill@amber.ssd.hcsc.com (Bill Leonard) writes:
>> I continue to think such extensions are a bad idea. They basically force
>> the compiler to trust the programmer, and historically that is not a
>> reliable thing to do.
DC> I'm not sure that I agree with this. I haven't been following the
DC> proposals, but if the annotations appear in the function declaration
DC> (like const) then I think it ok. The reason for this is that it
They do, and (like const) they're not restricted to function parameters.
[...]
DC> Note, too, that when this one wins, it wins big. Automatic
DC> vectorization and parallelization and cache-access optimizations can
DC> yield (I have observed, so have customers of former employers -- in one
DC> case, a factor of 18) integer-factor performance increases. Ask
For this kind of improvement, you need quite specialized hardware and
applications (which happen to be important and the no. 1 target of the
added concept, I guess). OTOH, even for your regular workstation,
the improvement can be quite worthwile (say a few 10%).
[...]
>> IMHO, the best answer is improved interprocedural analysis by compilers,
>> because that is the safest and also the most optimal -- it detects aliasing
>> when it is present, and detects lack of aliasing when it is absent (even if
>> the programmer didn't know whether aliasing was present or not).
Although this type of techniques is improving continuously it is far
from sufficient. In particular, it usually fails when some source code
is unavailable.
[...]
Daveed
---
[ comp.std.c++ is moderated. Submission address: std-c++@ncar.ucar.edu.
Contact address: std-c++-request@ncar.ucar.edu. The moderation policy
is summarized in http://dogbert.lbl.gov/~matt/std-c++/policy.html. ]
Author: vandevod@cs.rpi.edu (David Vandevoorde)
Date: 1995/10/17 Raw View
>>>>> "BL" == Bill Leonard <bill@amber.ssd.hcsc.com> writes:
[...]
James Kanze:
>> There is a proposed extention to C which would do it, a sort of son of
>> noalias. I don't know any of the details, but I'm sure that a question
>> in comp.std.c would reach people who did, or who could point you in the
>> right direction.
BL> I continue to think such extensions are a bad idea. They basically force
BL> the compiler to trust the programmer, and historically that is not a
BL> reliable thing to do.
[...]
Note that James was not talking about `noalias', but about `restrict'.
I don't doubt restricted pointers can be misused or abused and hence
lead to trouble; so can many features of C and C++. I'm not sure I
totally with your view of history though: trusting the programmer is
often pragmatically essential.
Scientific computing in C++ could sure use the extra cycles...
Daveed
---
[ comp.std.c++ is moderated. Submission address: std-c++@ncar.ucar.edu.
Contact address: std-c++-request@ncar.ucar.edu. The moderation policy
is summarized in http://dogbert.lbl.gov/~matt/std-c++/policy.html. ]
Author: jarausch@igpm.rwth-aachen.de (Helmut Jarausch)
Date: 1995/10/17 Raw View
[Moderator's note: I deleted about 50 lines of quoted material
from this article. -fjh.]
bill@amber.ssd.hcsc.com (Bill Leonard) writes:
[...]
> IMHO, the best answer is improved interprocedural analysis by compilers,
> because that is the safest and also the most optimal -- it detects aliasing
> when it is present, and detects lack of aliasing when it is absent (even if
> the programmer didn't know whether aliasing was present or not).
This is all true, BUT there are and will ever be situations where it is
impossible for a machine to figure out if there is aliasing or not.
I remember a case where I had to tell the CRAY FORTRAN compiler (via
a special comment) to vectorize a loop. It took me and a collegue of
mine nearly an hour to prove that there is no aliasing. But on the other
hand it was crucial to get full speed in that loop.
If it is impossible or hard to get good performance with C++ it will
die out in the scientific computing world, just because there are very
few things missing. If you are afraid, you can let the compiler print
tons of warnings whenever such a construct is used. But there is need
for a PORTABLE construct and not just a vendor specific #pragma.
Helmut Jarausch
Institute of Technology
RWTH Aachen
Germany
---
[ comp.std.c++ is moderated. Submission address: std-c++@ncar.ucar.edu.
Contact address: std-c++-request@ncar.ucar.edu. The moderation policy
is summarized in http://dogbert.lbl.gov/~matt/std-c++/policy.html. ]