Topic: Multidimensional arrays
Author: shen@lrz-muenchen.de (Mok-Kong Shen)
Date: 1995/06/30
Hello,

I have just submitted via e-mail to the X3 Secretariat a comment on
the forthcoming C++ standard concerning multidimensional arrays.
Since I am convinced that the issue is essential, I am taking the
liberty of sending you a copy of my comment so that you can send
your own comments to the X3 Secretariat in case you have
better or differing ideas on that topic.  May I also remind you that
comments must be received by the Secretariat before
July 6 and that a signed hardcopy is also required.

Sincerely yours,
M. K. Shen


------


Comment on ISO/IEC CD 14882
---------------------------

Subject:  Multidimensional Arrays (8.3.4)

Abstract:  The C++ multidimensional arrays are inferior to those of e.g.
Fortran and thus need to be improved for the language to gain wider
acceptance in the fields of engineering and scientific numerical
computations hitherto absolutely dominated by Fortran.  It is
suggested that a new data type be added to the C++ standard for that
purpose.

The multidimensional arrays of C++ are the same as those of C.  The
defect of this data type lies in the inconvenience and inefficiency
that arise when a multidimensional array is passed to a subprogram
written to handle arrays of arbitrary rather than fixed size.  Thus a
Fortran subprogram of the type

      subroutine sub(m, n1, n2)
      real m(n1,n2)
      m(1,2) = 3
      . . . . . . . . . .  . .
      end

cannot be simply transcribed into C/C++.  This problem is well-known.
Stroustrup [1] shows that a subprogram to print an arbitrary integer
matrix has to be written in an obscure (his word) way, using the
expression ((int *) m)[i*dim2 + j] instead of m[i][j].  Further, his
example subprogram has to be called with print_mij((int **) m, n1, n2)
instead of the more natural form print_mij(m, n1, n2) expected by the
user.  The negative software engineering consequences of this need no
arguing.  In my personal opinion this 'artificial' complexity is one of
the major psychological factors hindering most Fortran programmers from
becoming friends of C/C++.
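
For illustration, the workaround described above can be sketched in C++
roughly as follows.  This is a minimal sketch, not Stroustrup's listing:
the names element and sum_all are hypothetical, and the matrix is assumed
to be stored contiguously in row-major order.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: read element (i, j) of a dim1 x dim2 int matrix
// that is passed as a bare pointer to its first element.
int element(const int* m, std::size_t dim1, std::size_t dim2,
            std::size_t i, std::size_t j)
{
    assert(i < dim1 && j < dim2);  // the extents are not part of the type
    return m[i * dim2 + j];        // row-major offset computed by hand
}

// Hypothetical example: sum all elements of a matrix of arbitrary size.
long sum_all(const int* m, std::size_t dim1, std::size_t dim2)
{
    long total = 0;
    for (std::size_t i = 0; i < dim1; ++i)
        for (std::size_t j = 0; j < dim2; ++j)
            total += element(m, dim1, dim2, i, j);
    return total;
}
```

A caller holding int v[3][5] has to write sum_all(&v[0][0], 3, 5) or use
a cast; the direct form sum_all(v, 3, 5) does not compile, and m[i][j]
cannot be written inside the subprogram.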

In practice one overcomes this problem by using pointer arrays.  Thus
Press et al. [2] employ special subprograms that allocate pointer
arrays referencing the individual rows of matrices such that in the
other C subprograms provided by these authors the familiar notation
m[i][j] can be used.  In C++ one can write a matrix class with an
operator definition that allows simple subscripting of matrix elements
while at the same time making the essential matrix operations
available.  This is fine.  However, in situations where the class
library is not available or cannot be used for portability or other
reasons and the programmer has to write everything himself, the problem
remains.  Moreover, using such a matrix class essentially means
employing pointer arrays behind the scenes and thus can incur a loss of
machine efficiency.
Disregarding the efficiency issue arising from calling the operator
function [] of such a class, the efficiency loss can be measured by
comparing the computing time of using m[i][j] versus pt[i][j] where
pt[i] points to m[i][0].  Multiplying two matrices of size 300*300,
I obtained a CPU time ratio of 1 : 1.11 on an IBM RISC 6000.  This
overhead of more than 10% must certainly be regarded as very
significant by those who constantly strive to optimize their code in
essential numerical applications.
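
The pointer-array technique mentioned above can be sketched as follows.
This is only an illustration under stated assumptions: the names
make_row_pointers and multiply are hypothetical, the storage is assumed
to be one contiguous row-major buffer, and this is neither the routine of
Press et al. nor the code used for the timing quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: build a table of row pointers over a contiguous
// row-major buffer of n1 * n2 doubles, so that pt[i][j] addresses the
// same element as data[i * n2 + j].
std::vector<double*> make_row_pointers(double* data,
                                       std::size_t n1, std::size_t n2)
{
    assert(data != 0);
    std::vector<double*> pt(n1);
    for (std::size_t i = 0; i < n1; ++i)
        pt[i] = data + i * n2;     // pt[i] points to element (i, 0)
    return pt;
}

// Product c = a * b of n x n matrices accessed through row pointers.
void multiply(double* const* a, double* const* b, double* const* c,
              std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j) {
            double s = 0.0;
            for (std::size_t k = 0; k < n; ++k)
                s += a[i][k] * b[k][j];   // each access goes through a row pointer
            c[i][j] = s;
        }
}
```

Every access of the form pt[i][j] first fetches the row pointer pt[i]
and only then the element, whereas m[i][j] on a true two-dimensional
array needs only an address computation.  That extra indirection is a
plausible source of overhead of the kind measured above.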

Therefore I propose that in addition to the currently existing
multidimensional array data type of C/C++ there be introduced into the
C++ standard a new array data type that is akin in functionality to
that of the other major programming languages used in numerical
computations.  By adding a keyword 'array' one could define e.g.

      double array m[300][300];

for passing to a subprogram of the type

      void sub(double array [][], int n1, int n2)

to be referenced inside the subprogram with the m[i][j] notation.
Preferably there should also be inquiry functions to determine the
extent of each dimension of a multidimensional array so that the values
n1 and n2 above need not be passed through the parameter list.
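
The proposed 'array' syntax is not valid C++ and so cannot be shown as
running code.  The sketch below only approximates the intended usage
with a function template taking a reference to an array; the names
extent1 and extent2 are hypothetical stand-ins for the inquiry
functions.  It is not equivalent to the proposal: the extents must be
compile-time constants and a separate function is instantiated for
every distinct size, so arrays sized at run time, as in the Fortran
example, are not covered.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical inquiry functions: the extents are deduced from the
// array type, so they need not be passed in the parameter list.
template <typename T, std::size_t N1, std::size_t N2>
std::size_t extent1(T (&)[N1][N2]) { return N1; }

template <typename T, std::size_t N1, std::size_t N2>
std::size_t extent2(T (&)[N1][N2]) { return N2; }

// Counterpart of the Fortran subroutine sub(m, n1, n2); the m[i][j]
// notation is available because the full array type is known.
template <std::size_t N1, std::size_t N2>
void sub(double (&m)[N1][N2])
{
    assert(N2 >= 2);   // element (0, 1) must exist
    m[0][1] = 3;       // Fortran m(1,2) = 3 with zero-based indices
}
```

With double m[300][300]; the call is simply sub(m), and extent1(m) and
extent2(m) both yield 300.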

It may be noted that the proposed addition does not affect or break
existing C/C++ code, hence no question of compatibility can arise.
Further, there is no problem of implementation, as the proposed data
type has long been present in a number of major programming languages.
Therefore the adoption of this proposal should be easy.

Literature:

[1]  Bjarne Stroustrup, The C++ Programming Language.  2nd ed.,
     pp. 128-129. Addison-Wesley, 1991.

[2]  William H. Press et al., Numerical Recipes in C.  2nd ed.,
     pp. 20-23. Cambridge University Press, 1992.

Submitter of comment:

Mok-Kong Shen
(signed) June 30, 1995

Postal address:                  E-mail:
Postfach 340238                  shen@lrz-muenchen.de
D-80099 Muenchen                 (invalid after August 30, 1995)
Germany