Submitted for publication in Computer Physics Communications, Nov.
1996
Available from Los Alamos National Laboratory as Report No.
LA-UR-96-4064.
This paper compares Fortran 90 [4,5] with C++ [6,7] for these purposes. We look at the degree to which the syntax of these two languages supports important object-oriented programming (OOP) features, as well as some useful non-OOP features, and how these features can strengthen and simplify the programming of scientific application codes. (Here we shall assume little familiarity with either F90 or C++ and introduce some basic syntax features of each language; the reader is encouraged to consult the reference materials for more details.) Object-oriented programming allows one to have the modularity and data integrity needed for ease of code expansion. The capabilities of C++ in this area are fairly well established, while the use of F90 for OOP is relatively new [8]. We will introduce examples from physical problems to illustrate how object methods can be useful for scientific programing. In addition, we will provide performance comparisons of these languages and the procedural languages F77 and C on several typical computational kernels from scientific application codes.
We shall see that all the desired object-oriented features, including the important notion of inheritance, are present in C++. In contrast, F90 does not have inheritance, which is invaluable in providing opportunities for code reuse, and has no single construct for representing an abstract object. The template feature of C++, which similarly allows for code reuse and is an important means of optimizing the evaluation of complex expressions involving user-defined types, is also absent from F90. On the other hand, F90 offers a built-in array type with simple array syntax for mathematical operations, plus the ability to specify numerical precision in computations. Performance comparisons indicate that, contrary to popular myth, there is no significant performance advantage gained by using a procedural language instead of one of these newer languages with object-oriented features.
Object-oriented languages have the property of inheritance. This means that an object can inherit the data and methods of its ``parent'' object(s), and it can have additional properties defined. One can think of inheritance as a means of taking a general sort of object and tailoring it to fit specific needs. Inheritance offers tremendous potential for code reuse, and it is helpful in organizing the modules that compose a particular application into a hierarchy that indicates their relation to one another. A sensible hierarchy can be a great aid in managing the complexity of modern scientific computing application codes.
Object-oriented languages also often allow for polymorphism, the ability of a function to have differing behavior depending on the type of object invoking the function or the specific parameter list provided to the function. The former case involves objects that redefine the behavior of an inherited method. The function is dynamically bound (at run time) to the object, with each object implicitly knowing its type and thus its defined behavior. In the latter case, functions utilize knowledge of the parameter list to infer which behavior is desired. Polymorphism enables programmers to avoid writing inflexible, high-maintenance code in which objects must contain every possible behavior and then use large IF-ELSE or switch code blocks to determine the desired behavior at run time.
Before examining how well these features are supported in Fortran 90 and C++, a brief discussion of the difference between abstract and concrete objects is in order. By an abstract object, we mean a set of data structures and methods that describe the properties of the thing we wish to model with our object. An example is an abstraction for a particle. Perhaps it would be described by its mass and a pair of 3-vectors for its position and velocity. Among its methods might be functions for advancing the particle in phase space or computing the particle's kinetic energy. In addition, we may require a construction or initialization function that describes how to create an instance of one of these objects. Such an instance is what we refer to as a concrete object, since it actually exists, has specific values for its data, and is not merely a conceptual description.
class Particle {
private:
double mass;
double position[3], velocity[3];
public:
Particle(double imass=0.0) {
mass = imass;
for (int i=0; i<3; i++) {
position[i] = 0.0;
velocity[i] = 0.0;
}
}
virtual ~Particle() { }
virtual double Position(int i) const { return position[i]; }
virtual double Velocity(int i) const { return velocity[i]; }
virtual double Kinetic_Energy() const {
double ke = 0.0;
for (int i=0; i<3; i++) ke += velocity[i]*velocity[i];
return mass*ke;
}
virtual double Charge() const { return 0.0; }
};
defines a class Particle containing the data described earlier. The constructor, Particle(double), initializes this data and includes a default value of 0.0 for the mass, in case no argument is provided. (We also include the destructor, ~Particle(), which in this case does nothing, but is typically used to free up storage space allocated for internal data by the constructor, once the object is no longer needed.) Because the member data is declared private, code outside of the class cannot access it directly. Instead, the i_th component of the velocity, for example, is obtained using the method Velocity(int i). Similarly, the particle's kinetic energy is obtained via a member function that performs a simple computation.
This is an illustration of encapsulation: the internal data is accessed only through fixed interfaces. If later class Particle is redesigned to use momentum instead of velocity for its internal representation, no code calling the member functions of this class need be changed. Only the appropriate member function definitions would require alteration. In this case, they would become
class Particle {
private:
...
double position[3], momentum[3];
public:
...
virtual double Velocity(int i) const {
return momentum[i]/mass;
}
virtual double Kinetic_Energy() const {
double ke = 0.0;
for (int i=0; i<3; i++) ke += momentum[i]*momentum[i];
return ke/mass;
}
...
};
Notice that class Particle also provides a member function Charge(), which in this case simply returns 0.0, since Particle has no data describing its charge state. This function, along with others in the class, has been declared as virtual, which means that a class inheriting from Particle may override, or redefine, the behavior of this function while maintaining the same interface (i.e., the same function name and parameter list). In this way, we provide a common, fixed interface for obtaining a particle's charge from objects of class Particle and any class that may be derived from it in the future.
To use class Particle, we declare instances of it as shown below:
Particle p1, p2;
This statement constructs two concrete objects, and the data values for each object are independent. Furthermore, p1 and p2 are only known locally in the routine in which they are declared. (The practice of avoiding the use of global variables, which are accessible anywhere in the program, is another rule that modern software designers adhere to in order to keep codes modular.) To obtain the kinetic energy, for example, we use the member function through the following dot syntax:
double p1_ke = p1.Kinetic_Energy();
Now let us discuss the implementation of abstract and concrete objects in Fortran 90. F90 has two language elements that have object features: TYPE and MODULE. TYPE allows one to group data together but does not associate methods with the data. The code fragment
TYPE Particle
REAL mass
REAL, DIMENSION(0:2) :: position, velocity
END TYPE Particle
defines a data type corresponding to the member data of the C++ class Particle introduced earlier. One can declare an instance of a TYPE and access its data components as follows:
TYPE(Particle) :: p
REAL :: pmass = p%mass
By itself, a TYPE does not provide encapsulation, since any section of code that defines a TYPE and declares an instance of it may directly access its data components. Encapsulation is achieved by placing the TYPE definition inside of a MODULE. The MODULE construct is a code block containing data or TYPEs, as well as FUNCTIONs and SUBROUTINEs for manipulating this data. A MODULE may declare its contents to be PRIVATE, disallowing direct access to those contents by program units outside the MODULE. If data elements are declared PRIVATE, the MODULE may provide FUNCTIONs that are PUBLIC for accessing the data.
The following code illustrates how an abstract object similar to the C++ class Particle might be built in F90:
MODULE ParticleModule
TYPE Particle
PRIVATE
REAL mass
REAL, DIMENSION(0:2) :: position, velocity
END TYPE Particle
CONTAINS
SUBROUTINE Initialize(p,imass)
TYPE(Particle), INTENT(INOUT) :: p
REAL, OPTIONAL :: imass
INTEGER i
IF (PRESENT(imass)) THEN
p%mass = imass
ELSE
p%mass = 0.0
ENDIF
DO i=0,2
p%position(i) = 0.0
p%velocity(i) = 0.0
END DO
END SUBROUTINE Initialize
REAL FUNCTION Position(p,i)
TYPE(Particle), INTENT(IN) :: p
INTEGER, INTENT(IN) :: i
Position = p%position(i)
RETURN
END FUNCTION Position
REAL FUNCTION Velocity(p,i)
TYPE(Particle), INTENT(IN) :: p
INTEGER, INTENT(IN) :: i
Velocity = p%velocity(i)
RETURN
END FUNCTION Velocity
REAL FUNCTION KineticEnergy(p)
TYPE(Particle), INTENT(IN) :: p
INTEGER i
REAL :: ke = 0.0
DO i=0,2
ke = ke + (p%velocity(i))**2
END DO
KineticEnergy = p%mass * ke
RETURN
END FUNCTION KineticEnergy
REAL FUNCTION Charge(p)
TYPE(Particle), INTENT(IN) :: p
Charge = 0.0
RETURN
END FUNCTION Charge
END MODULE ParticleModule
The PRIVATE statement resets the default access for items within the current code block, which is normally PUBLIC. Because of this statement in TYPE Particle, the data of Particle cannot be accessed except through the FUNCTIONs defined in the enclosing MODULE ParticleModule, such as Position(p,i).
The above construction provides for an abstract object, because once this MODULE has been included in a code segment, one can define multiple instances of TYPE Particle. For example, consider the code fragment
USE ParticleModule
TYPE(Particle) :: p1, p2
Initialize(p1)
Initialize(p2)
The USE statement indicates that the contents of
ParticleModule are available for use within the current code block.
In summary, Fortran 90 does allow one to build an abstract object by
combining TYPEs and PROCEDUREs (i.e., SUBROUTINEs and
FUNCTIONs) in a MODULE. The TYPE contains the
internal data, which can be declared PRIVATE inside of a
MODULE and thus encapsulated. The MODULE provides an interface
via PROCEDUREs that are PUBLIC and operate on the contained
TYPE.
This concept of inheritance was explicitly included in the C++
syntax. For example, the class definition
specifies that class Nucleus inherits from class Particle.
We refer to Particle as the base class and Nucleus as
a derived class. Nucleus contains all the data and methods of
Particle, and it has added the new data members numProtons,
numNeutrons, and elemCharge. Note that elemCharge
(which represents the value of the electronic charge e) has been
declared static, which means that it is a single piece of data shared
by all objects of class Nucleus. The constructor for Nucleus
initializes the new data members and can pass other parameters back to the base
class constructor for the initialization of inherited data members. Also, the
virtual member function Charge() has been redefined to compute
the charge based upon the internal structure of the derived class, namely by
multiplying elemCharge (which is presumed to be initialized elsewhere)
by the number of protons in the nucleus, numProtons.
If we declare two concrete objects,
then p.Charge() returns the charge according to the definition of
Particle (i.e., returns 0.0), while n.Charge()
returns the charge as dictated in Nucleus. This is an example of
polymorphism: the function Charge() acts differently depending
on what kind of object is invoking it.
C++ also has a means of invoking polymorphic behavior for a collection of
base class objects using pointers. Via the syntax
one declares a pointer to a Particle object. That is, pptr
is used to contain the memory location of a Particle. It could be
assigned a value with the address-of symbol &, as in the
statement
which gives pptr the memory address of Particle object
p. The Particle referred to by pptr is obtained with
the dereferencing operator *, so that
returns the same value as p.Charge(), since pptr points to
p. An alternate notation for accessing members of an object by pointer
is the arrow syntax:
A powerful type of polymorphism occurs when the assignment
is made. The assignment of the address of a derived class object (such as the
Nucleus n) to a base class object pointer is allowed because a
Nucleus is a kind of Particle. Since the method
Charge() was declared virtual, the statement
now performs the member function as defined in class Nucleus! All
virtual functions are dynamically bound to their objects, so the object
referred to by the pointer pptr chooses the correct definition of the
function at run time.
This mechanism is very useful. One can refer to a collection of objects that
may be either Particle or Nucleus objects (or objects of any
other class that may be derived from Particle, such as
Atom or Molecule) with an array of pointers to
Particle objects. The Charge() member function may be called
on each object, and the correct version is invoked for each one without need for
any special coding. This gives programmers a tremendous flexibility in reusing
previously designed objects, inheriting much of the needed code, modifying only
what they must to solve a new problem, and writing general code that can operate
on a variety of objects.
To create a Nucleus object in Fortran 90, one can first define a new
TYPE that has as one of its components a Particle, plus the
additional data needed to describe a nucleus. Fortran 90 allows for one
MODULE to contain another via the USE statement. Hence, the
MODULE representing a nucleus can include ParticleModule, so
that when it comes time to define a TYPE Nucleus, the definition of
TYPE Particle is known. One then defines the PROCEDUREs needed
for manipulating a Nucleus, as illustrated in the following code
fragment:
As with class Nucleus in C++, we have added integers for the number
of protons and neutrons, and a parameter for the electronic charge. The
SAVE qualifier indicates that only one copy of the parameter
elemCharge is used for multiple invocations of the MODULE.
(This is similar to the behavior of a static member in C++.) We have
included a SUBROUTINE NucInitialize that calls Initialize on
the contained Particle and then initializes the new data. We have also
introduced FUNCTION NucCharge to calculate the charge based on the
number of protons.
In order to access the position and velocity data of TYPE Particle,
we must define new FUNCTIONs in NucleusModule that take a
Nucleus as an argument, even though all they do is turn around and call
the respective FUNCTIONs from ParticleModule! The basic reason
for this is that Fortran 90 PROCEDUREs act upon a TYPE (take
it as an argument), whereas C++ member functions are invoked on behalf of an
object. Hence, a TYPE can contain data from another TYPE by
including it as a component, but there is no way to inherit
PROCEDUREs because the two TYPEs are not interchangeable as
arguments to a PROCEDURE. In C++, the specification of member functions
that are unchanged is not necessary. The derived class inherits all of the
member functions of its base class. New code is written only for those member
functions that must change behavior.
Prior to listing the contained PROCEDUREs of NucleusModule,
we can provide INTERFACE blocks for those PROCEDUREs common to
both ParticleModule and NucleusModule. The purpose of these
blocks is to allow either version of a PROCEDURE to be called using one
generic name. For instance, INTERFACE Position allows the name
Position to be used for either FUNCTION Position or
NucPosition. Thus, if we first USE NucleusModule and then
declare (and initialize) Particle p and Nucleus n, a piece of
code such as
is perfectly valid. Using these generic interfaces, one can very
roughly approximate the notions of inheritance and virtual member
functions from C++, although the mechanism is clearly rather clumsy. Since no
PROCEDUREs are actually inherited, one must repeat the definitions
using new names, and then link these new names back to the previous names with
INTERFACEs.
In OOP parlance, we have used a has-a construct to represent what is
really an is-a relationship. The Nucleus is a
Particle in the sense that, like a Particle, it has a
position, velocity, and mass. In addition, it has an internal structure
described by the number of protons and neutrons. To represent this in Fortran
90, we had to use a has-a structure in which a Particle was
one of the data components contained inside a Nucleus.
This construction has two negative practical consequences. The first has
already been mentioned: we must recode FUNCTIONs of
ParticleModule for NucleusModule, even if there is no change
in behavior. Rewriting this code is time-consuming and can introduce bugs.
Moreover, it is this recoding of PROCEDUREs, along with the rather
cumbersome INTERFACE structure, that accounts for the disparity in the
length of the C++ and F90 code fragments needed to define a derived abstract
object, as demonstrated in our example. Inheritance allows code to be far more
compact and easier to maintain. (This problem could be sidestepped if we did not
introduce the secondary TYPE Nucleus, but then we could not create
separate instances of a Nucleus!) Secondly, we do not have the powerful
polymorphism of C++ discussed earlier. Unlike in C++, one cannot have a
Particle pointer refer to a Nucleus, since in Fortran 90 a
Nucleus has a Particle but is not itself a Particle.
Thus, there is no way to refer to a group of Particle and
Nucleus objects collectively and have FUNCTION Charge return
what is appropriate for each object.
The unspecified type T will be replaced by an actual type by the
compiler whenever a Stack object with a specific element type is
requested by the programmer. The class Stack may rely on class
T having certain data or methods available (e.g., in this case, the
assignment operator). If they are not found, a compile-time error will result
when the compiler attempts to instantiate the template.
Another common use for the template is to generalize code on the
basis of some numerical parameter, rather than by type. For example, we could
create a template for class Particle, with the number of
dimensions of our simulation domain as a template parameter.
This would allow Particle objects instantiated from this
class to function in the same way for any dimensionality. Furthermore,
by making array sizes and loop bounds template parameters of a function or
class, they become available at compile time and enable aggressive
optimization and function inlining.
The template provides another level in abstraction. While thought of
primarily as a convenience by some, templates have become increasingly
important because of the opportunity they offer for code reuse. A Standard
Template Library (STL) is now available, which contains templates
for many useful classes such as vectors, queues, and maps [10].
As with any set of C++ classes, these generalized data structures are
packaged along with the commonly used algorithms for each, thus providing the
software development community with a ``standard toolkit'' on which a wide
variety of applications may be based. The true power of the template
mechanism has only recently begun to be fully recognized. Because C++
templates involve replacement of code at compile time, one can use the
compiler to do fairly elaborate ``computations''. One example of such
computation is the use of expression templates [11].
The expression template technique addresses the problem of pairwise evaluation
of expressions involving user-defined types.
For instance, let's imagine that you have a class Matrix and you
have overloaded mathematical operators such as + and * to do
matrix operations element by element. If A, B, C, and
D are Matrix objects and you write the expression
the C++ compiler will evaluate this by first applying the overloaded operator
+ to B and C (generating a temporary Matrix
object), then applying the overloaded operator * to A and the
intermediate result (generating a second temporary Matrix), and finally
applying the overloaded operator = to assign the result to D.
Since each operation involves a loop over all the elements of the two
Matrix operands, this pairwise evaluation destroys code efficiency by
going through several consecutive loops where one would suffice and by
generating lots of intermediate temporary objects.
With the expression template technique, one creates class templates
that act as wrappers around expression operands, operators, and elements.
Operand wrappers know how to access the data inside the operand, and operator
wrappers simply apply the operation. Expression element wrappers generate a new
type that describes a combination of operators and operands, thus allowing one
to build (through the nesting of class template definitions) a type for
the entire right-hand side of a complex expression. This type is then evaluated
and assigned to the object on the left-hand side using inlined, highly optimized
code that has been built by the compiler in the process of instantiating the
templates. In effect, the compiler automatically generates code for
efficient evaluation of complex expressions involving user-defined objects and
operators! Tests of this technique indicate that one can expect performance that
is up to 95% of hand-coded C kernels [11].
Of course, one receives an enormous benefit for this small cost in performance;
namely, the ability to code scientific equations using high-level objects in an
easily understandable manner.
Nevertheless, for a wide variety of computational kernels running on several
different computing platforms, one finds that there is little or no inherent
performance advantage acquired by sticking to a procedural language such as F77
or C vs. an object-oriented language, or by choosing Fortran 90 over C++. To
verify this statement, we have thoroughly examined the Livermore Fortran Kernels
(LFK). The LFK is a well-known set of computational kernels that exhibit a wide
range of performance. The computational algorithms represented range from 1-D
hydrodynamics in Kernel 1 to 2-D particle-in-cell in Kernel 13 to general linear
recurrence equations in Kernel 19. The performance range extends from nearly
100% to less than 1% of theoretical machine peak speed.
Minor modifications to the 24 computational kernels included separating each
kernel into a stand-alone benchmark, converting the Fortran source to C source
(this entails loop re-ordering in some cases), and compiling with standard
compiler options for best performance. As indicated in Figures 1 thru 4, the
computational efficiency of the code produced by the four languages is
consistent across the platforms examined. (The one clear exception is the F90
compiler on the Sun Ultra-2 platform; however, this product was provided to Sun
by a third-party vendor and is likely not as well tuned for this computer
architecture as the other compilers provided by Sun. An F90 compiler from Sun
for the Ultra-2 was not available at the time of this writing.) There are
noticeable exceptions for some kernel/platform/language combinations. These
differences can be understood to result from a particular compiler recognizing a
computational pattern and either making clever internal optimizations that
reduce overall work or calling upon a tuned assembly-language routine. (For a
complete discussion of each of these exceptions, the reader is directed to the
WWW page at http://www.acl.lanl.gov/Hinker/Kernels.html.)
The overall result is that computational efficiency shows no direct
correspondence to source language for a wide range of computational kernels.
Figure 1: Language performance comparison on SGI R10000 processor
Figure 2: Language performance comparison on IBM RS6000 processor
Figure 3: Language performance comparison on Sun Ultra-2 workstation
Figure 4: Language performance comparison on T3D (DEC Alpha processor)
One possible objection to these results might be that only the procedural
aspects of the four languages are examined by the LFK, and that once the
object-oriented language features of F90 or C++ are utilized, performance will
suffer. While it is true that there is a certain amount of overhead involved
when taking advantage of an object-oriented programming style, by making the
proper level of abstraction of the problem domain, one can assure that typical
large-scale scientific computing applications spend a vast majority of their
computational time actually crunching numbers. Determining the best means of
assessing and minimizing this data abstraction penalty is a topic of
current research (e.g., see [16]).
Certain useful non-OOP features are found exclusively in both languages.
However, all of the additional features of Fortran 90 can be readily included in
C++ by the development of class libraries; that is, without any language
extensions. In contrast, the inclusion of the important features exclusive to
C++ in Fortran 90 will require new language syntax. The most critical feature
missing in Fortran 90 is the template, which allows C++ programmers to
build portable, highly reusable libraries and to dramatically improve the
efficiency of the evaluation of expressions involving user-defined types.
With regard to non-syntax issues, the languages are comparable until one
considers the broader software development community. For high-performance
computing, one may find either Fortran 90 or C++ to be more mature. On Symmetric
Multi-Processors (SMPs), C++ is likely to be further advanced, while on
Massively Parallel Processors (MPPs), Fortran 90 compilers may be more highly
optimized. In the general software engineering community, however, C++ is orders
of magnitude more common than F90. Free software and libraries useful for
physics modeling as well as other problem domains will be more rapidly developed
in the C++ community. Thus, software developers in the physics community who
hope for greater mobility or increased leveraging with the commercial sector
would be better served learning C++.
[2] G. Booch, Object-Oriented Development, IEEE
Transactions on Software Engineering, 12, (1986), 211.
[3] G. Buzzi-Ferraris, Scientific C++: Building
Numerical Libraries the Object-Oriented Way, (Addison-Wesley, Reading, MA,
1993).
[4] T. M. R. Ellis, I. R. Phillips, and T. M. Lahey,
Fortran 90 Programming, (Addison-Wesley, Menlo Park, CA, 1994).
[5] M. Metcalf and J. Reid, Fortran 90
Explained, (Oxford University Press, Oxford, England, 1990).
[6] B. Stroustrup, The C++ Progamming Language,
(Addison-Wesley, Reading, MA, 1991).
[7] S. Lippman, C++ Primer, (Addison-Wesley,
Reading, MA, 1991).
[8] V. K. Decyk, C. D. Norton, and B. K. Szymanski,
Introduction to Object-Oriented Concepts using Fortran 90, Institute of
Plasma and Fusion Research Report No. PPG-1559, (1996).
[9] G. Booch, Object-Oriented Analysis and Design
with Applications, (Benjamin-Cummings, Redwood City, CA, 1994).
[10] D. R. Musser and A. Saini, STL Tutorial and
Reference Guide: C++ Programming with the Standard Template Library,
(Addison-Wesley, Reading, MA, 1996). [Note: Code for the Standard Template
Library may be downloaded from the WWW via the URL http://isx.com/~dsupel/www/sponge/stl
or via anonymous FTP at ftp.cs.rpi.edu in directory
/pub/stl/book.]
[11] T. Veldhuizen, Expression Templates, C++
Report, 7 (June 1995).
[12] D. A. Young, Object-Oriented Programming with
C++ and OSF/Motif, (Prentice Hall, Englewood Cliffs, NJ, 1992).
[13] R. Parsons and D. Quinlan, A++/P++ Array Classes
for Architecture-Independent Finite-Difference Calculations, Proc. of the
2nd Annual Object-Oriented Numerics Conference, (1994).
[14] E. Gamma, R. Helm, et al., Design Patterns,
(Addison-Wesley, Reading, MA, 1995).
[15] C. D. Norton, B. K. Szymanski, and V. K. Decyk,
Object-Oriented Parallel Computation for Plasma Simulation, Communications
of the ACM, 38, (1995). [Note: also available on the WWW via the URL
http://www.acm.org/pubs/cacm/previous/oct95_toc.html.]
[16] A. D. Robison, The Abstraction Penalty for Small
Objects in C++, POOMA '96 Conference, (1996). [Note: also available on
the WWW via the URL http://www.acl.lanl.gov/Pooma96/abstracts/robison.html.]
Inheritance and Polymorphism
Suppose one has developed an abstract
object with a data structure and methods well suited to a given problem. Now
imagine that a new problem comes along in which one needs a new data structure
identical to the old one except for some additional data elements and the
methods needed to work with this new data. Rather than developing a completely
new abstract object, one would like to define a new abstract object that
inherits the old data and methods, can alter the behavior of some functions to
match the semantics of the new object, and can add new data or methods as
needed.
class Nucleus : public Particle {
private:
int numProtons, numNeutrons;
static double elemCharge;
public:
Nucleus(int inumProtons, int inumNeutrons, double imass=0.0)
: Particle(imass) {
numProtons = inumProtons;
numNeutrons = inumNeutrons;
}
~Nucleus() { }
double Charge() const {
return numProtons*elemCharge;
}
};
Particle p;
Nucleus n;
Particle *pptr;
pptr = &p;
(*pptr).Charge();
pptr->Charge();
pptr = &n;
pptr->Charge();
MODULE NucleusModule
USE ParticleModule
TYPE Nucleus
PRIVATE
TYPE(Particle) p
INTEGER numProtons, numNeutrons
END TYPE Nucleus
REAL, PRIVATE, PARAMETER :: elemCharge = 1.6e-19
SAVE
INTERFACE Initialize
MODULE PROCEDURE Initialize, NucInitialize
END INTERFACE Initialize
INTERFACE Position
MODULE PROCEDURE Position, NucPosition
END INTERFACE Position
! Similar interfaces for Velocity, KineticEnergy, Charge
...
CONTAINS
SUBROUTINE NucInitialize(n,inumProtons,inumNeutrons,imass)
TYPE(Nucleus), INTENT(INOUT) :: n
INTEGER, INTENT(IN) :: inumProtons, inumNeutrons
REAL, OPTIONAL, INTENT(IN) :: imass
Initialize(n%p,imass)
n%numProtons = inumProtons
n%numNeutrons = inumNeutrons
END SUBROUTINE NucInitialize
REAL FUNCTION NucPosition(n,i)
TYPE(Nucleus), INTENT(IN) :: n
INTEGER, INTENT(IN) :: i
NucPosition = Position(n%p,i)
RETURN
END FUNCTION NucPosition
! Similar functions for NucVelocity, NucKineticEnergy
...
REAL FUNCTION NucCharge(n)
TYPE(Nucleus), INTENT(IN) :: n
NucCharge = n%numProtons * elemCharge
RETURN
END FUNCTION NucCharge
END MODULE NucleusModule
REAL :: px = Position(p,0)
REAL :: nx = Position(n,0)
Non-OOP Language Features
In the previous sections, we have highlighted
some differences in C++ and Fortran 90 support of what we consider to be
important object-oriented programming features. There are a few other language
features that have nothing to do with the OOP model, but nevertheless can be
very useful to the scientific applications programmer. Here we examine some of
these capabilities, in order of importance to scientific programming.
C++ Features Missing in Fortran 90
Templates
Class and function templates allow for
parameterization, whereby a class or function can be implemented in a
manner that is independent of the types of objects it contains or operates on.
One example is a class Stack that maintains a stack of objects of some
arbitrary type. The definition of such a class might look like the
following:
template < class T >
class Stack {
private:
T* theStack;
T* Top;
public:
Stack(int size) { theStack = Top = new T[size]; }
~Stack() { delete [] theStack; }
void push(T newItem) { *Top++ = newItem; }
T pop() { return *(--Top); }
};
template < unsigned int Dim >
class Particle {
private:
...
double position[Dim], velocity[Dim];
public:
Particle(double imass=0.0) {
mass = imass;
for (int i = 0; i < Dim; i++) {
position[i] = 0.0;
velocity[i] = 0.0;
}
}
...
virtual double Kinetic_Energy() const {
double ke = 0.0;
for (int i = 0; i < Dim; i++) ke += velocity[i]*velocity[i];
return mass*ke;
}
...
};
D = A * (B + C);
Pointer Casting
Pointer casting allows one to change the type of object
that a pointer points to. An example is to cast a pointer to a user-defined
complex number type to a pointer to double. In this case, dereferencing
might give the real part of the complex number (depending upon how the
conversion from complex number to double was defined). Casting pointers
is often frowned upon because it can lead to invalid memory access, although
recent additions to the C++ standard have made casting safer and more explicit
(see Appendix A of [6]).
In any case, pointers are indispensable in using general callbacks, as one must
with the Motif windowing system [12].
In addition, pointer casting may be used in order to access polymorphic object
behavior, by casting a pointer to an object of some base class to be a pointer
to an object of a derived class.
Exception Handling
Exception handling allows convenient transfer of
program control when an error occurs, by using the try-throw-catch
mechanism. In C++, one can try execution of a particular piece of code by
placing it inside a try block. This code may then throw an exception if
a certain sort of run-time error is detected by using a throw
statement. The try block is followed by a catch block that
will catch a particular exception that may have been thrown and respond by
executing the specified code. This provides a very clean, extensible way in
which the programmer can deal with possible errors.
Fortran 90 Features Missing in C++
Arrays
Mathematical arrays are present as elementary types in Fortran
90, but must be provided as a class library in C++. (In C++, as in C, one has
the ability to allocate arrays of objects of elementary types such as
int or float, but one cannot utilize array syntax to
request mathematical array operations.) The operations defined by such a class
library may not be as efficient as those provided for the intrinsic Fortran 90
arrays. The Fortran 90 arrays have another nice feature as well, which is that
when passed into a PROCEDURE, they can then be queried to determine
their shape (the lengths of the axes). This feature is typically also
provided in C++ array class libraries.
Specifiable Precision
Specifiable precision of floating-point numbers is
possible in Fortran 90 through use of the KIND syntax. One can specify
how many significant digits in a floating-point number must be retained, rather
than just declaring a variable to be REAL or DOUBLE PRECISION.
This makes it easier to ensure that the numbers one is using are sufficiently
accurate for the calculation at hand. A similar capability could be provided by
a class library in C++.
Addition of Missing Features
In summary, the most important features of
Fortran 90 that are missing in C++ can be replicated by employing class
libraries (see [13]
for discussion of an array class library implementation). On the other hand, for
Fortran 90 to add some of the important C++ features, most notably
templates and inheritance, the syntax of the language will have to be
enhanced. It is questionable whether Fortran 90, in its current form, can
deliver efficient code written in an object-based manner, given the problem of
evaluating complex expressions involving user-defined types and operations.
Non-Syntax Issues
Non-syntax issues can also be of importance in
deciding which route to take to object-oriented programming. Three non-syntax
issues worthy of examination are availability, usage, and performance. For each
of these issues, customer pressure can help ensure that one language or the
other is better. Below we briefly discuss each of these issues.
Availability
For the vast majority of computers, C++ compilers are more
widely available than F90 compilers and are likely to remain so, primarily
because more commercial tools and software libraries are currently being
developed in C++. Fortran 90 compilers may become more widely available on
supercomputing platforms because of pressures applied by the scientific
computing community. However, the supercomputer industry is no longer a
stand-alone market and has become highly leveraged with the commercial sector.
Developing scientific codes in C++ will enable researchers to benefit from the
extensive commercial efforts in C++ software development. In contrast, the F90
code developer runs the risk of writing codes in a language which is considered
to serve only a niche market.
Usage
This issue is linked with availability, so many of the same
arguments apply. For now (and we believe for the foreseeable future), C++ is the
most popular object-oriented language in the software engineering community.
Consequently, if one might ever be employed outside of the scientific sector,
then it is better to have C++ programming skills. Learning C++ gives one the
added advantage of drawing on a much larger developer community, which is
producing not only a large amount of literature on its use (e.g., see [14]),
but also a large body of reusable code modules. For example, the Standard
Template Library [10]
has been proposed to be part of the ANSI C++ standard. Because of the much
smaller user base, comparable efforts in F90 are bound to lag behind.
Performance
If two languages can represent the same concepts, both in
terms of constructs and actions, then there is no reason that they cannot
perform equally well. Hence, in language performance comparisons, one is mostly
comparing the skill of compiler writers, not the languages themselves. On the
other hand, one language may almost demand that certain performance features be
written, while the other may not. For example, Fortran 90 has its array object
built into the language. Thus, every F90 compiler writer should ensure that
arrays perform optimally for the given architecture, whether it be a massively
parallel, single-processor, or modest multi-processor machine. In C++, ``smart
array'' objects (knowing their sizes and other attributes) must come from a
class library. The operations for such objects might not be optimally written
for each architecture. Thus, Fortran 90 users may get better array performance,
especially on leading-edge machines [15].
Summary and Discussion
C++ is a full-featured, object-oriented
programming language. It supports the notions of abstraction, encapsulation,
modularity, and hierarchy. Moreover, inheritance and dynamic binding provide C++
with powerful polymorphism features. Fortran 90 allows object-oriented
programming to some degree. However, in Fortran 90 the concept of an abstract
object must be constructed using two distinct syntax elements, the TYPE
and the MODULE. With this approach, one can represent the
has-a relationship of two objects, but not the is-a
relationship needed to construct an object hierarchy. This implies that code
reuse will come more easily in C++.
References
[1] A. Eliens, Principles of
Object-Oriented Software Development, (Addison-Wesley, Reading, MA, 1994).