Data Structures and Algorithms: Objects and ADTs

Data Structures and Algorithms

2.1 Objects and ADTs

In this course, we won't delve into the full theory of object-oriented design. We'll concentrate on the pre-cursor of OO design: abstract data types (ADTs). A theory for the full object oriented approach is readily built on the ideas for abstract data types.

An abstract data type is a data structure and a collection of functions or procedures which operate on the data structure.

To align ourselves with OO theory, we'll call the functions and procedures methods and the data structure and its methods a class, i.e. we'll call our ADTs classes. However our classes do not have the full capabilities associated with classes in OO theory. An instance of the class is called an object . Objects represent objects in the real world and appear in programs as variables of a type defined by the class. These terms have exactly the same meaning in OO design methodologies, but they have additional properties such as inheritance that we will not discuss here.

It is important to note the object orientation is a design methodology. As a consequence, it is possible to write OO programs using languages such as C, Ada and Pascal. The so-called OO languages such as C++ and Eiffel simply provide some compiler support for OO design: this support must be provided by the programmer in non-OO languages.

2.2 An Example: Collections

Programs often deal with collections of items. These collections may be organised in many ways and use many different program structures to represent them, yet, from an abstract point of view, there will be a few common operations on any collection. These might include:

`create`	Create a new collection
`add`	Add an item to a collection
`delete`	Delete an item from a collection
`find`	Find an item matching some criterion in the collection
`destroy`	Destroy the collection

2.2.1 Constructors and destructors

The create and destroy methods - often called constructors and destructors - are usually implemented for any abstract data type. Occasionally, the data type's use or semantics are such that there is only ever one object of that type in a program. In that case, it is possible to hide even the object's `handle' from the user. However, even in these cases, constructor and destructor methods are often provided.

Of course, specific applications may call for additional methods, e.g. we may need to join two collections (form a union in set terminology) - or may not need all of these.

One of the aims of good program design would be to ensure that additional requirements are easily handled.

2.2.2 Data Structure

To construct an abstract software model of a collection, we start by building the formal specification. The first component of this is the name of a data type - this is the type of objects that belong to the collection class. In C, we use typedef to define a new type which is a pointer to a structure:

typedef struct collection_struct *collection;

Note that we are defining a pointer to a structure only; we have not specified details of the attributes of the structure. We are deliberately deferring this - the details of the implementation are irrelevant at this stage. We are only concerned with the abstract behaviour of the collection. In fact, as we will see later, we want to be able to substitute different data structures for the actual implementation of the collection, depending on our needs.

The typedef declaration provides us with a C type (class in OO design parlance), collection. We can declare objects of type collection wherever needed. Although C forces us to reveal that the handle for objects of the class is a pointer, it is better to take an abstract view: we regard variables of type collection simply as handles to objects of the class and forget that the variables are actually C pointers.

2.2.3 Methods

Next, we need to define the methods:

collection ConsCollection( int max_items, int item_size ); void AddToCollection( collection c, void *item ); void DeleteFromCollection( collection c, void *item ); void *FindInCollection( collection c, void *key );

Note that we are using a number of C "hacks" here. C - even in ANSI standard form - is not exactly the safest programming language in the sense of the support it provides for the engineering of quality software. However, its portability and extreme popularity mean that it is a practical choice for even large software engineering projects. Unfortunately, C++, because it is based on C, isn't much better. Java, the latest fad in the software industry, shows some evidence that its designers have learned from experience (or actually read some of the literature in programming language research!) and has eliminated some of the more dangerous features of C.

Just as we defined our collection object as a pointer to a structure, we assume that the object which belong in this collection are themselves represented by pointers to data structures. Hence in AddToCollection, item is typed void *. In ANSI C, void * will match any pointer - thus AddToCollection may be used to add any object to our collection. Similarly, key in FindInCollection is typed void *, as the key which is used to find any item in the collection may itself be some object. FindInCollection returns a pointer to the item which matches key, so it also has the type void *. The use of void * here highlights one of the deficiencies of C: it doesn't provide the capability to create generic objects, cf the ability to define generic packages in Ada or templates in C++.

Note there are various other "hacks" to overcome C's limitations in this area. One uses the pre-processor. You might like to try to work out an alternative approach and try to convince your tutor that it's better than the one set out here!

2.2.4 Pre- and post-conditions

No formal specification is complete without pre- and post-conditions. A useful way to view these is as forming a contract between the object and its client. The pre-conditions define a state of the program which the client guarantees will be true before calling any method, whereas the post-conditions define the state of the program that the object's method will guarantee to create for you when it returns.

Again C (unlike Eiffel, for example) provides no formal support for pre- and post-conditions. However, the standard does define an assert function which can (and should!) be used to verify pre- and post-conditions [man page for assert]. We will see how this is used when we examine an implementation of our collection object. Thus pre- and post-conditions should be expressed as comments accompanying the method definition.

Adding pre- and post-conditions to the collection object would produce:
Select to load collection.h

Aside

In order to keep the discussion simple at this stage, a very general specification of a collection has been implied by the definitions used here. Often, we would restrict our specification in various ways: for example, by not permitting duplicates (items with the same key) to be added to the collection. With such a collection, the pre- and post-conditions can be made more formal:
Select to load ucollection.h

Note how the pre- and post-conditions now use the FindInUCollection function to more precisely define the state of the object before and after the method has been invoked. Such formal pre- and post-conditions are obviously much more useful than the informal English ones previously specified. They are also easier to translate to appropriate assertions as will be seen when the implementation is constructed.

2.2.5 C conventions

This specification - which all a user or client of this object needs to see (he isn't interested in the implementation details) - would normally be placed in a file with a .h (h = header) suffix to its name. For the collection, we would place the specifications in files called collection.h and Ucollection.h and use the C #include facility to import them into programs which needed to use them. The implementation or body of the class is placed in a file with a .c suffix.

References

Some additional sources of information on Object Orientation:

What is Object-Oriented Software?
Basic Principles and Concepts of Object-Orientation - an extensive set of notes of OO concepts. Unfortunately most of the bibliography links seem to be outdated.
Object Oriented Programming - notes for a class at Indiana University (based on Objective-C).
Object Oriented Programming Languages - summary of OO programming languages (with links to full details).

Key terms

abstract data type (ADT): A data structure and a set of operations which can be performed on it. A class in object-oriented design is an ADT. However, classes have additional properties (inheritance and polymorphism) not normally associated with ADTs.