The Theory of Classification
Part 12: Building the Class Hierarchy
Anthony J H Simons, Department of Computer Science, University
of Sheffield, U.K. |
 |
COLUMN

PDF Version |
1 INTRODUCTION
This is the twelfth article in a regular series on object-oriented
type theory for non-specialists. Readers following the series will
by now have gained some experience in theoretical models and their
uses. Previous articles have gradually built up models of objects [1],
types [2] and classes [3] in the -calculus. Inheritance has been shown
to extend both type schemes [4] and implementations [5]. The most recent
article combined the type and implementation aspects of inheritance
[6], introducing the typed notation that will be used in the rest of
this series. The theoretical model has already revealed some interesting
differences between the notions of type and class. We have shown that
one group of languages, exemplified by C++ and Java, uses subtyping [2, 3, 5] as the basis for type compatibility, whereas another group,
exemplified by Smalltalk and Eiffel, uses subclassing [3, 4, 5], which
is a distinct formal relationship in the Theory of Classification.
In this article, we review the whole model so far, to demonstrate
more of its formal modelling power. The theoretical model is able to
represent
the whole spread of object-oriented concepts, such as objects, types,
classes, abstract classes and interfaces. It can handle the notion
of object creation, class extension through inheritance, type compatibility
and interface matching. We shall build up a simple model of a class
hierarchy, demonstrating all of these concepts. First, we shall briefly
review the elements of the model.
2 SIMPLE OBJECTS AND TYPES
Objects are modelled as simple records,
whose fields are values, representing attributes, or functions, representing
methods. Types are modelled
as record types, whose fields are the corresponding type signatures
of the attributes or methods. The following introduces a simple co-ordinate
type and an instance of the type:
 We use capitalisation to indicate type
names, and lower case for object names. The above coordinate is a
specific instance of Coord at the
location (2, 3). To construct such an instance, we can define the constructor-function
makeCoord:

This accepts a pair of
arguments a, b, and returns a record, corresponding
to a Coord instance. We can then create coord from first
principles by the constructor-call:

We access its fields using the “dot” syntax:
coord.x 3. The fields x, y have fixed values once
the object is
constructed. We can think of these fields either as constant public
attributes, or else we can think of them as “unary methods”,
functions that accept no argument and access an encapsulated value1.
We typically take the second view, so that all record fields can
be thought of as functions.
3 RECURSIVE OBJECTS AND TYPES
Objects are frequently recursive,
since methods may invoke further methods on self, the current
object. Types are also recursive, since
methods may accept or return arguments of the same type as the self-type.
These two kinds of recursion are related, but independent. We use a
standard technique in the -calculus to introduce and bind the recursion,
called fixpoint finding [1] and separate fixpoints are needed
at the object-level and the type-level. The technique involves the
use of
generators (also called functionals), which are functions
of the self-value (or self-type). The following introduces type- and
object-generators
for a more sophisticated Cartesian point type, which has a recursive
equal method:

The type
generator GenPoint is not yet a record type, but rather a
function of (the
self-type parameter) that returns a record type. We construct the recursive
Point type by binding recursively
to the function body, using the fixpoint
finder Y (see [1] for full details):

In this way, we
build the recursive Point type from first principles. Note how, once
the definition is complete, we may unroll Point (denoted
by the evaluation
step ) to a record type containing recursive reference to the Point type.
Likewise, the object generator genPoint is not yet an object instance,
but rather a function that returns this instance. It accepts two arguments: stands
for
the self-type and self for the self-value. To create a point-instance
from this generator, we need to supply a suitable type for and
then bind self recursively
over the rest of the function body. In the first step, we choose to supply the
type Point, and the returned result is a generator, a simple function
of self:
 In the second step,
this simple generator may be recursively fixed using Y:

to bind self over the rest of the body. Note again how, after
unrolling, the defined point instance contains recursive
references to point.
If we consider
that the type Point was itself the result of a fixpoint operation, we
could equally
define point in the style:

illustrating how it takes two fixpoints
to bind the recursions at object- and
type-level in each instance. From a theoretician’s point of view, this
presents some challenges. Bruce and Mitchell [7] were the first to show that
Y existed both at both the type- and object-levels, and prove that fixpoint operations
converged at infinity, provided that objects were records of functions. (Convergence
fails if an object has a field which directly returns self; however,
we can always
convert the identity method into a function accepting an empty argument – see
the earlier footnote1).
4 CONSTRUCTING OBJECT INSTANCES
In the previous article [6], we
settled on a style for defining all classes using type- and object-generators,
as illustrated above.
It turns out that this is
most useful, because we can create subclasses by adapting generators [4, 5],
in a style that mimics inheritance. However, the exact process of constructing
distinct object instances was not fully described. In fact, genPoint is a generator
for a specific instance, with point.x 3. To make this function
more general-purpose, we should force it to accept extra initialisation arguments:

Here,
initPoint is an extended version of genPoint, which
accepts a type , then a pair of Integer arguments a, b,
then the usual value for self. This function
can be used to construct objects having distinct x, y field
values in the following
style:

Note
the order of application in the creation of p1: first we supply
the precise type, Point; then we supply the initialisation
values (4, 5). This returns a
simple object generator, having the form: (self :
Point).{ … },
which we can fix using Y to
bind self recursively over the rest of the generator
body.
If we wanted, we could define a simple object constructor, makePoint,
in the same style as makeCoord (see section 2), which uses
the extended function
initPoint
internally:

In this, you can think of Y as taking a fixed snap-shot
of the flexible type-structure and object-structure represented by
the two generators.
Constructors in practical
object-oriented languages always create a specific instance of a specific type.
5
INVOKING OBJECT METHODS
Having created two distinct instances of Point,
we can simulate their behaviour in the theory, by evaluating expressions
representing method
invocations, such
as:

The steps shown above are mostly single evaluation
steps in the calculus, showing on the right which rule applied in each
case. In particular,
we use the record
selection rule to access the methods for equal and x, and the function application
rule when applying the result of p1.equal to the argument p2. The details of
primitive Integer and Boolean operations are omitted here. We assume that suitable
definitions exist for these types.
6 ROOTING A CLASS HIERARCHY
Within the theory, we can model the
notion of a class hierarchy, starting with a root class of all objects.
We shall later derive other classes
by extending
the root class. In this example, we shall assume that the root class only defines
a single method, equal, and that the default implementation of this method
is identity of reference, represented by ==.

The type generator GenObject describes the type-shape of the most
general kind of object, saying that it has an equal method.
The object generator
genObject provides the default implementation of the equal method,
and restricts the applicability of this method to arguments of the
type (o
: ranges
over the family of types expressed by the constraint: .
In earlier articles [3, 4] we described how this expresses exactly the notion
of a class,
a family
of related types. This constraint [8] accepts all record types that have at
least an equal method, with a type signature Boolean.
The
Point from section 3 above appears to match this pattern. It is possible
to derive Point’s generators from Object’s generators,
in the style that mimics the operation of inheritance in object-oriented
programming. Recall
that is union with override,
creating an extended record by combining a base record with a record of extra
methods [5]:

The bold
highlights indicate how the old type-generator and object-generator
are reused in the definition of the new, extended functions. At the type-level,
we added the types of the x, y fields. At the object-level, we added implementations
for these, and also redefined equal to compare the x, y field values, rather
than test for reference identity. The right-handed preference of causes
the new version of equal to override the default version provided in the
root class,
when the extra record is combined with the base record [5].
To establish that
the Point- and Object-classes are in a proper hierarchical relationship,
we need to show that the Point-interface is compatible with
the Object-interface. This is done formally using the Classify rule [4],
in which
we have to compare the two type-generators for a pointwise subtyping relationship:

We can show that the relationship holds
for a single dummy exemplar type t:

and by generalising t to all types ,
assert that the relationship holds everywhere. As a result, we maintain
that all types
belonging to the Point-class will
also belong to the Object-class.
7 INTERMEDIATE ABSTRACT CLASSES
We can also define the notion of
an abstract class. This is expressed by a pair of generators in which
full type information is given in
the type-generator,
but some of the implementation information is left undefined in the object-generator.
Earlier, we used to
represent the undefined value [1]. In -calculus,
any expression
which simplifies to is
the formal equivalent of raising an exception in a
practical programming language. This is also quite suitable for representing
abstract methods,
since it is an error to invoke them (instead, you would expect the abstract
methods to be overridden in a concrete subclass).
We seek to define a Shape-class,
the abstract ancestor of geometric shapes, such as Circle and Rectangle,
which provides a concrete origin method,
indicating its screen coordinates, but an abstract area method, for which
no implementation
can yet be given. Furthermore, we define this class by extension from the
root Object-class above.

In the type-generator GenPoint, the extra type-information
is highlighted in bold. The object-function initShape is written
in the same style as
initPoint, with extra initialisation arguments, since we want
to be able to supply the
initial Point coordinate p at which a Shape is
to be located. Given this argument
p,
we can define origin to return p, and can redefine equal to
compare the
origins of two Shapes – this in turn uses the equal
method defined earlier for Points. Note the use of in
the body of the area method, indicating that this
is so far undefined.
To demonstrate what happens in the model when you try
to invoke an abstract method, we exceptionally provide a simple constructor
for abstract Shapes,
so that we
can create an instance and invoke the abstract method (normally, no constructor
would be provided, since the class is abstract):

The result of attempting
to select an abstract method is the undefined value ?, equivalent to
raising an exception.
8 FINAL CONCRETE CLASSES
We now seek to derive a concrete Rectangle-class
as a subclass of the abstract Shape-class, with an implementation of
its area method. The
type-generator
declares the extra signatures for the width and height fields, while
the object-function supplies implementations for these, and also provides
a
concrete implementation
for the area method, and re-implements the equal method to compare
width and height, in addition to origin:

By the usual operation of , the right-hand,
concrete version of area is preferred, overriding the abstract version
inherited from the Shape-class;
likewise, the
new version of equal is preferred, replacing the inherited version.
The Rectangle-class is now fully defined, with suitable implementations
for
all of its methods.
In the object-function initRectangle, the initialisation-value
is of a different type than the argument supplied to initShape,
since we
need to
initialise
a rectangle with (all of) its origin, width and height values.
Formally, this
initialiser
is a single value, a tuple of the product type Point x Integer x Integer,
whose projections we access implicitly. Notice how the initialiser
passed back to
initShape in the inheritance-expression (on the left of ) is just
the p : Point value,
the first projection from the current initialiser. This models the
notion of passing back some construction arguments to a superclass.
We
can provide a simple constructor for Rectangle objects, in the usual
way, which fixes the object- and type-level recursions to construct
an instance
of this exact type:

and with this, we may construct
a number of distinct Rectangle instances:

The instances r1, r2 : Rectangle have different
values for their width and height fields, although they happen to share
their origin in this
example. Also, their
area and equal methods invoke further methods (recursively) on the
same instance,
as intended.
9 TYPE COMPATIBILITY AND INTERFACE MATCHING
The Rectangle instances
should be type-compatible with the Shape-class interface, and transitively
with the Object-class interface. Methods
defined for these
general classes should be type-correct when applied to instances of
the specific Rectangle type. One way of checking this is to see whether Rectangle is included
in the type-family of each class [4]. First, we want to see if Rectangle is
in the
class of Shapes:

and secondly, whether Rectangle is in the class
of Objects:

In both cases, the answer is yes, because Rectangle extends the interface provided by each superclass. In particular, the equal method was originally
declared in the Object-class, with the type: Boolean, so applying
this to a Rectangle instance simply produces the substitution: {Rectangle/ }
and yields the specifically-typed version: Rectangle.equal
: Rectangle Boolean. Similarly, the area method was originally declared in the Shape-class
with the
type: .area : Integer, so applying this
to a Rectangle instance produces the same substitution: {Rectangle/ }
and yields the specifically-typed
version: Rectangle.area : Rectangle Integer.
Looking at the unrolled
version of the Rectangle type above, we can see that the redefined
versions of these methods have exactly
the
same types.
So,
Rectangle matches all the expected superclass interfaces, as intended.
These interfaces
were defined as part of the type-information associated with a
class. However, some practical programming languages allow the
definition
of interfaces
that are not associated with any class.
In the theory, the concept
of an independent interface may be represented exactly by a type-generator,
without a corresponding object-generator.
If we wanted
to specify (for example) a Locatable interface for all object types
providing an
origin method, we could do so using just the type generator:

and then give Locatable variables the polymorphic
type: ?(? <: GenLocatable[?]).
It is clear that both the Shape and Rectangle classes satisfy this
interface, by the pointwise subtyping condition expected in the
Classify rule [4]:

and the interested reader
is encouraged to prove this, using the same kind of strategy as demonstrated
in section 6 above.
10 CONCLUSION
This article has focused on two main areas: how to
construct specific object instances, and the development of a simple
class hierarchy.
The overall
aim was to demonstrate how the Theory of Classification can model
a variety of
object-oriented
concepts, including objects, types, classes, abstract classes and
interfaces.
Regarding object construction, we showed how generators
can be extended into flexible object-creation functions with initialisation
arguments.
Furthermore,
initialisation values may be passed back to the superclass functions,
mimicking the behaviour of real object-oriented languages. From
a formal perspective,
creating a unique instance of an exact type always involves taking
a double fixpoint,
one for the type-generator and one for the object-generator.
Regarding
class hierarchy development, we gave examples of recognisable classes,
with mixtures of default, abstract and concrete methods.
Note especially
how the type- and implementation-aspects were able to evolve independently,
according
to need. A class may introduce a new method, may declare an abstract
method, or may re-implement an existing method, replacing it with
a more appropriate
version. All of this is handled uniformly within the theory. Furthermore,
we have shown that classes derived by inheritance (using generators)
give rise
to object constructors, which create objects of exact types (after
fixpoints are
taken). These exact types match the expected interfaces of their
superclasses, and also of separately-declared interfaces, where
appropriate. The
matching condition (Classify [4]) is exactly the same in both cases,
demonstrating
the economy of
the theory.
Footnotes
1 It is possible to think of
simple fields as functions accepting an empty argument, eg: x : Empty
Integer, and think
of all method invocations as supplying the empty value ? automatically,
eg: coord.x( 2. REFERENCES
[1] A J H Simons, “The theory of classification, part
3: Object encodings and recursion”, in Journal of Object
Technology,
vol. 1, no. 4, September-October 2002, pp. 49-57. http://www.jot.fm/issues/issue_2002_08/column4
[2] A J H Simons, “The theory of classification, part 4: Object
types and subtyping”, in Journal of Object Technology, vol. 1,
no. 5, November-December 2002, pp. 27-35. http://www.jot.fm/issues/issue_2002_11/column2
[3] A J H Simons, “The theory of classification, part 7: A class
is a type family”, in Journal of Object Technology, vol. 2, no.
3, May-June 2003, pp. 13-22. http://www.jot.fm/issues/issue_2003_05/column2
[4] A J H Simons, “The theory of classification, part 8: Classification
and inheritance”, in Journal of Object Technology, vol 2,
no. 4, July-August 2003, pp. 55-64. http://www.jot.fm/issues/issue_2003_07/column4
[5] A J H Simons, “The theory of classification, part 9:
Inheritance and self-reference”, in Journal of Object
Technology,
vol. 2, no. 6, November-December 2003, pp. 25-34. http://www.jot.fm/issues/issue_2003_11/column2
[6]
A J H Simons, “The theory of classification, part 11: Adding
class types to object implementations”, in Journal of
Object Technology, vol 3, no. 3, March-April 2004, pp. 7-19.
http://www.jot.fm/issues/issue_2004_03/column1
[7] K Bruce and J Mitchell, “PER models of subtyping, recursive
types and higher-order polymorphism”, Proc. 19th ACM
Symp. Principles of Prog. Langs., (1992), 316-327.
[8] P Canning, W
Cook, W Hill, W Olthoff and J Mitchell, “F-bounded polymorphism
for object-oriented programming”, Proc. 4th Int. Conf.
Func. Prog. Lang. and Arch. (Imperial College, London, 1989), 273-280.
About the author

|
 |
Anthony Simons is a Senior Lecturer
and Director of Teaching in the Department of Computer Science,
University of Sheffield, where he leads object-oriented research
in verification and testing, type theory and language design,
development methods and precise notations. He can be reached at a.simons@dcs.shef.ac.uk.
|
Cite this column as follows: Anthony J.H. Simons: “The Theory
of Classification, Part 12: Building the Class Hierarchy”, in
Journal of Object Technology, vol. 3, no. 3, May-June 2004,
pp. 13-24. http://www.jot.fm/issues/issue_2004_05/column2
|