The Theory of Classification

Part 16: Rules of Extension and the Typing of Inheritance

Anthony J H Simons, Department of Computer Science, University of Sheffield, U.K.

COLUMN

1 INTRODUCTION

This is the sixteenth article in a regular series on object-oriented type theory for non-specialists. Earlier articles have built up -calculus models of objects [1], classes [2], inheritance [3, 4] and generic template types [5]. Classification describes the way in which typed objects fit into a hierarchy of classes, which nest inside each other [3]. Inheritance is the short-hand mechanism for definining a subclass by extension from another class, specifying only the additions and modifications to the base class [4]. Previously, we have modelled the inheritance of type [3] and implementation [4] and combined both of these in a model of typed inheritance [6]. We showed how short-hand inheritance expressions can be simplified to yield a canonical subclass definition that is type compatible with the base superclass from which it was derived. Further aspects of inheritance have included method combination [7], mixin inheritance [8] and inheritance among generic classes [5].

The current article examines the mechanism of inheritance in more detail, looking at the constraints on what may or may not be added to a class during inheritance. Most object-oriented languages have restrictions on the types of overriding methods, to ensure that the resulting subclass is still type compatible with the superclass. This requires more precise rules about the typing of , the inheritance operator. Previously, we thought of as a polymorphic map override operator that could combine two maps of any types, irrespective of the types of the fields. We now require inheritance to be properly typed, in the second-order F-bounded -calculus, so that we can restrict the kinds of extension that are deemed legal. The extended type resulting from inheritance is shown to be an intersection type [9].

2 MERGING RECORDS AND RECORD TYPES

In the Theory of Classification, we model objects as simple records of functions, representing their methods, and object types as the corresponding record types, containing the signatures of these methods. This leads to a model of inheritance based on record union with overriding, denoted by the operator . Intuitively, this creates a longer record by combining two shorter records:

in which the derived result contains the union of all the fields of base and extra, but fields from extra are preferred over any identically-labelled fields from base in the result [3]. This right-handed preference of models the notion of overriding in object-oriented languages, in which the derived class may replace some of the methods present in the base class with redefined versions supplied in the extra extension.

So far, we have always used in a context where the replacement fields have the same types as the original versions. As a consequence, we have been able to construct Derived record types using simple set union to merge the two sets of type signatures in the corresponding Base and Extra type-records:

We think of objects as sets of maplets from labels to functions, and object types as corresponding sets of maplets from labels to function types. It is reasonable to think of the merger of two record types as the set union of their respective sets of maplets, since any fields with identical labels will also have identical types.

Merging in the subtyping model

In the subtyping model [10], we must consider the possibility that fields of different types may be merged. This is because the record subtyping rule allows fields to be replaced by subtype fields. In this case, we cannot use to merge the record types. The following is a plausible definition of a record subtype by extension:

That is, we wish to obtain the subtyping relationship Car <: Vehicle. According to the record subtyping rule [10], this requires Car to have at least as many fields as Vehicle (it has one more) and requires any replacement fields to be subtypes. The field home : Location is replaced by home : Garage, so the subtyping condition is only met if Garage <: Location, which is plausible. We use , to combine the two record types, because we wish the right-hand types of any common fields to be retained in the result.

The above model could be used in languages like Java and C++, which are based on types and subtyping. However, in these languages, a replacement method is typically expected to have exactly the same type as the method it replaces. In more recent versions of C++, the type of this is allowed to be more specific in the result of a replacement method. The overriding rules of Trellis are closest to the subtyping model, allowing both method argument and result types to change in accordance with the function subtyping rule [10].

Merging in the subclassing model

In the subclassing model, we merge generators and type generators, rather than objects and object types [3, 4]. A curious thing happens when merging records according to the (F-bounded) parametric model: the parameters are instantiated or replaced before any record combination occurs. This means that references to different parametric types on the left and right hand sides become unified before record combination. As a result, record combination always merges records whose common fields have the same types. The following illustrates a simple type generator example, in which a (somewhat simplified) Integer-class generator is defined by extension from a Number-class generator:

We obtain the subclass relationship: ]. This is achieved by making sure that GenInteger has more fields than GenNumber and that the common fields are typed in terms of parameters which can be unified before record combination occurs. In the inner type-record combination expression, GenNumber[] causes the substitution of {} in the body of GenNumber, such that the record types on both sides of the union have field types which refer to the self-type uniformly as ; and, in particular, the common field equal has the same type: Boolean on each side of the union.

In fact, it is always the case, in the subclassing model, that self-type parameters are unified before record combination. Likewise, generic type parameters may be unified before combination [5] (see also 5.4 below). So, the simpler union of type-records appears to be all that we need in the subclassing model.

3 SUBSETS, SUBTYPES AND TYPE INTERSECTIONS

Throughout this series, we have been careful to distinguish the notation for the subset and subtype <: relationships. This is because the relationship between the two depends on whether we are thinking about the set of values in a concrete type, or the set of type signatures in a record type. These two set-theoretic constructions are different, and they correspond to different subtyping relationships.

Concrete versus abstract representation

The fundamental relationship is that types may be modelled as sets. When we assert that an element is of a particular type, this is equivalent (in the model) to asserting that the element is a member of a particular set:

From this it follows that a subtype may be modelled as a subset:

In the universe of types, we want to show that if x : S and S <: T, then x : T also. In the universe of sets, x T follows from x T, by the definition of the subset relationship:

This is the fundamental relationship, which applies to types defined concretely as sets. When we move to defining types abstractly, in terms of their syntactic signatures, then the relationship is different. A record type with more signatures denotes a subtype of a record type with fewer signatures. For example, if the following record types are defined:

then it is clear that S is the larger record type and contains the signatures of T, which we express as T S in the universe of signature-based types. However, it is also clear that S denotes a subtype of T, because every object that satisfies the interface S will also satisfy the interface T. The record subtyping rule (see [10]) expresses this fact.

Intensional versus extensional definition

There are grounds for confusion here: in one model, we say: S T; but in the other model, we say: T S. The difference is that, in the first model, we are comparing sets of values, but in the second model, we are comparing sets of type signatures. To see how these both ultimately reflect the same subtyping relationship, we have to distinguish the intensional and extensional definitions of a type.

The extension of a type is the enumeration of the set of elements that it contains, for example, the Boolean type has the extension: {false, true}
The intension of a type is the enumeration of the set of properties that characterise the type, for example, the (existentially defined [1]) Boolean type has the intension:

followed by a set of axioms defining the meanings of these operations.

To unify the concrete and abstract views of a type, it is easiest to imagine the extension of the type, that is, the set of values (or objects) populating the type. This is the usual view adopted in type theoretical treatments. In object-oriented programming, we usually characterise a class intensionally, that is, by its properties (type signatures). From this, we have to imagine the extension of the class, that is, the possible set of objects which could populate it.

Intersection types

Here, we try to establish the relationship between intensional (signature-based) types and extensional (value-based) types. Earlier, we modelled type extension as the union of type-records: Derived = Base Extra. In terms of sets of signatures, this means that Base Derived, that is, both Base and Extra contain a subset of the signatures of Derived, which is a longer record type. By the record subtyping rule [10], a longer record type with more field signatures is a subtype. According to this, the direction of the subtyping relationship is contravariant with the direction of the signature subsets: Derived <: Base and also Derived <: Extra. This is a fundamental property of type hierarchies: the larger the interface, the smaller the set of objects which may satisfy it.

From this, we may reason about the extensions of each type. Instances of the Derived type may also be considered instances of the Base type (and instances of the Extra type), by the subtyping rule of subsumption. So, the extension set of the Base type is larger than that of the Derived type; likewise the extension set of the Extra type is larger than that of the Derived type. Since elements of the Derived extension are also members of the Base and Extra extensions, the membership of the Derived extension is precisely the intersection of the memberships of the Base and Extra extensions.

For this reason, the kinds of types created by merging signature-based types are sometimes known as intersection types. Instead of writing: Base Extra (in the world of signatures), we write: Base Extra (in the world of sets), to denote the intersection of the Base and Extra types. Much of the fundamental research on this was done by Compagnoni and Pierce in the mid-1990s [9, 11]. They developed a type system called “System ”, pronounced “System F-omega-meet”, a higher-order type system with intersection types.

4 CONSTRAINING THE INHERITANCE FUNCTION

Many object-oriented languages have strict rules about method overriding, during inheritance, because they wish to preserve type compatibility (either subtyping, or subclassing) in the derived type. In C++ or Java, any replacement method must have exactly the same type as the original method it replaces. This imposes a constraint on the inheritance function, which we should like to capture in the model. We shall try to capture this constraint in a general enough way that it will apply both to the first-order subtyping model of inheritance, as found in Java, and also in the second-order subclassing model of inheritance, which is a more appropriate general model for object-oriented programming, in which polymorphic classes and simple types are actually distinct notions.

The extend inheritance function

Inheritance is only well-defined if the Extra record provides fields whose types “merge” with the types of the Base record. This “merge” condition is expressed as a constraint between the two record-types in the following F-bounded, second-order definition of the inheritance function extend, which we shall now use in place of the earlier unconstrained map override operator:

This definition says that: “extend takes two type arguments, Base and Extra, where Extra must satisfy the type-merge condition with Base, then two record arguments, base : Base and extra : Extra, and constructs a result by merging the two records, which has the intersection type (Base Extra). The result is a map of label-value pairs, such that the set of labels is the union of the domains of base and extra, and the values are preferentially taken from extra, if the label is present in extra, otherwise taken from base.” (Note that base(label) maps to the value opposite label in the base map [4]).

Readers will note that the body of extend is identical to the body of in earlier articles [4]. These two functions are essentially the same, except that extend is now properly-defined in the second-order -calculus, with type arguments (Base and Extra) as well as value arguments (base and extra). The type arguments were conveniently omitted from the earlier definition of , which we imagined could be applied directly to two record values. We can retrospectively define the operator in terms of extend:

This creates a simply-typed version of is just a short-hand for extend with two types already supplied.

The M type-merge condition

The all-important type-merge condition is a constraint that restricts the record-types that are allowed to be substituted for the Extra type argument. Although this is a rather special condition, constructed for the purpose of typing inheritance, it is syntactically no different from other kinds of restriction, such as the F-bound: to be a subtype of some generator expression. Here, we restrict Extra to range over those record types whose field-types enter into a particular relationship with the types of the Base fields. The constraint is defined as follows:

This says: “For all types Base and Extra, the type-merger condition Extra Base is defined as being satisfied if, for all common fields in Base and Extra with identical labels, the corresponding types are also equal”.

For this, we must assume that the notion of “type equality” is well-defined. In full, this might be expressed by a whole set of rules. For the model of inheritance used in the Theory of Classification, we require the following kinds of type equality:

where t is a simple type, is a type parameter, and S, T are metavariables ranging over simple types and type parameters. (Type rules sometimes use metavariables like this to save having to repeat the same rule for simple types and parametric types).

Constrained typed inheritance

The result of extend is well-defined if Extra Base (“Extra merges with Base”). This rule constrains inheritance just enough to behave exactly like typed inheritance in Java, but disallows certain other kinds of inheritance For example, the Trellis-style of inheritance in section 2.1 is now ruled out by the type-merge condition, because a field is replaced by a field which has a different type. The Base and Extra records have the types:

and the common labels in dom(Base) {home}. However, when we compare the corresponding types, we find that: Base(home) Location and Extra(home) Garage. So, we establish that common field-types are not identical: Base(home) Extra(home), and therefore that is not satisfied. To pass the type-merge condition, the Extra record would have to redefine the home field with the same type: home : Location, as in Java.

By deliberate design, the same type-merger rule allows the kind of unions of type-records we require for the merger of parameterised record types, which are used in the subclassing model of inheritance. Repeating the example from section 2.2:

The Base and Extra records have the following types, after the parameter substitution {}:

and the common labels in dom(Base) dom(Extra) {equal}. When we compare the corresponding types, we find that: Base(equal) Boolean, and: Extra(equal) Boolean. Intuitively, these two types are identical; formally we would need to appeal to the equality of two function-types (see 4.2) based on the identity of the two argument type parameters and the identity of the two simple Boolean result types. Ultimately, the condition is satisfied, so this is a legal extension.

5 VARIATIONS ON TYPED INHERITANCE

The standard “reference” model of inheritance consists of the extend inheritance function and the type-merger constraint. This allows a record to be extended only if overriding fields have the same types as in the original fields they replace. The resulting intersection type is always a record-type consisting of the union of the signatures of the Base and Extra record types, since common fields have the same types. We now consider a number of object-oriented languages and examine how their models of typed inheritance differ from this reference model.

Inheritance in Smalltalk

Smalltalk is not strongly typed. However, certain rules are still observed about inheritance. A method can only override another method if its untyped “signature” is structurally similar, for example, the method at:put: always has the structural form:

Any method in a descendant class must have the same name and structural form in order to override this method. So, the “arity” of method arguments and results is always preserved, although nothing can be said about the individual types of each argument or result. Smalltalk can distinguish product types from basic types , but apart from this, all basic types (and parameters, considering that self has an F-bounded parametric type) are indistinguishable, and so must be considered equivalent. So, for Smalltalk, we should have to redefine the notion of type equality to allow s = t = for all simple types s, t and all parameters .

Inheritance in Trellis

The type-merger condition above is too restricting to describe inheritance in Trellis. Trellis allows full subtyping in its overriding rules, that is, methods may be replaced by other methods whose arguments have more general types and whose results have more specific types, according to the contravariant and covariant parts of the function subtyping rule. To handle Trellis, we should modify our definition of :

This now allows field types in the Extra record to be subtypes of common fields in the Base record. The resulting intersection type Base Extra may contain finer intersections of field types, for example, the extension of Vehicle in 2.1:

requires the nested intersection: Location Garage = Garage. (Constructively, Garage is the largest type which is a subtype of both Garage and Location).

Inheritance in Java and C++

The original type-merger condition describes exactly the constraint on inheritance in Java, in which all replacement methods must have exactly the same types as the methods they replace. This strict equality nearly describes the condition in C++, apart from the relaxation that applies to returned self-types. We can express this relaxation as:

saying that replacement methods must have the identical types unless they return the self-type, in which case methods of the function type: Base must be replaced by methods of the function type Extra). The resulting intersection type Base Extra will be a subtype of Base.

C++ may also have type parameters in its method signatures, if the template class mechanism is used (and so will Java, from version 1.5 onward). The notion of type equality must therefore allow for the comparison both of exact types and type parameters (see 4.2).

Inheritance in Eiffel

The overriding rules of Eiffel allow methods to be replaced by methods whose arguments and results are both uniformly specialised. This is not legal within a simple subtyping regime; but Eiffel is not based on the subtyping model of inheritance. Elsewhere, Eiffel implicitly evolves the self-type (the type of current) under inheritance and anchors other types to the self-type, especially in binary methods¹ such as the infix “+” method in the Numeric class:

Because of this, it is tempting to think of Eiffel as following the F-bounded subclassing model of inheritance, in which “like current” is actually a parametric type ]). Eiffel also has generic and constrained generic parameters:

which are exactly the same notion as F-bounds. Think of the constrained type parameter T as a parametric type: <: GenComparable[]). So, it makes most sense to think of Eiffel as belonging to the second-order family of languages, along with Smalltalk and Flavors.

This being the case, the reference definition of type-merge is adequate to capture Eiffel’s model of inheritance. You simply have to imagine that all Eiffel class-types are in fact parametric types, which are only fixed when object instances are created. The model of inheritance unifies all the type parameters before combining the records. We illustrate this with a parametric version of the example from 2.1 above:

The subclass generator GenCar reintroduces all the parametric types used within the class, and substitutes these new parameters inside the body of the parent generator, through the application: GenVehicle[] before merging this adapted record with the record of extra methods. So, all common fields have the same types before record combination is computed, and the simple union of signatures is all that is required. The notion of type equality must allow for equality of simple types (such as Eiffel’s Integer and Real types) and equality of type parameters for all class-types (see 4.2). Simons first proposed a unified parametric model of Eiffel’s type system in 1995 [12], in which the self-type, anchored types, constrained generic types and ordinary class-types were all modelled using F-bounded parameters.

6 CONCLUSION

In this article, we have revisited the notion of typed inheritance. The Theory of Classification describes two models of inheritance, one a first-order model based on subtyping (Java, C++) and the other a second-order model based on subclassing (Smalltalk, Eiffel). Objects are modelled as records, or maps from labels to methods, so inheritance may be modelled as map union with override. Previously, the classical function override operator was used without any constraints on the types of the records being combined. Here, we have introduced an F-bounded second-order definition of the inheritance function, called extend, with a constraint on the type of extension that may legally be combined with any record.

We showed how, in the reference model, the constraint merely has to ensure that replacement fields have the same types as the fields they replace. This works for Java-style inheritance (first order) and also for Eiffel-style inheritance (second-order) in which field types may be parametric as well as simply-typed. Variations on this allow replacement fields to be subtypes (Trellis), or a mixture of type-equal and subtype fields (C++). One observation emerging from this is that the ability to replace fields with subtype fields is not a frequent requirement in object-oriented languages. The subclassing model of inheritance only requires type-equality, because all the field types are unified prior to combination, whether by parameter unification [3], or instantiation [5]. Simons and Bruce were the first to note the poor match between simple subtyping and natural models of inheritance [13, 14]. This is what originally motivated the Theory of Classification.

Footnotes

¹ A binary method is one which accepts an argument of the same type as self. It is binary in the sense that it deals with two objects of the same type.

REFERENCES

[1] A J H Simons, “The theory of classification, part 3: Object encodings and recursion”, in Journal of Object Technology, vol. 1, no. 4, September-October 2002, pp. 49-57. http://www.jot.fm/issues/issue_2002_09/column4

[2] A J H Simons, “The theory of classification, part 7: A class is a type family”, in Journal of Object Technology, vol. 2, no. 3, May-June 2003, pp. 13-22. http://www.jot.fm/issues/issue_2003_05/column2

[3] A J H Simons, “The theory of classification, part 8: Classification and inheritance”, in Journal of Object Technology, vol. 2, no. 4, July-August 2003, pp. 55-64. http://www.jot.fm/issues/issue_2003_07/column4

[4] A J H Simons, “The theory of classification, part 9: Inheritance and self-reference”, in Journal of Object Technology, vol. 2, no. 6, November-December 2003, pp. 25-34. http://www.jot.fm/issues/issue_2003_11/column2

[5] A J H Simons, “The theory of classification, part 13: Template classes and genericity”, in Journal of Object Technology, vol. 3, no. 7, July-August 2004, pp. 15-25. http://www.jot.fm/issues/issue_2004_07/column2

[6] A J H Simons, “The theory of classification, part 11: Adding class types to object implementations”, in Journal of Object Technology, vol. 3, no. 3, March-April 2004, pp. 7-19. http://www.jot.fm/issues/issue_2004_03/column1

[7] A J H Simons, “The theory of classification, part 10: Method combination and super-reference”, in Journal of Object Technology, vol. 3, no. 1, January-February 2004, pp. 43-53. http://www.jot.fm/issues/issue_2004_01/column4

[8] A J H Simons, “The theory of classification, Part 15: Mixins and the superclass interface”, in Journal of Object Technology, vol. 3, no. 10, November-December 2004, pp. 7-18. http://www.jot.fm/issues/issue_2004_11/column1

[9] A Compagnoni and B Pierce, “Multiple inheritance via intersection types”, Technical Report ECS-LFCS-93-275, University of Edinburgh, (Edinburgh: LFCS, 1993).

[10] A J H Simons, “The theory of classification, part 4: Object types and subtyping”, in Journal of Object Technology, vol. 1, no. 5, November-December, 2002, pp. 27-35. http://www.jot.fm/issues/issue_2002_11/column2

[11] A Compagnoni, “Subtyping in is decidable”, Technical Report ECS-LFCS-94-281, University of Edinburgh, (Edinburgh: LFCS, 1994).

[12] A J H Simons, “Rationalising Eiffel’s type system”, Proc. 18th Conf. Tech. Object-Oriented Lang. and Sys., eds. R Duke, C Mingins and B Meyer, (Melbourne : Prentice Hall, 1995), 365-377.

[13] A J H Simons, “A language with class: The theory of classification exemplified in an object-oriented language”, PhD Thesis, University of Sheffield (Sheffield, Department of Computer Science, 1995).

[14] K B Bruce, A Fiech and L Petersen, “Subtyping is not a good “match” for object-oriented languages”, Proc. European Conf. Obj-Oriented Prog. 1997, pub. LNCS 1241, (Jyväskylä: Springer Verlag, 1997) 104-127.

About the author

Anthony Simons is a Senior Lecturer and Director of Teaching in the Department of Computer Science, University of Sheffield, where he leads object-oriented research in verification and testing, type theory and language design, development methods and precise notations. He can be reached at a.simons@dcs.shef.ac.uk.

Cite this column as follows: Anthony J.H. Simons: “The Theory of Classification, Part 16: Rules of Extension and the Typing of Inheritance”, in Journal of Object Technology, vol. 4, no. 1, January-February 2005, pp. 13-25. http://www.jot.fm/issues/issue_2005_01/column2

Previous column

Next column