1 INTRODUCTION This is the seventh article in a regular series on object-oriented type theory, aimed specifically at non-theoreticians. So far, we have built up a model of objects as simple records, which are instances of corresponding record types [1]. Initially, we took the seemingly attractive view that a programmer's We shall discover circumstances in which a type system based on subtyping
breaks down, providing less than useful information. Object-oriented
languages like Smalltalk and Eiffel exhibit sophisticated, systematic
kinds of behaviour which cannot adequately be described in terms of
types and subtyping. By appealing to natural notions of classification
in biology, we shall demonstrate the extent to which the subtyping model
fails to capture the intuitive notion of a class. In this article, we
shall define the notion of class formally, and prove that it is more
than just a type. To understand this, we will need to extend our formal
model to include type polymorphism. This requires the 2 THE PROBLEM OF RECURSIVE CLOSURE There used to be a popular rhyming couplet that joked about the terminology used in the biological classification of the animal kingdom:
While every species has young of its own kind, the biologists seemed to have left some holes in the taxonomy, surely an oversight! What is more disturbing is that most object-oriented languages will assert that Dog, Cat and Guppy, by virtue of being kinds of Animal, all mate with an Animal and produce an offspring, which is (you guessed) an Animal, something forced upon recursive types by the rules of subtyping. To see why, let us express the reproducing behaviour of all animals (ignoring litters of offspring) formally as: This defines Animal as a recursive record type whose methods (most of which are not shown) include the showing that an Animal mates with an Animal and produces an Animal offspring, suitably capturing the general notion. Intuitively, we should like to introduce the Animal subclasses Dog, Cat (and even Guppy), such that these creatures all mate with, and produce young of their own kind, in the uniformly specialised style: We might expect Cat and Dog to be subtypes of Animal; however this is not the case, since they both redefine the signature of the This ensures that Dogs produce puppies, but still allows a Dog to mate with any kind of Animal, which seems intuitively wrong, but is formally correct by the rules of subtyping. The Animal type declared that its In general, recursion interacts poorly with subtyping. If any type T has a method that is This is one reason why redefined methods cannot change their types in languages with both type recursion and subtyping. Consider that in Java, the 3 THE PROBLEM OF TYPE LOSS A different but related problem arises when recursively-typed methods are inherited and invoked in a subtype object. So far, we have not modelled the notion of inheritance in any detail, but let us invent a simple rule to create subtypes by record extension [2]. Since records are merely maps from labels to values, and maps are really just sets, we can combine records using set union. Let us assume that we wish to define a hierarchy of numeric types, and that the basic Number type provides a primitive notion of addition: after unrolling the recursion. We can seek to derive other numeric types by extending this, yielding for example the Natural, Integer, Real and Complex numbers. In particular, the Integer type offers a full range of arithmetical methods: This defines the Integer type by extending the Number type with a record of additional fields, and then fixing the recursion. After unrolling Number to yield the corresponding record type, we can compute the union of fields, yielding the recursive record type: after unrolling the Integer recursion. Curiously, while the locally declared methods are all typed in terms of Integer, the inherited fails even to typecheck! This is because the sub-expression Type downcasting is typically considered a last resort, a dirty trick to be used on occasions when the natural type system doesn't help. Here, we have shown how type downcasting has to be used systematically all the time to overcome deficiencies in the type system. This is a strong indicator that subtyping is not the most appropriate formal model for object-oriented languages. Instead, we need a more expressive type system. 4 QUANTIFICATION OVER TYPES Working backwards from the desired goal, it seems that our intuitive notion of classification requires a type system in which recursive types can have methods that are closed over their own type, but which are nonetheless related to each other in some systematic way. We want to be able to support families of related types that behave in similar ways, such as the numeric types which all provide addition: and somehow be able to assert that these all belong to the which says that “all types .” This is still not quite right, since we want Universal quantification was adopted independently by Girard [4] and Reynolds [5] as a way of introducing 5 SIMPLY-TYPED AND POLYMORPHIC CALCULUS A zero-order system has no variables, but only sets of values. A first-order system has functions, whose bound variables range over simple values. A second-order system has functions, whose bound variables range over both values and simple types. The -calculus is the Recall that "x.x to anything, such as integers, characters or even other functions (in which case we would have a We may attach simple types to the function's arguments in the The difference here is that the types of the formal argument and the actual value must match, otherwise the application is deemed illegal, a type error. We can say that this identity is a In the Identity now expects a type argument: , since it applies to any type and returns a result of the same type. 6 GENERIC OBJECT TYPES This kind of construction can be used to extend our formal model of object types and allows us to define polymorphic types (generic, or templated types). Let us start with a As before, * is a recursive placeholder for the eventual IntegerStack. We may modify this definition to create a polymorphic type if we replace occurrences of Integer by a type parameter. We must introduce the parameter at the head of the type definition: Here, will be bound to some actual type. To see how this works, we can apply Stack to the Integer type ( and the result we obtain is identical to the IntegerStack type from above, after substituting {Integer/} in the record body. This is interesting, because we may apply Stack to any type we fancy, such as Stack[Character] or Stack[Boolean], creating specific instantiations of the polymorphic type. There is no restriction on the actual element type we could supply, so this kind of polymorphism is sometimes known as which we read as: "for all types ", and similarly for the other methods. Clearly, the generic Stack type expresses something about a family of related Stack-types, but this is still not the notion of 7 FUNCTION-BOUNDED QUANTIFICATION It was Cook [6, 7] who first realised that in order to model a class as a polymorphic family of related types, the key lay in making the self-type flexible, so that it could refer to a different actual type in every member of the type family. In an earlier article [1] we introduced Type generators can be used, exactly like type functions above, to create different instantiated versions of a parameterised record type. To see how this works, we revisit the Number type, but this time express it as a type generator, in which the self-type is not recursively fixed, but is a parameter introduced by : GenNumber is a generator for a family of related record types which have the general structure of numbers with a plus method. To show this, we can apply GenNumber to other numeric types, and this has the effect of adapting the self-type *, which is substituted by whatever type-argument we supply: This looks promising, in that we are able to construct record types with a The ideal type for an Integer is a recursive type whose methods are all closed over its own type, which we can unroll to a record type expressed in terms of Integers: Although this type can never be a subtype of the recursive Number which we can demonstrate by unrolling Integer to a record (on the left-hand side) and then evaluating GenNumber[Integer] (on the right-hand side) and comparing the two records: This satisfies the record subtyping rule [2]. The left-hand side contains more fields than the right-hand side, a simple case of record extension. It turns out that all the other numeric types (with more methods than Number) can be shown to enter into a similar relationship with a suitably-adapted version of the generator, for example: and it follows intuitively that any type
satisfying:
<: GenNumber[]
belongs to the family of numeric types which share at least the plus-method.
From this, Cook realised that a class is a polymorphic family of types
that satisfy a constraint, or meaning "all those types which are subtyes of the adapted GenNumber generator". What is unusual about this special kind of quantification is that the parameter appears on both sides of the <: subtyping constraint; but it turns out that this is exactly what is necessary to express the notion of a family of recursively closed types that have a shared minimum structure. A suitable polymorphic type for the The constraint ] is known as a function bound, or F-bound for short. F-bounded quantification was a revolutionary discovery, because it captured exactly the kind of polymorphism present in object-oriented languages, in which methods apply to families of types sharing a minimum common structure. 8 CONCLUSION We have formally defined the notion of class, using Cook's F-bounded quantification to express the idea that a class is and even satisfies natural intuitions about biological classification in which animals reproduce their own kind. Mathematics, as someone once said, is pure poetry. REFERENCES [1] A J H Simons, “The theory of classification, part 3: Object encodings and recursion”, in [2] A J H Simons, “The theory of classification, part 4: Object types and subtyping”, in [3] A J H Simons, “The theory of classification, part 5: Axioms, assertions and subtyping”, in [4] J-Y Girard, “Interpretation fonctionelle et elimination des coupures de l'arithmetique d'ordre superieur”, [5] J Reynolds, “Towards a theory of type structure”, [6] W Cook, “A denotational semantics of inheritance”, [7] P Canning, W Cook, W Hill, W Olthoff and J Mitchell, “F-bounded polymorphism for object-oriented programming”,
Cite
this article as follows: Anthony J.H. Simons: “The Theory of
Classification. Part 7: A Class is a Type Family”, in |