1 INTRODUCTION This is the fourth article in a regular series on object-oriented type theory, aimed specifically at non-theoreticians. The "Theory of Classification" will explain the behaviour of languages such as Smalltalk, C++, Eiffel and Java in a consistent framework, modelling features such as classes, inheritance, polymorphism, message passing, method combination and templates or generic parameters. So far, we have covered the foundations of type theory and symbolic calculi. Along the way, we shall look at some further theoretical approaches, such as F-bounds and matching. In this article, we focus on the fundamental notion of subtyping. Figure 1: Dimensions of Type Checking Previous articles introduced the notion of type [1], motivated rules for constructing types [2] and compared different formal encodings for objects [3]. We consolidate here on the 2 OBJECTS AND OBJECT TYPES In the following, we refer to In the calculus, we define things using "name = expression" to associate names with equivalent values (or types, below). The names are simply convenient abbreviations for the associated definitions. Simple objects may be written directly in the calculus, for example an arbitrary instance may be defined as a literal record: The type of an object follows automatically from the types of the values used for its fields. A suitable corresponding type for the instance above may be given as:
Object types are modelled in the calculus as record types, that is, records whose fields consist of labels which map to types (note the use of ":" to indicate "has the type", rather than "" meaning "maps to the value"). We always use the term 3 RECURSIVE OBJECTS AND OBJECT TYPES Objects are frequently recursive, where they refer to their own methods, via in which self): and then constructing aPoint from this generator, using the fixpoint combinator Since it is inconvenient to have to keep on appealing to this construction in every recursive definition, we use the -convention to denote recursive definitions directly. Recursive object types may also be denoted using this convention. A suitable corresponding type for the instance above is given as: in which is the function's type parameter): and then constructing the recursive Point type using In the rest of this article, we will use the -convention throughout for recursive objects and recursive types. In later articles, object generators and type generators will acquire a new importance in the definition of the notion of class. It is theoretically sound to use 4 SIMPLE SUBTYPING Object-oriented languages take the ambitious view that, in the universe of types, all types are
These authors all identify substitutability with subtyping. In fact, subtyping is just one formal approach which satisfies the principle of safe substitution - there are other more subtle and equally formal approaches, which we shall consider in later articles. Henceforth, we shall use "<:" to mean "is a subtype of". The notion of subtyping derives ultimately from subsets in set theory. In exactly the same way that: x : X ("x is of type X") can be interpreted as x X, by the definition of subsets: While this translation works immediately for primitive types, the object substitutability principle is couched in terms of 5 FUNCTION SUBTYPING For a substitute object y : Y to behave exactly like the original object x : X, then every method of X must have a corresponding method in Y that behaves like the original method of X. If we are only interested in Consider a method f of X, which we shall call fX : D - f
_{Y}must be able to handle at least as many argument values as f_{X}could accept; we express this as a constraint on the domains (argument types): D_{Y}D_{X}; and
- f
_{Y}must deliver a result that contains no more values than the result of fX expected; we express this as a constraint on the codomains (result types): C_{Y}C_{X}.
A helpful way to think about these obligations is to consider how a program might fail if they were broken. What if f From this, we can motivate the function subtyping rule, which expresses under what conditions a function type D "If the domain (argument type) D The function subtyping rule is expressed formally using a single argument and result type. To extend this to methods accepting more than one argument, we recall the fact that a single type variable in one rule may be deemed equivalent to a product type in another rule [2]. In this case, the contravariant constraint applies to the two products as a whole, and so we need a product subtyping rule to break this down further: "The product type S 6 RECORD SUBTYPING The last piece of the subtyping jigsaw is to determine under what conditions a whole object type Y is a subtype of another object type X. Recall that object types are basically record types, whose fields are function types, representing the type signatures of the object's methods. According to Cardelli and Wegner [6], one record type is a subtype of another if it can be coerced to the other. For example, a Person type with the fields:
might be considered a subtype of a DatedThing type having fewer fields:
because a Person can always be coerced to a DatedThing by "forgetting" the extra surname and married fields. Intuitively, this also satisfies the LSP, since a Person instance may always be used where a DatedThing was expected; any method invoked through a DatedThing variable will also exist in a Person object. This motivates the first part of the record subtyping rule ( "If there are distinct labels ai, then a longer record constructed with n fields, having the types a Defining new types of object by extension is clearly a common practice in object-oriented programming languages. However, there is also the possibility of overriding some methods in the extended object. What requirement should we place on a record type if
is intuitively a subtype of the more general:
because the set of all PositiveCoordinates is contained within the set of all Coordinates (the positive subset occupies the upper right quadrant of the Cartesian plane). This motivates the second part of the record subtyping rule ( "A record that has n labelled fields of the types a "A longer record type {a To complete the picture, we must say a little about recursion and subtyping. The subtyping rules given above depend on having complete type information about all the elements that make up the types. One of these elements could be the self-type, in a recursive type. We cannot make definite assertions about subtyping between recursive types, unless we first make some assumptions about the corresponding self-types. Cardelli [7] expresses this (here slightly simplified, ignoring the context) as the rule: "If assuming that ." This rule sets up the relationship between the self-type variables and the types which depend on them. In a later article, we shall revisit the interactions between recursion and subtyping, which proves to be quite a thorny problem for object-oriented type systems. 7 CONCLUSION We have reconstructed the classic set of subtyping rules for object types, including rules for recursive types. These rules have the following impact on object-oriented languages that wish to preserve subtyping relationships. A class or interface name may be understood as an abbreviation for the type of an object, where the type is expressed in full as the set of method type signatures owned by the class (or interface). A subclass or derived interface may be understood as a subtype, if it obeys the rule of record subtyping. In particular, the subtype may add (but not remove) methods, and it may replace methods with subtype methods. A method is a valid replacement for another if it obeys the function subtyping rule. In particular, the subtype method's arguments must be of the same, or more general (but not more restricted) types; and its result must be of the same, or a more restricted (but not more general) type. Very few languages obey both the covariant and contravariant parts of the function subtyping rule (Trellis [8] is one example). Languages such as Java and C++ are less flexible than these rules allow, in that they require replacement methods to have exactly the same types. Partly, this is due to interactions with other rules for resolving name overloading; but also it reflects a certain weakness in the type systems of languages based on subtyping, which we will start to explore in the next article.
1 Apart from subrange types in Pascal, which were considered compatible with their base types. REFERENCES [1] A. J .H. Simons, [2] A. J. H. Simons, [3] A. J. H. Simons, [4] K. Bruce and J Mitchell, [5] B. Liskov, [6] L. Cardelli and P Wegner, [7] L. Cardelli, [8] C. Schaffert, T Cooper, B Bullis, M Kilian and C Wilpolt,
Cite this column as follows: Anthony J. H. Simons: "The Theory of Classification, Part 4: Object Types and Subtyping", in |