The Theory of Classification

Part 11: Adding Class Types to Object Implementations

Anthony J H Simons, Department of Computer Science, University of Sheffield, U.K.

COLUMN

1 INTRODUCTION

This is the eleventh article in a regular series on object-oriented type theory, aimed specifically at non-theoreticians. In the Theory of Classification, we have so far considered the typeful aspect of classes [1, 2] and their implementation aspect [3, 4] separately. We have been concerned to point out how the notion of classification has a fully formal interpretation [2], in which, at the typeful level, a class is distinct from a type [1]. Likewise, we have explored the implementation level, in order to understand the operation of inheritance on objects [3] and give a precise meaning to the pseudo-variables self [3] and super [4] in different object-oriented languages.

Eventually, we must link the type and implementation aspects together, since this is how type rules are properly presented [5]. In a type rule, the aim is to be able to derive the resulting type of some expression, given the types of the values that make up this expression. We would like to show, for example, that the result of extending an object with extra fields is itself a well-typed expression. To do this, we must somehow attach the class-type information to the object-values. Furthermore, we introduced a special operator to model inheritance [3]. We must also show that inheritance itself is a well-typed operation. This will involve examining the types of the objects that we pass as operands to , which was defined in a polymorphic way.

We started with a calculus of class-types [1, 2] and developed a separate, but related, calculus of object-values [3, 4]. In this article, we seek to develop a calculus of typed objects, in which the type information is attached to the object information. With this, we shall be able to infer the types of object definitions created via inheritance.

2 LINKING VALUES AND SIMPLE TYPES

To introduce the new typed calculus, we shall review the different -calculus styles presented so far. Imagine a functional language (like Lisp, ML, or the functional subset of C) in which we want define our own negate function to flip the sign of a number. In the untyped -calculus, we can define negate as follows:

since it takes an argument x and returns a body, in which x is negated using primitive minus “-” (which we assume exists already). What is the type of this function? So far, this is not specified – we could be negating an Integer, a Real or even a Natural (unsigned!) number, or worse still, something which is not even a number. Let us further assume that we want negate to apply to Integers, rather than any of the other types. To assert this, we give negate a type signature and attach type information to the implementation of the function:

The type declaration says that negate takes an Integer and returns an Integer result. On the next line, the implementation is given in the simply typed -calculus, in which (x : Integer) declares again that the argument x is of the Integer type. In this style of writing, we don’t bother to annotate the type of the function body explicitly, since the result type was declared beforehand. It could also be inferred using other type rules for “-”, which are not shown here.

The main thing to note is the style of declaration. A typed function is always declared by giving its type signature, then defining its implementation, in which type information is attached to the argument variables. We first discussed this in the earlier article [1].

3 LINKING VALUES AND POLYMORPHIC TYPES

We now want to generalise the typing of negate, to indicate that it can flip the sign of all kinds of numeric types. The polymorphic typed -calculus allows us to define functions that accept both type- and value-arguments. We could define a very general version of negate as follows:

This is rather more general than we actually want. In the type signature, it says that negate is defined for all types , then accepts an argument in this type and returns a result of the same type. We shall fix this later, so that negate only applies to signed, numeric types. For the moment, note how the implementation is prefixed with an extra type parameter: . This says that negate accepts an actual type argument, followed by a value in this type. This means that we have to apply negate to two arguments, first a type, then a value of that type:

To distinguish type-application from value-application, we conventionally use [] to supply type arguments and () to supply value arguments. Type-application is equivalent to instantiating the type of the function. This follows naturally from the rules of -calculus: by applying negate to a type Integer, you substitute the actual type for the parameter: {Integer/} in the function body. The body is everything to the right of . So, the value returned after type-application is identical to a simply-typed version of the function (like that shown above):

in which the type of x is now fixed. This typed function may now be applied to a value of the appropriate type.

In the theoretical model, we always have to supply the desired type of the function, before we can apply it to a value of this type. We cannot perform type-inference in the style: negate (3 : Integer), because this breaks the convention on the ordering of arguments in the declaration. The main thing to note is that the type parameter is always introduced before the value argument, so these arguments are always supplied in this order. We first discussed this idea in [1].

4 TYPE PARAMETERS AND KINDS

Since we are now dealing with a typed calculus, what is the “type” of the parameter also have a meta-type, known as a kind. This is the “next level up” in the type system. We could show the meta-type of variables like by introducing them in a style in which the kind is explicit: :: TYPE), to indicate that is a type parameter which can range over all types in the set TYPE. However, since the second order, polymorphic typed -calculus only has one main kind (the set of all simple types, TYPE), we shall later omit mentioning TYPE explicitly. For a discussion of orders of calculus, see the earlier article [1].

Above, we noted that we wanted to restrict the type of the negate function, such that it applied only to the signed, numeric types. This can be done by filtering the set of possible types in TYPE to those of interest. Let us assume that there is a type-filtering function Filter-Signed that returns true only if the type is a numerical, signed type. We can define a signed, numerical subset of all types:

This defines SIGNED as “all those types ] is true”. It should be clear that SIGNED TYPE. We can then express the type of negate as:

In the definition of negate, the type variable ranges only over those types in the SIGNED subset. Restrictions like this are extremely useful in object-oriented programming, where we wish polymorphic methods to apply only to certain sets of types. A set of types sharing some common structure is a class [1, 2] in our Theory of Classification.

5 GENERATORS USED AS CLASS TYPE-FILTERS

In earlier articles [1, 2] we introduced the notion of a function bound, often abbreviated to F-bound [6, 7], to describe a similar restriction. Literally, a bound means a restriction, and a function bound is a restriction expressed using a function. Let us define a type function, a record type generator, for a simple class of two-dimensional Cartesian Points:

This expresses the interface of a family of Point-like types that have at least the three methods x, y and equal. The generator parameterises the self-type , which eventually could stand for different types of Point, such as a Point3D [4] or a HotPoint (a selectable Point, see below). We can use this type generator as a filter to restrict the polymorphic application of these methods to only those types which could be considered at least “some kind of Point”.

Recall that the typeful notion of a class is a group of (possibly recursive) types sharing a minimum common structure. We may express the class of Points as: ]), because it restricts the types over which can range to those types which are a subtype of the instantiated generator GenPoint[]. Earlier, we found that an extended interface is a subtype [2, 8], so this captures precisely the object-oriented notion of families of object-types that share a minimum common set of methods. We may now give the methods x, y and equal a type signature which restricts their applicability to the class of Points:

These type signatures say that the methods are selected only from those types which satisfy the membership criteria of the Point class. Note in passing how the equal method is a binary method, accepting another argument of the same type as the owning object itself. We shall be interested to see how the type of a binary method evolves, when inheritance comes into play.

Formally, an F-bound is always expressed using a subtyping constraint: ]), for some type generator function G. For comparison with the previous section, this can be thought of as a type filtering constraint: ]), where F is defined as: F = ]).

6 LINKING OBJECTS AND CLASS-TYPES

We are about to define a typed object generator for a class of Points. We introduced type generators in [1] and object generators in [3]. This time, we are going to attach type information to the object generator for a Point instance at the co-ordinate <2, 3>. We proceed exactly as in the sections above, by first declaring the type signature, then giving the full typed definition:

At first sight, this may look rather daunting! In fact, it is no more complex than the style of typed definitions given above. To motivate this structure, we shall build up to it more slowly.

Recall that an untyped object generator [3] is a function of self, whose body is a record describing the method implementations of an object instance. Our first version of the generator (omitting all details of the actual methods) is:

If we wish to add types to this, we must prefix the value-argument , where stands for the type of self. We shall also attach the type explicitly to the self-variable. Our second version is a universally-typed generator:

in which still ranges over all types in the universe of types. We want to restrict so that it ranges over only those types in the class of self. To do this, we must have a separate type generator Gen, which has a type-shape matching the value-shape of the object generator, gen. We can then use it as a filter on the type parameter , giving a third, F-bounded version of the generator:

This is now in the same form as the typed object generator genAPoint, above. To see the correspondence, note how the second line in the definition of genAPoint introduces first a self-type parameter: ]), then the self-argument: ), followed on the third line by the record-body, representing the implementation of the Point instance. This follows the general form of second-order typed definitions: first, we introduce the type parameter, then the value parameter, then the body of the function.

The type signature for genAPoint also deserves some discussion. It says that genAPoint is well-defined for all types in the class of Points: <: GenPoint[]), and then that it accepts a value (ie an actual value for the self-argument) in the type ]. This does in fact accurately describe the type of the record body. If we supply some standard p:Point as the self-argument, we get a result with the type: GenPoint[Point]¹ . If we supply some more more specific hp:HotPoint as the self-argument, we get a result with the “truncated” type: GenPoint[HotPoint]. By “truncated”, we mean a type that looks like a HotPoint, but with only those methods that were listed in the Point-interface.

7 STRONGLY TYPED INHERITANCE

To study the workings of inheritance when types are added, we shall attempt to extend the typed object generator for a Point to yield a typed object generator for a HotPoint, a selectable kind of point. As before, we shall first provide a type generator for the HotPoint type (we shall need this later to express F-bounds). GenHotPoint can be defined by extension, based on the GenPoint type generator. The additional fields include the types of the new method selected and redefined method equal (which, in a HotPoint, must also compare selected states):

The simplified form of this type generator is well-formed, after computing the union of fields:

The only interesting consideration is what happens with GenPoint[]. This causes a substitution of type parameters in the body of GenPoint: {}, and has the consequence that all references to the self-type are uniformly changed to , before the union of fields is computed. This means that the identically-typed equal method type appears twice, once on either side of the operator, but only one copy is retained after the union.

We are now about to define a typed object generator for an instance of HotPoint, at the coordinate <2, 3> and whose selected state = true. We shall attempt to derive this by inheritance, in accordance with the model given in the earlier articles [3, 4]. This time, however, we shall be careful to attach type information, in the style presented above, to all parts of the definition. First, we give the type declaration, then the full definition of genAHotPoint:

This looks fabulously complicated! However, if you mentally put on one side the whole body expression inside the bold parentheses, the prequel leading up to it is in exactly the same form as all our other definitions. First, the type signature of genAHotPoint is given. Then, on the second line, its full definition is given, starting with the type parameter and value parameter self, followed by the body (everything contained within the bold parentheses).

Looking now at the body expression, this is exactly the construction we used to explain super-method combination in the previous article [4], except that types have now been attached to all value parameters. The body is a nested function application that first binds super, then performs a record combination using the operator [3]. We shall want to simplify this (viz evaluate the combination expression), to assure ourselves that we have in fact defined a suitable generator for a HotPoint instance. However, we must first establish whether the body expression is properly typed.

8 TYPE SOUNDNESS OF SUPER

We first want to satisfy ourselves that the binding of super is type-sound. From the previous article [4] we learned that super is an adapted form of the parent object, in which self-reference is redirected to refer to the child² . Does our typed model reflect this faithfully?

The type and binding of super

At the start of the body, the super variable is declared with the type: (super : GenPoint[]). So, the type of super is structurally “like” the type of the parent Point, except that, within this structure, the self-type is replaced by ] is a “truncated type” in which the self-type refers to a HotPoint, but which offers only those methods available to a Point. You can think of a generator as a mask, and GenPoint[HotPoint] “masking out” all the methods of HotPoint, that were not in the interface of a Point (viz in the body of the GenPoint-generator). This is the appropriate type to give to super, since it captures exactly the type of a “mofidied parent instance” in which self is redirected to refer to the child [4].

To understand what is happening inside the body expression above, notice how it consists of a super-function, (super : GenPoint[]).(…) which is applied to an object, denoting the value to bind to super, given right at the end of the body expression (mentally skip over the body of the super-function, which consists of the record-combination expression). This super-object is given at the end by the expression: genAPoint [] (self).

The next question we must ask is: does the super-variable receive an object-value of the right type? We need to work out the type of the expression: genAPoint [] (self) and see if this corresponds to something with the declared type: GenPoint[]. The super-object is clearly constructed from the typed object generator genAPoint, after supplying a type argument and a value argument self : . The result of this is a record, the body of the generator genAPoint (see section 6 above). The type of the result was declared in the type signature: <: GenPoint[]). ], which says that, by supplying and self : , we obtain a result having the type: GenPoint[]. This is exactly the type of object expected for super, above.

However, before we assume this happy outcome, we should check first whether it is actually type-safe to apply the generator to the type parameter , and value parameter self : Are these suitable types and values for this generator?

Rebinding type parameters

We will look at the type substitution first. The generator genAPoint was declared to be safe with all types satisfying <: GenPoint[] is simply a matter of substituting one type parameter for another: {}. All type parameters have the same kind, TYPE, so this should not be a problem. However, a more subtle thing is happening. By substituting {}, we are changing the restriction on the types which may instantiate the parameter. The parameters implicitly carry attached type constraints (from the F-bounds), so we have to worry about whether changing these makes a formal difference.

Although we cannot compare two type parameters directly, we can make a judgement about all the types which could possibly instantiate the respective parameters. Fortunately, it turns out that any type we could supply for will also satisfy the type constraint on . This is because of the point-wise subtyping condition between the two type generators:

which lies at the basis of the Classify rule in [2]. Because the type generators GenHotPoint and GenPoint stand in the right structural relationship, we can safely replace the self-type of the parent class by the self-type of the child class.

Rebinding value parameters

We will now look at the value substitution. The second argument in genAPoint [] (self) is a self-reference that refers to a HotPoint instance, with the type: self : . The generator genAPoint was originally declared to accept a value parameter having the type: self : <: GenPoint[ by a new parameter: <: GenHotPoint[]. From section 3 above, we know that this has the effect of re-typing the body of the function. All former references to are now replaced by , so the value parameter self : has been modified to self : . We may therefore supply an actual argument of this type, directly.

9 TYPE SOUNDNESS OF INHERITANCE

We now want to satisfy ourselves that the record combination expression with , which models the extension of an object by inheritance, is itself type sound. This expression is the whole body of the super-function, which we skipped over, above:

in which super is now bound, and refers to the super-object described in the previous section. Also, self refers to the self : introduced as the value parameter in the generator genAHotPoint. In order to understand whether the operator is being applied to values of suitable types, we need to simplify the left-hand and right-hand operands until they have the form of object records.

The base record

The left-hand operand to is super, and this is bound to the object genAPoint [] (self), which simplifies to a record, after have been supplied as arguments:

To see where this came from, refer back to the body of the typed object generator genAPoint, given in section 6 above. The only difference is that we have substituted {} and (self:} as a result of instantiating the generator. One interesting thing to notice is that, in the “inherited” version of equal, both self and the compared argument p are now of the child-type, .

The extension record

The right-hand operand to is a record of extra methods for a HotPoint, the fields contained in the braces {…}. The first is a redefinition of the equal method; the second is the new selected method (returning true for the instance we are defining). The body of equal contains a super-method invocation. We would like to satisfy ourselves that this equal is in fact equivalent to a regular method, by simplifying the super-method invocation.

The super-invocation has the form: super.equal(q). We know that q : from the immediately preceding declaration: ). Fortunately, the equal method selected from the body of the super-object (given above) expects an argument p : of exactly the same type. After substituting {q:} in the body, we obtain the simplified result:

Checking this out, we know already that self : , so we are comparing a q and a self which have the same type. Now, we substitute this into the body of the redefined equal, in place of the original super-invocation (which we have now simplified away altogether [4]), to yield the form of a regular record of methods:

in which self and q are uniform throughout the body of equal, referring to different child-instances, and are both of the same type, the child-type .

The record combination

The record combination expression, modelling the extension of an object by inheritance, has now been reduced to the form:

and it only remains to simplify . This was declared as a rather liberally-typed polymorphic operator [3], in the style of function override, accepting any two maps with the same domain-type, and yielding a map with this domain-type and a derived codomain-type that was the union of the arguments’ codomains:

In a later article, we will reconsider this type signature, to better constrain the “legitimate” types of record arguments supplied in inheritance expressions. For the moment, let us note that the domain-type will be bound to the type Label, from which we draw all the names of methods. The range-type will be bound to a union of all the method-signature types of the base record; likewise is a union of all the method-signature types of the extension record. This seems to “lump together” all the different types in each union. However, the definition of the operator explains how individual fields are overridden, so we may obtain the detail from this.

So, on the left-hand side, we have a base record with the (more detailed, record) type:

and on the right-hand side, we have an extension record with the type:

and after combining the base and extra records according to the definition of , we seem to obtain, experimentally speaking, a record with the type:

Looking at the pooled method types: {Integer, Boolean, Boolean} in the result, this does seem to be a union of {Integer, Boolean} Boolean, Boolean}, as the type-signature of declared. So, this is at least consistent, even if it is not yet very informative.

The result of record combination is therefore the body of an extended generator, suitable for a HotPoint instance, in which equal is re-typed in terms of the child’s self-type , and the extra method selected is included. It is as if we had defined genAHotPoint from first principles, without inheritance, with the (simplified) form:

so demonstrating that strongly-typed inheritance is just a short-hand for defining larger objects by extension, but with all the relevant type information attached.

10 CONCLUSION

We have presented a model of strongly-typed object generators, in which the class-type information is attached to the object-information. We may now formally claim to have defined the notion of class from both the implementation and type perspectives, combined. A class is, from a concrete perspective, a family of objects that share a similar (but overridable) implementation strategy and, from an abstract perspective, a family of types that share a similar (minimum common) method interface. This provides a good foundation for developing further model interpretations of other object-oriented concepts, such as class hierarchies, abstract classes and interfaces.

We also used the new typed calculus to present a model of strongly-typed inheritance. This combined two aspects of the sophisticated model of inheritance put forward previously in the Theory of Classification. Firstly, when a class inherits methods from its parent, object self-reference is redirected to refer to a child instance [3, 4]. Secondly, the type signatures of inherited methods adapt, such that methods that referred to a parent-type now refer to a child-type [1, 2]. This is important in languages like Smalltalk and Eiffel, where binary methods like equal or plus may evolve under inheritance, but always apply to objects of the appropriate specific type. Attaching the self-type parameter to any other variable in the model exactly explains the novel typing construction in Eiffel, where a variable is declared to have the type “like current”. It anchors the type of the variable to the type of self. This is an extremely satisfying way of providing types for binary methods, which expect to receive another object with the same type as self.

We also fulfilled a formal obligation to demonstrate that aspects of inheritance were type-sound. Super-method invocation was shown to be well-typed, yielding results as expected. We shall have to return to the typing of , in order to restrict the legitimate types of extra methods added to an object during inheritance, but our liberally-typed version of works as intended with the object implementations shown. Technically, the definition given for is an abbreviation of a more complete definition in the polymorphic typed -calculus, in which both type parameters and value parameters are supplied explicitly. We can think of our operator as a short-hand for expressing a type-instantiated version of the full-length combine function:

which is instantiated for each pair of record types A, B that we wish to combine.

Footnotes

¹ Readers following this series will recall that GenPoint[Point] is a fixpoint of the generator, ie that Point = GenPoint[Point], so we could equally say that the result is of the exact type Point.

² We restrict ourselves in this article to explaining the more sophisticated model of inheritance, in which self and the self-type evolve. This is, after all, the more interesting, relatively novel concept that needs explanation.

REFERENCES

[1] A J H Simons, “The theory of classification, part 7: A class is a type family”, in Journal of Object Technology, vol. 2, no. 3, May-June 2003, pp. 13-22. http://www.jot.fm/issues/issue_2003_05/column2

[2] A J H Simons, “The theory of classification, part 8: Classification and inheritance”, in Journal of Object Technology, vol. 2, no. 4, July-August 2003, pp. pp. 55-64. http://www.jot.fm/issues/issue_2003_07/column4

[3] A J H Simons, “The theory of classification, part 9: Inheritance and self-reference”, in Journal of Object Technology, vol. 2, no. 6, November-December 2003, pp. 25-34. http://www.jot.fm/issues/issue_2003_11/column2

[4] A J H Simons, “The theory of classification, part 10: Method combination and super-reference”, in Journal of Object Technology, vol. 3, no. 1, January-February 2004, pp. 43-53. http://www.jot.fm/issues/issue_2004_01/column4

[5] A J H Simons, “The theory of classification, part 2: The scratch-built typechecker”, in Journal of Object Technology, vol. 1, no. 2, July-August 2002, pp. 47-54. http://www.jot.fm/issues/issue_2002_07/column4

[6] P Canning, W Cook, W Hill, W Olthoff and J Mitchell, F-bounded polymorphism for object-oriented programming, Proc. 4th Int. Conf. Func. Prog. Lang. and Arch. (Imperial College, London, 1989), 273-280.

[7] W Cook, W Hill and P Canning, Inheritance is not subtyping, Proc. 17th ACM Symp. Principles of Prog. Lang., (ACM Sigplan, 1990), 125-135.

[8] A J H Simons, “The theory of classification, part 4: Object types and subtyping”, in Journal of Object Technology, vol. 1, no. 5, November-December 2002, pp. 27-35. http://www.jot.fm/issues/issue_2002_11/column2

About the author

Anthony Simons is a Senior Lecturer and Director of Teaching in the Department of Computer Science, University of Sheffield, where he leads object-oriented research in verification and testing, type theory and language design, development methods and precise notations. He can be reached at a.simons@dcs.shef.ac.uk.

Cite this column as follows: Anthony J.H. Simons: “The Theory of Classification, Part 11: Adding Class Types to Object Implementations”, in Journal of Object Technology, vol. 3, no. 3, March-April 2004, pp. 7-19. http://www.jot.fm/issues/issue_2004_03/column1

Next column