1 INTRODUCTION

This is the second article in a regular series on object-oriented type theory, aimed specifically at non-theoreticians. Eventually, we aim to explain the behaviour of languages such as Smalltalk, C++, Eiffel and Java in a consistent framework, modelling features such as classes, inheritance, polymorphism, message passing, method combination and templates or generic parameters. Along the way, we shall look at some important theoretical approaches, such as subtyping, F-bounds, matching and the object calculus. Our goal is to find a mathematical model that can describe the features of these languages; and a proof technique that will let us reason about the model. This will be the "Theory of Classification" of the series title.

Figure 1: Dimensions of Type Checking

The first article [1] introduced the notion of type, ranging from the programmer's concrete perspective to the mathematician's abstract perspective, pointing out the benefits of abstraction and precision. From these, let us choose three levels of type-checking to consider: representation-checking (bit-schemas), interface-checking (signatures) and behaviour-checking (algebras). Component compatibility was judged according to whether exact type correspondence, simple subtyping or the more ambitious subclassing was expected. Combining these two dimensions, there are up to nine combinations to consider, as illustrated in figure 1. However, we shall be mostly interested in the darker shaded areas. In this second article, we shall build a typechecker that can determine the exact syntactic type (box 2, in figure 1) of expressions involving objects encoded as simple records.

2 THE UNIVERSE OF VALUES AND TYPES

Rather like scratch-building a model sailing ship out of matchsticks, all mathematical model-building approaches start from first principles. To help get off the ground, most make some basic assumptions about the universe of values.
Primitive sets, such as Boolean, Natural and Integer, are assumed to exist (although we could go back further and construct them from first principles, in the same way as we did the Ordinal type [1]; this is quite a fascinating exercise in the λ-calculus [2]). All other kinds of concept have to be defined using rules to say how the concept is formed, and how it is used. We shall assume that there are:
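3 RULES FOR PAIRS

An immediately useful construction which we do not yet have is the notion of a pair of values, ⟨n, m⟩, whose type is the product type N × M. The product introduction rule says how a pair is formed and typed:

$$\frac{n : N \qquad m : M}{\langle n, m \rangle : N \times M}$$

This rule is expressed in the usual style of natural deduction, with the premises above the line and the conclusions below. In longhand, it says "if n is of type N and m is of type M, then we may conclude that a pair ⟨n, m⟩ has the product type N × M".

For pair constructions to be useful, we need to know how to access the elements of a pair, and determine their types. We define the first and second projections of a pair, usually written in the style π₁(e) and π₂(e), by the product elimination rule:

$$\frac{e : N \times M}{\pi_1(e) : N \qquad \pi_2(e) : M}$$

"If e is a pair of the product type N × M, then the first projection π₁(e) has the type N and the second projection π₂(e) has the type M".

4 RULES FOR FUNCTIONS

Consider an infinite set of pairs, relating each value of one type to a value of another. Such a set may be treated as a function, and the function introduction rule says how a function is formed and typed:

$$\frac{x : D \vdash e : C}{\lambda x.e : D \to C}$$

"If variable x has the type D and, as a consequence, expression e has the type C, we may conclude that a function of x with body e has the function type D → C".

The function elimination rule explains the type of an expression involving a function application. In so doing, it also defines the parenthesis syntax for function application:

$$\frac{f : D \to C \qquad v : D}{f(v) : C}$$

"If f is a function from D to C and v is of type D, then the application f(v) has the type C".

Do we need rules for multi-argument functions? Not really, because we already have the separate product rules. The domain D in the function rules could correspond, if we so wished, to a type that was itself a product, so that a function of two arguments is simply a function of one pair.

5 RULES FOR RECORDS

Most model encodings for objects [4, 5, 6] treat them as some kind of record with a number of labelled fields, each storing a differently-typed value. So far, we do not have a construction for records in our model. However, consider that a record is rather like a finite set of pairs, relating labels to values. The record introduction rule says how a record is formed and typed:

$$\frac{\alpha_1, \ldots, \alpha_n \ \text{distinct} \qquad e_i : T_i \quad (1 \le i \le n)}{\{\alpha_1 \mapsto e_1, \ldots, \alpha_n \mapsto e_n\} : \{\alpha_1 : T_1, \ldots, \alpha_n : T_n\}}$$

"If there are n distinct labels αᵢ and n expressions eᵢ of the corresponding types Tᵢ, then a record that maps each label αᵢ to the value eᵢ has a record type that maps each label αᵢ to the type Tᵢ".

The corresponding record elimination rule introduces the dot or record selection operator, defining how to deconstruct a record to select a field and then determine its type:

$$\frac{e : \{\alpha_1 : T_1, \ldots, \alpha_n : T_n\}}{e.\alpha_i : T_i} \quad (1 \le i \le n)$$

"If e has the type of a record, with n labels αᵢ mapping to the types Tᵢ, then selecting the field αᵢ from e yields a value of the type Tᵢ".

6 APPLYING THE RULES

We have a set of rules for constructing pairs, functions and records. With this, we can model simple objects as records.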
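The introduction and elimination rules for pairs, functions and records can be sketched as a small typechecker. The Python below is one possible encoding rather than the article's own: the tuple-based term representation, the class names Prim, Prod, Fun and Rec, and the function type_of are all hypothetical names chosen for illustration. It also assumes that the bound variable of a function carries an explicit type annotation, so that the premise x : D is available when the body is checked.

```python
from dataclasses import dataclass

# Types: primitive, product N x M, function D -> C, record {label : T}.
@dataclass(frozen=True)
class Prim:
    name: str

@dataclass(frozen=True)
class Prod:
    fst: object
    snd: object

@dataclass(frozen=True)
class Fun:
    dom: object
    cod: object

@dataclass(frozen=True)
class Rec:
    fields: tuple  # (label, type) pairs, sorted by label

def type_of(term, env=None):
    """Return the type of a term, or raise TypeError if no rule applies."""
    env = env or {}
    tag = term[0]
    if tag == "lit":                      # ("lit", value, type): primitive value
        return term[2]
    if tag == "var":                      # ("var", name): use an assumption x : D
        if term[1] not in env:
            raise TypeError(f"unbound variable {term[1]}")
        return env[term[1]]
    if tag == "pair":                     # product introduction: ("pair", n, m)
        return Prod(type_of(term[1], env), type_of(term[2], env))
    if tag == "proj":                     # product elimination: ("proj", 1 or 2, e)
        t = type_of(term[2], env)
        if not isinstance(t, Prod) or term[1] not in (1, 2):
            raise TypeError("projection needs a pair")
        return t.fst if term[1] == 1 else t.snd
    if tag == "lam":                      # function introduction: ("lam", x, D, body)
        _, x, dom, body = term
        return Fun(dom, type_of(body, {**env, x: dom}))
    if tag == "app":                      # function elimination: ("app", f, v)
        f, v = type_of(term[1], env), type_of(term[2], env)
        if not isinstance(f, Fun) or f.dom != v:
            raise TypeError("argument does not match the function's domain")
        return f.cod
    if tag == "rec":                      # record introduction: ("rec", {label: term})
        return Rec(tuple(sorted((l, type_of(e, env)) for l, e in term[1].items())))
    if tag == "sel":                      # record elimination: ("sel", e, label)
        t = type_of(term[1], env)
        if isinstance(t, Rec):
            for label, field_type in t.fields:
                if label == term[2]:
                    return field_type
        raise TypeError(f"no field {term[2]}")
    raise ValueError(f"unknown term {tag}")

INTEGER, BOOLEAN = Prim("Integer"), Prim("Boolean")
pair = ("pair", ("lit", 3, INTEGER), ("lit", True, BOOLEAN))
first = ("lam", "e", Prod(INTEGER, BOOLEAN), ("proj", 1, ("var", "e")))
record = ("rec", {"x": ("lit", 3, INTEGER), "y": ("lit", 5, INTEGER)})

print(type_of(pair))                    # type of the pair
print(type_of(first))                   # type of a function returning the first projection
print(type_of(("app", first, pair)))    # type of applying that function to the pair
print(type_of(("sel", record, "y")))    # type of selecting field y
```

Each branch of type_of corresponds to one rule: the branch inspects the types of the sub-terms (the premises) and returns the type of the whole term (the conclusion). The sketch checks for exact type correspondence only and does not handle recursive types.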
Ignoring the issue of encapsulation for the moment, a simple Cartesian point object may be modelled as a record whose field labels map to simple values (attributes) and to functions (methods). We require a function for constructing points:

make-point : Integer × Integer → Point
This is a type declaration, stating that make-point is a function that accepts a pair of Integers and returns a Point type (which is so far undefined). The full definition of make-point names the argument expression e supplied upon creation and returns a record having the fields x, y and equal. The x and y fields map to simple values, projections of e; the equal field maps to a function, representing a method.

Note that make-point is built up in stages according to the type rules. The product introduction rule can construct a pair type: Integer × Integer from primitive Integers. The function type of equal: Point → Boolean is given by the function introduction rule, and the record introduction rule then constructs the record type that Point stands for. Finally, the function introduction rule gives make-point its own type, Integer × Integer → Point. The rules permit us to deduce that objects can be constructed using make-point, and also that they are well-typed.

Let us now construct some points and expressions involving points, to see if these are well-typed. The let ... in syntax is a way of introducing a scope for the values p1 and p2, in which the expressions p1.x and p1.equal(p2), examined below, are evaluated.
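As a value-level sketch of this construction, the Python below models a pair as a 2-tuple and a record as a dictionary from labels to values. The body given to equal (coordinate-wise comparison) and the coordinates passed to make_point are assumptions made for illustration, not the article's own definition.

```python
def make_point(e):
    """Build a point record from a pair e of integers."""
    x, y = e[0], e[1]  # first and second projections of the pair e
    return {
        "x": x,  # attribute: a simple value
        "y": y,  # attribute: a simple value
        # Method: a function from a point to a Boolean.
        # Assumed body: two points are equal when both coordinates agree.
        "equal": lambda p: x == p["x"] and y == p["y"],
    }

# Hypothetical coordinates, chosen only for illustration.
p1 = make_point((3, 5))
p2 = make_point((3, 7))

print(p1["x"])             # field selection, written p1.x in the text
print(p1["equal"](p2))     # method invocation, written p1.equal(p2) in the text
```

The state of each point is constant: x and y are captured when the record is built and nothing in the record can update them, which matches the limitations of the simple model.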
Is p1.x meaningful, and does it have a type? The record elimination rule says this is so, provided p1 is an instance of a suitable record type. Working backwards, p1 must be a record with at least the type: {... x : X ... } for some type X. Working forwards, p1 was constructed using make-point, so we know it has the Point type, which, when expanded, is equivalent to the record type: { x : Integer, y : Integer, equal : Point → Boolean }, which also has a field x : Integer. Matching up the two, we can deduce that p1.x : Integer.
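The forward and backward steps for p1.x can be laid out as a single derivation. This is a sketch in the usual natural-deduction layout; the pair ⟨n, m⟩ stands for whichever Integers were supplied to make-point and is an assumption of the sketch, as is the use of function elimination to obtain the type of p1.

```latex
% Assumption: p1 was built as make-point(<n, m>) for some Integers n and m.
\[
\dfrac{
  \dfrac{
    \textit{make-point} : \mathit{Integer} \times \mathit{Integer} \to \mathit{Point}
    \qquad
    \langle n, m \rangle : \mathit{Integer} \times \mathit{Integer}
  }{
    p_1 : \mathit{Point}
  }
  \qquad
  \mathit{Point} = \{\, x : \mathit{Integer},\; y : \mathit{Integer},\;
                     \mathit{equal} : \mathit{Point} \to \mathit{Boolean} \,\}
}{
  p_1.x : \mathit{Integer}
}
\]
```

The upper step is an instance of function elimination and the lower step is an instance of record elimination, with X matched to Integer.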
Is p1.equal(p2) meaningful, and does it have a type? Again, by working backwards through the record elimination rule, we infer that p1 must have at least the type {... equal : Y ...} for some type Y. Working forwards, we see that p1 has a field equal : Point → Boolean, so Y is the function type Point → Boolean. By the function elimination rule, the application is well-typed provided that the argument p2 has the type Point; since p2 was also constructed using make-point, this is so, and we can deduce that p1.equal(p2) : Boolean.

7 CONCLUSION

We constructed a mathematical model for simple objects from first principles, in order to show how it is possible to motivate the existence of something as relatively sophisticated as an object with a (constant) state and methods, using only the most primitive elements of set theory and Boolean logic as a starting point. The type rules presented were of two kinds: introduction rules describe how more complex constructions, such as functions and records, are formed and under what conditions they are well-typed; elimination rules describe how these constructions may be decomposed into their simpler elements, and what types these parts have. Both kinds of rule were used in a typechecker, which was able to determine the syntactic correctness of method invocations. The formal style of reasoning, chaining both forwards and backwards through the ruleset, was illustrated. The simple model still has a number of drawbacks: there is no updating or encapsulation of state; there are problems looming to do with recursive definitions; and we ignored the context (scope) in which the definitions take effect. In the next article, we shall examine some different object encodings that address some of these issues.

Footnote 1: Hellenophobe: a hater of Greek symbols

REFERENCES

[1] A J H Simons, Perspectives on type compatibility, Journal of Object Technology 1(1), May, 2002.
[2] A J H Simons, Appendix 1 : λ-Calculus, in: A Language with Class: the Theory of Classification Exemplified in an Object-Oriented Language, PhD Thesis, University of Sheffield, 1995, 220-238. See http://www.dcs.shef.ac.uk/~ajhs/classify/.
[3] L Cardelli and P Wegner, On understanding types, data abstraction and polymorphism, ACM Computing Surveys, 17(4), 1985, 471-521.
[4] J C Reynolds, User defined types and procedural data structures as complementary approaches to data abstraction, in: Programming Methodology: a Collection of Articles by IFIP WG2.3, ed. D Gries, 1975, 309-317; reprinted from New Advances in Algorithmic Languages, ed. S A Schuman, INRIA, 1975, 157-168.
[5] W Cook, Object-oriented programming versus abstract data types, in: Foundations of Object-Oriented Languages, LNCS 489, eds. J de Bakker et al., Springer Verlag, 1991, 151-178.
[6] M Abadi and L Cardelli, A Theory of Objects, Monographs in Computer Science, Springer-Verlag, 1996.
Cite this column as follows: Anthony J. H. Simons: "The Theory of Classification, Part 2: The Scratch-Built Typechecker", in Journal of Object Technology, vol. 1, no. 2, July-August 2002, pp. 47-54. http://www.jot.fm/issues/issue_2002_07/column4