Previous paper

Next paper


Sub-Method Reflection

Marcus Denker, Stéphane Ducasse, Adrian Lienhard and Philippe Marschall, University of Berne, Switzerland


space REFEREED
PAPER


PDF Icon
PDF Version

Abstract

Reflection has proved to be a powerful feature to support the design of development environments and to extend languages. However, the granularity of structural reflection stops at the method level. This is a problem since without sub-method reflection developers have to duplicate efforts, for example to introduce transparently pluggable type-checkers or fine-grained profilers. In this paper we present Persephone, an efficient implementation of a sub-method meta-object protocol (MOP) based on AST annotations and dual methods (a compiled method and its meta-object) that reconcile AST expressiveness with bytecode execution. We validate the MOP by presenting TreeNurse, a method instrumentation framework and TypePlug, an optional, pluggable type system which is based on Persephone.


1 INTRODUCTION

Reflection [37] has proved to be an important property of long-lived and highlydynamic systems. The literature is full of examples of uses for reflection. For instance, message passing control has been used for a wide range of application analysis approaches, such as tracing [21,9], automatic construction of interaction diagrams, class affinity graphs, test coverage, as well as new debugging approaches [26]. Message passing control has also been used to introduce new language features in several languages, for instance multiple inheritance [6], distribution [2,31], instancebased programming [1], active objects [10], concurrent objects [42], futures [33] and atomic messages [17, 29], as well as backtracking facilities [25]. Nowadays, AOP is often using reflection to support its implementation [39, 38].

Current object-oriented languages provide access to program structure reifications. In most cases (e.g., Java or C#) only introspection is supported (i.e., these descriptions can be queried). In other languages, intercession is also supported (i.e., the program itself can be changed) [5]. Dynamic languages usually support both introspection and intercession (CLOS [23], Ruby, Python, Smalltalk [17,15]). In such languages, the representation of the program is causally connected to the system itself [27]. When the representation is changed, the running application changes, too and conversely.

The literature also distinguishes structural and behavioral reflection [23]: structural reflection deals with program structure while behavioral reflection deals with the execution. However, structural reflection often stops at the method level. For example in Smalltalk (the same holds in Java) the only entities supporting limited sub-method structural reflection are collection of bytes (from the compiled method) or characters (from the textual representation of method bodies).

The lack of a higher abstraction of method source code hampers the implementation of tools that require to reflect about it. Examples of such tools are code transformers (e.g., a refactoring tool), pluggable type-checkers, or fine-grained code coverage analysis tools. Since programming languages do not provide support for sub-method reflection each tool has to resort to implement its own infrastructure to represent and reason about source code.

However, this is difficult because (1) text is definitively not structured enough, (2) bytecode is low-level and (3) abstract syntax trees and text are not causally connected to the low-level representation of methods (i.e., changes are not reflected in the bytecode representation of methods).

Another problem is that it is hard to reuse existing infrastructure of one tool in another tool because solutions tend to be very specific. A common infrastructure would not only allow for code reuse but also facilitate data and cache sharing and provide a layer for communication between the tools.

In this paper we propose reflective methods which provide a high-level, extensible representation of method bodies accessible through a meta-object protocol (MOP). The high-level representation used by tools is causally connected to its low-level bytecode representation used by the virtual machine (VM). That is, changes to the high-level representation are automatically reflected at the bytecode level. While our implementation is done in Squeak [22], we believe that the concepts can be applied to other languages without severe changes.

The contribution of this paper is: a motivation for sub-method reflection, an analysis of the problems to support it, a MOP for sub-method reflection based on annotated abstract syntax trees and an efficient implementation named Persephone. We validate our approach on two non-trivial tools which are implemented based on Persephone.

This paper is organized as follows. In Section 2 we analyze the difficulties for supporting sub-method reflection. Section 3 presents our solution. After showing case studies (Section 4 and 5) we discuss benchmarks (Section 6) to validate the implementation. After a short overview of related work (Section 7) we conclude in Section 8.

2 CHALLENGES FOR SUPPORTING SUB-METHOD REFLECTION

To support the need of different tools such as a code browser, a code coverage tool or a refactoring engine, we need a representation that is extensible, it should allow tools to annotate the structural elements with metadata and the use of metadata for communication between tools. The representation should be persistent: it should not be needed to re-generate it for every tool. We need a high-level model of the method that supports powerful code transformation. Finally, the representation should be causally connected: changing the representation should change the system with no need of explicitly calling a compiler.

Most of the time, modern OO languages provide two representations for the method level: the text that the programmer typed, and bytecode, the language that the virtual machine interprets. Internally, tools that need sub-method structure do not use the text nor the bytecode directly, but generate a custom representation, in many cases an Abstract Syntax Tree (AST). We now evaluate the three representations as a foundation to support a sub-method MOP.

Text as Sub-Method Representation

In all current major programming languages, the programmer types text [14, 16]. This text is then used by the compiler to generate executable code. Text itself has no real structure – it is just a collection of characters. Thus text as a sub-method representation has the following problems:

Low-level. Text does not provide any high-level interface: it lacks the possibility to scope information and manipulate the underlying program elements.

Not causally connected. Changing the method body text has no effect. We need to call the compiler to generate a method.

Therefore text is almost never used directly for analysis and manipulation. Instead, tools parse the text into an intermediate format such as an AST.

AST as Sub-Method Representation

A commonly used and generated intermediate representation is the AST. For example, ASTs are used by the compiling chain and refactoring engine. Using the AST as sub-method representation has the following properties:

Not persistent. Representations such as ASTs are not persistent: they are generated and then used, but not stored. While trees can be created from source code, the meta-information, which we would like to associate to specific nodes has to be stored in separate structures.

Not causally connected. Changing an AST does not have an immediate effect on the underlying run-time behavior. Depending on the compiler API, the AST or the method body text is passed to the compiler, which will compile and install the method in its class.

Bytecode as Sub-Method Representation

In contrast to the other representations, bytecode is causally connected: changing bytecode directly changes the behavior of the system. Bytecode has been used, in both Java and Smalltalk, for a form of structural reflection (Javassist [11] and Byte-Surgeon [13])1. Both provide a high-level interface to the user, e.g., by abstracting away bytecode details such as the different encodings for message sending or providing a way to specify code to be inlined as a string in the host language. Nevertheless, in the end programmers are forced to deal with bytecode level abstractions which may be different than the programming language the methods are written in [8]. Bytecode as a model for sub-method structure has a number of problems:

Different representation needs. There is a dilemma: on the one hand the execution engine (bytecode VM) requires bytecode for execution, and on the other hand programming environments or programmers require text or an abstract and high-level representation of the source code.

Low level abstraction. The programmer has to deal with the idiosyncrasies of the bytecode representation. For example, control structures may be optimized as it is the case in Smalltalk bytecode. In addition, there is a mismatch between the program level abstractions and its runtime [8].

All three discussed representations have the problem of missing extensibility and lack a way to describe metadata:

Missing extensibility. Tools cannot easily extend the representation for their needs. Thus, they often use a custom representation which leads to a situation where tools cannot easily share meta-information, e.g., when one tool gathers information for others to use [41, 32].

Mixed base-level and meta-level code. Code transformation (e.g., bytecode instrumentation [11]) is often used to reflect on method execution: small snippets of code, so called hooks, are inserted into the original code. This leads to the problem of distinguishing code of the original method from the code added by the instrumentation framework.

Requirements

We want the best of both worlds: AST and bytecode representations. In summary, the representation should be (i) causally connected and well integrated in the system, (ii) persistent, (iii) extensible and (iv) reasonably compact with minimal performance impact. Furthermore, it should support the separation of base and annotated code and offer a high-level abstraction to the developers.

Such a model for reflective methods would make it easier to develop and deploy a new generation of tools that work on sub-method elements.

We have seen that bytecode is too low level for our purpose, the AST could be useful. But we need to take practical consideration into account: can such a high-level representation be reasonably compact and efficient? In the next section we present a solution that satisfies these constraints.

3 REFLECTIVE METHODS: ANNOTATED AND CACHED ASTS

Our solution is based on a dual representation of methods which combines an ASTbased representation offering high-level manipulations with a compact bytecodeoriented representation supporting fast execution. The abstract syntax trees can be annotated, the semantics of annotation is defined by specializing dedicated compiler plugins. The causal connection and efficient execution is ensured by dual methods that recompile bytecode automatically from their associated and annotated AST. In such a context, we define three meta-objects: (1) the ASTs and their associated transformation API, (2) the annotations and their semantic definition specified by (3) the plugins. This set of objects enables what we call a reflective method.

Figure 1: A reflective method is the meta-object of a compiled method.

Dual Methods

We implemented the presented approach in Squeak, an open-source Smalltalk. In Smalltalk, code is compiled to bytecode which resides in compiled method objects (an instance of the class CompiledMethod [19]). Besides bytecode, a compiled method keeps a pointer to its source and a dictionary for additional state that is associated with a method (e.g., its name). We enhanced the compiler to generate a reflective method instead of a compiled method. A reflective method provides access to its AST meta-object. Before its first execution, a compiled method is generated from the reflective method. Such a compiled method is cached to minimize performance loss. When the code of a method is changed the cache is flushed and the reflective method is reinstalled in the method dictionary as shown by Figure 1 and described in [28].

The system was named after Persephone,the greek goddess who spends half of her time in the underworld and the other half in the upper world. Like Persephone, methods are seen sometime as part the underworld (the virtual machine), other time as part of the upper world of high-level abstraction.

Before going into the details of our sub-method protocol we present an example of annotations that are visible in the source code. We show later that some annotations may be invisible in method source code.

A Simple Example: Compile-Time Evaluated Expressions

The following piece of code is the definition in Smalltalk of the method calculateNinePower which evaluates and returns the result of 9 to the power 10000. Without the annotation (<:evaluateAtCompiletime :>) the execution of such a method will at runtime send the message raisedTo: to the object 9 and then return 910000. Using annotations we can mark any abstract syntax tree element, i.e., any expression. Here we specify that the expression should be executed at compile-time.

Semantics Definition. At compile-time, the resulting value replaces the annotated expression. The semantics of the annotation is defined by creating a compiler plugin called CompiletimeEvaluator. The complete implementation is based on two classes which specialize the meta-objects and define an annotation and a compiler plugin (see Figure 2).

Figure 2: Extension for compile-time evaluation.

EvaluateAtCompiletimeAnnotation is a subclass of NoValueAnnotation since this annotation does not expect arguments. The class implements one method: key that returns the annotation name (here the symbol evaluteAtCompiletime).

The compiler plugin class, named CompiletimeEvaluator, is a subclass of Compiler-Plugin. Besides that, it only has two small methods visitNode: and evaluateNow:.

We check every node for the evaluteAtCompiletime annotation. If the annotation is present, we pass the node to evaluateNow: which does the evaluation. If the annotations is not set, we just continue the visiting process. The method evaluateNow: evaluates an expression by sending the evaluate message, a literal node is created to hold the result. It replaces the original node and the visiting process continues.

AST and Tree Transformation API

The AST used in Persephone is an extended version of the Smalltalk refactoring engine AST [35]. We extended it to provide annotations for all node objects. Reflective methods provide a comprehensive abstract syntax tree API. Trees are easily edited and transformed (added/removed/replaced, see Table 1). On top of these simple transformations, the Parse-TreeRewriter is a rewrite engine which allows transformations to be specified at a high-level of abstraction.

Table 1: The transformation API for nodes.

It should be noted that this API provides a way to destructively transform a tree: after the transformation, the tree is changed in the same way as if we would have edited text and recompiled the method. This is useful e.g., for refactorings. However, destructively changing a tree is not always what we need. As we will present in Section 4, annotations provide a way to define non destructive transformations i.e., we can always identify the original method.

AST Annotations

As discussed in Section 2, reflective methods should serve as the main method representation for many different usages. A consequence is that users need to be able to add information about objects (nodes) directly to those objects themselves. Adding behavior to the node classes is possible using the class extension mechanism of Smalltalk which allows a package to define methods on classes defined in another package [3]. For object state extensions, our MOP provides annotation objects that are attached to any node as shown in Figure 3.

Figure 3: The annotation hierarchy.

Each annotation is uniquely identified by a key and each node has a dictionary that maps keys to annotations. Annotations are instances of one of the three subclasses of Annotation. They can have zero, one or multiple values. To define a custom annotation we subclass from the appropriate subclass depending on their number of values (see Figure 3). The difference between multi-valued and single valued annotations is that the former can be defined multiple times on the same expression with different values whereas the latter cannot.

The expression anExpression <:aSelector: anArgument:> attaches an annotation with one argument to the expression anExpression. Annotations are supported on all expressions and additionally on method arguments, block arguments and each variable name in a temporary variable definition.

The argument of an annotation can be any Smalltalk expression. In the simplest case, when the argument is just a literal object, the value of the annotation is set to this literal object. When the argument is an expression, the value of the annotation is the AST of this expression. We can specify when this AST is evaluated, either at compile time or later at run-time. In addition we provide a reflective interface for annotations which can be queried and set at runtime.

Annotations may or may not appear in the source code. For example, an invisible annotation is the number of times a particular tree element has been executed; an example for a visible annotation is a type declaration. Non-textual annotations can be added reflectively to any node at runtime. These annotations are kept as long as the AST is not regenerated. For example, when a method is recompiled from source, the nodes of this method will have no annotations, but clients can be notified of any code change and add annotations again if needed.

Annotation Semantics

Without specific interpretation, annotations are pure metadata: they have no predefined semantics. To specify annotation semantics, Persephone defines a meta-object protocol for the compiler and bytecode generator.

The MOP is based on a plugin architecture. Before generating any code, the compiler framework copies the AST and then calls all defined compiler plugins by priority order. A compiler plugin is just a subclass of CompilerPlugin. Plugins affect compilation by transforming the AST. As we provide fully reflective access to the annotations, the compiler plugin may take annotations into account. We present a full example in Section 4 with the instrumentation framework TreeNurse.

Characteristics of the Solution

Now we analyze our approach according to the requirements presented in Section 2.

Causal connection. The implementation ensures that the executed bytecode is always in sync with the reflective method. The whole mechanism is completely transparent to the user. Thus, we provide a causally connected and integrated model.

Persistency. Reflective methods (the AST including the annotations) are installed in the method dictionary of a class. As reflective methods are normal Smalltalk objects, they are written to the disk when the system is stopped. Thus the model is persistent.

Abstraction level. We reuse the same representation and API as the Smalltalk refactoring engine which has proven to be a usable abstraction for various analyses such as refactoring and general meta-programming.

Separation of base and meta-level code. The annotation framework provides a way to structure code into data (AST) and metadata (annotations). For example, instrumentations as the results of the meta-programs are still completely identifiable since they are represented in the AST as annotations.

Extensibility. Annotations provide a convenient way to associate metadata with the AST nodes for storing additional state. Due to the compiler MOP, we can use annotations to extend the language semantics.

Size and performance We will discuss memory consumption in Section 6. The provided caching scheme ensures that at runtime we generate a compiled method only once. In situations where even this relatively low overhead is too large, we can statically generate all bytecode before deployment.

We now validate our approach by presenting two tools that we have built based on Persephone. The first tool is a framework for instrumenting code (TreeNurse). Tree-Nurse is very similar ByteSurgeon [13] or Javassist [11]. Using TreeNurse, we show how the extensibility of our approach is used for communication between tools: we implement a statement coverage tool that records the coverage information as annotations on the reflective method. The second example presents TypePlug, an optional type system that uses textual annotations.

4 VALIDATION: INSTRUMENTATION USING ANNOTATIONS

TreeNurse is a framework for instrumenting code. It is modeled to have an interface very similar to ByteSurgeon, a bytecode transformation framework [13]. In contrast to ByteSurgeon, it works on the AST and is implemented as a Persephone compiler-plugin.

To present TreeNurse, we start with discussing a simple example. We have a method that does an assignment and we want to annotate the code with trace statements for debugging that increases a counter. The original code and what we want to actually do at runtime are shown in Figure 4.

Figure 4: Code instrumented with a counter.

Here it is important to understand that we want to change the semantics of the method, but we don’t want to change the code itself. The counter is not part of the design of the system, it is just a temporal addition for debugging. If we transform the code with the help of the refactoring API of the AST, the result would be a new method where the debugging code would be undistinguishable from the original code. Bytecode instrumentation frameworks such as ByteSurgeon or Javassist have the same problem: once the bytecode is transformed, we do not know which statements are part of the original method and which have been added, except at the price of tedious bookkeeping.

An instrumentation framework built on top of Persephone does better: we use annotations to store the instrumentations in the reflective method and they are taken into account when generating code, as shown in Figure 5.

The actual code to do this annotation with our framework looks like this:

This code adds an after annotation to all assignments in a method with the effect of incrementing the counter. We instrument the method by sending instrument: passing a

Figure 5: Instrumenting a reflective method with annotations.

block with the instrumentation code. We only want to instrument assignments so we select them by sending isAssignment to each node passed to the block.

By implementing TreeNurse using reflective methods, TreeNurse has the following unique properties:

  • It works on AST nodes and not binary code.
  • The instrumentation is stored as an annotation on the AST node. The original AST is left untouched.
  • The generation of bytecode is done lazily, on need. Instrumentation merely results in the bytecode cache to be reset. This can lead to considerable time savings especially for large programs.

The Instrumentation API

On all the AST nodes, we support the following transformation API:

For an easy selection of the nodes to be instrumented, we provide dedicated iteration
interface, for example instrumentAssignments: iterates over all assignment nodes.
Meta Variables
The only variables that can be directly referenced from inside the block are self, super and
thisContext, as the code of the block is to be inlined into the instrumented method. Thus,
these variables will be bound to their values at runtime, not at instrumentation time. For everything else, block arguments have to be used. Each kind of node provides a different
set of meta variables that can be used as block arguments:

The following code gives an example of a replace annotation, which uses two meta variables, one referencing the variable and the other the value of the assignment:

TreeNurse replaces all assignments with a new assignment that adds the number 3000 to the value expression of the original code.

Example: Code Coverage Analysis

Code coverage analysis per expression is a conceptually simple task. Before an expression gets executed it is marked as executed. After the program is run the executed expressions are printed differently from the ones that were not executed. The most convenient way to store information about a node is by adding an annotation to it that holds the information. To keep track of how many times a node has been executed we create a subclass of SingleValuedAnnotation named ExecutedAnnotation. We then add a method markExecuted to ProgramNode via a class extension:

If we now send markExecuted to any node, its execution count will be incremented, which we can do using our instrumentation framework:

Here we iterate over all the nodes of the method, the block is evaluated for each node, binding the node to the variable eachNode. Before each node, we insert code. The inserted code is described by a block. This block references the meta variable node which references the AST node at runtime.

Here we see a usage of the node meta variable. It reifies the node, so we can call the markExecuted directly on the AST nodes. To produce the final output a pretty printer presents the status of execution.

5 VALIDATION: PLUGGABLE TYPE SYSTEM

TypePlug [20] is an optional, pluggable type system [7] for Squeak. It consists of a type reconstructor and inferencer that is used by a type checker to check Squeak programs for type correctness.

A type checker is an example for a program that works on code: we need to be able to attach metadata to expressions (the types) and be able to reason about this metadata. Thus the realization of a type system validates especially the extensibility of the reflective methods. We need to be able to model metadata that describes the type of any expression in the code.

With Persephone, types are represented as annotations on nodes in the AST. They can be declared on method and block arguments, method and block return values and temporary variable declarations. There are two different ways to declare types.

  1. Using a special code browser to annotate the nodes. This has the advantage that types are declared without changing the source code but can still be checked into a source code management system. This is the preferred way for typing existing code especially system classes like Boolean or Collection.
  2. Textual annotations, which are placed in the source code. The following code shows a method annotated with types:

The method takes a boolean as an argument and returns an integer.

The TypePlug case study shows that our representation provides the extensibility needed for extending the language with a pluggable type system. TypePlug demonstrates the usefulness of providing textual annotations where the annotation itself is not limited to be a static predefined datatype. Persephone supports annotations that contain general Smalltalk code. The evaluation of this code can be completely controlled by the annotation class itself or the compiler plugin. This allows for building complex annotations as required e.g., for advanced type systems.

6 PERFORMANCE AND MEMORY ANALYSIS

In this section we discuss results of performance benchmarks and memory consumption. The machine used is an Apple MacBook Pro (2Ghz Intel Core Duo, 2GB RAM). We present an analysis of the effect of our caching scheme and discuss memory consumption.

Performance of Cache

To analyze cache performance, we use the TinyBenchmark suite that is part of the normal Squeak distribution (Table 2). The TinyBenchmark suite tests bytecode interpretation and message send performance. For this test, we use the runtime of the benchmarks for assessing cache performance. First, we record the runtime for an unmodified Squeak. Then we run the benchmark with Persephone in two cases: with and without caching the generated bytecode.

When running the benchmark with caching disabled, the system gets too slow to be usable as bytecode needs to be generated for each method execution. We had to abort the benchmark run after one hour. We see a noticeable speedup as soon as we turn on caching: Persephone shows no detectable slowdown compared to standard Squeak, even though bytecode had to be generated for the benchmark methods on the first execution.

Table 2: The effect of method caching.

Thus we can see that the cache provides a substantial speedup and enables a system using reflective method to be as fast as the standard Squeak system.

Memory Considerations

Representing a method as an AST in which each node is an object obviously requires a lot more memory than its corresponding compiled method which only stores the bytecodes. Table 3 shows the memory consumption of the AST of a complete Squeak system.

Table 3: Memory consumption.

We see that the system uses a lot of memory, but in a typical development scenario, the system is already usable as is without any further optimization. In addition to that, reflective methods are mostly only reified for parts of the system, for example a single package that needs to be analyzed.

To assess if the size is practically usable, we compare to the size of code loaded into Eclipse, a development environment used widely in industry. We took the source of ArgoUML2 and loaded it in Eclipse version 3.2. ArgoUML Version 0.24 consists of ca. 1300 classes, it is thus much smaller then Squeak. Eclipse allocates ca. 180MB of memory when having the ArgoUML source loaded. Thus the memory consumption we see for Persephone is typical for a modern development environment.

It is also to note that those figures have to be considered as upper bounds since we have not yet fully applied some memory related optimizations. For example, the size of the AST data can be optimized further by not referencing scanner-token data. In addition to that, we plan to experiment with AST specific compression techniques [18].

7 RELATED WORK

Mirrors

Bracha and Ungar [8] suggest structuring the meta layer with the help of mirrors. In this approach, objects are not reflective, they need a mirror that provides the reflective API. They mention mirrors on methods as future work. Mirror methods and our reflective methods share some similarities, but as mirrors are generated on demand they are not persistent. They provide a high-level view on low-level objects. We instead provide the high level view by default.

Annotations

Annotations are not new. For example, Javadoc tags are a form of annotation. More recently, Java 1.5 adds support for annotations to the language. As of release 5.0, Java has a general purpose annotation (also known as metadata) facility that permits you to define and use your own annotation types. The facility consists of a syntax for declaring annotation types, a syntax for annotating declarations, APIs for reading annotations, a class file representation for annotations, and an annotation processing tool. Java 1.5 annotations are only allowed for type, field, variable and method declarations and are not allowed on type parameters or method invocations. VisualWorks and Squeak also support method annotations (also called pragmas).

Spoon [34] is an open compiler for Java that provides compile-time reflection. With Spoon, the Java AST can be transformed before it is compiled to bytecode. The processors of Spoon are similar to the compiler-plugins of Persephone. Spoon provides support for Java annotations, i.e., the transformation processors can read annotations. The main difference to our work is that Spoon works at compile-time, not runtime. The AST is compiled to bytecode and not available at runtime. Spoon provides an annotation-aware compile-time transformation framework, but not causally connected structural reflection at runtime. In addition to that, it is restricted to the annotation model provided by Java.

Higher Level Abstraction: Beyond Text

There have been a number of proposals over the years to move away from text as the only representation of code. Dimitriev [14] argues that programs should no longer be text but a graph described with a metamodel built for a certain kind of problem. The language would be mapped to another one for execution or interpretation. Edwards [16] argues that programs should no longer be text and the representation of a program should be the same as its execution. His programs are trees created by copying. He also identifies the need to customize the presentation of a program. Black [4] makes a case to free programs from their linear structure and replace them with a much richer abstract program structure (APS) that captures all of the semantics, but is independent of any syntax. Conventional one and two dimensional syntax, abstract syntax trees, class diagrams, and other common representations of a program are all different "views” on this rich abstraction. None of these proposals discuss the use of the high-level representation for reflection.

Sub-method Structure

In LISP [30] source code is itself made up of lists. As a result macros can manipulate it using the list-processing functions available in the language. This functionality is limited to macros at compile time and cannot be applied to functions at runtime.

In IO [12] code is a runtime inspectable and modifiable tree. Message arguments are passed as expressions and evaluated by the receiver. Selective evaluation of arguments can be used to implement control flow. IO does not yet provide a way to extend the representation.

Sub-Method Behavioral Reflection and AOP

Behavioral Reflection is concerned with the execution of programs [37]. The elements reasoned about can be those of sub-method abstraction, like message sends and instance variable accesses. Reflection frameworks like Reflex [40] or Geppetto [36] provide a static model for sub-method structure, modeled as a sequence of operations. Thus they provide less complete structural model then that of the AST. For implementation, they rely on bytecode instrumentation.

In AOP [24], a joinpoint describes a point in the the execution of a program. Thus AOP deals with an execution model, it does not provide a general, static model of code. The structural model provided by Persephone in connection with partial behavioral reflection would be an interesting basis for implementing an AOP system.

8 CONCLUSION AND FUTURE WORK

In this paper we have motivated the need for reflective methods and presented an implementation called Persephone. We validated our approach by implementing an instrumentation framework and a pluggable type system.

Persephone has been used very successfully in a number of projects. The TreeNurse instrumentation framework has been used to implement a test coverage tool that provides line-by-line coverage analysis. Additionally, the instrumentation framework is the basis of an ongoing experiment that analyzes how objects flow through the system. For this, TreeNurse is used to instrument all variable assignments and accesses and message sends.

We plan to explore the use of annotations in the context of behavioral reflection. Geppetto [36] has been implemented using bytecode instrumentation. We are evaluating how Geppetto can use annotations on the AST instead of hooks at the bytecode level.

Compactness of the representation is an interesting field for future work. For our experiments, memory consumption has never been a problem, but nevertheless, we plan to research how to improve storage by optimizing the AST representation leveraging transparent AST compression.

Another interesting direction is to use reflective methods to replace text: source code could be reconstructed from the AST. The current version provides some support for recovering formatting, but it needs to be improved and integrated with the tools.

ACKNOWLEDGEMENTS

We gratefully acknowledge the financial support of the Swiss National Science Foundation for the project "Analyzing, capturing and taming software change" (SNF Project No. 200020-113342, Oct. 2006 - Sept. 2008). We would also like to express our thanks to Nik Haldimann for TypePlug and Oscar Nierstrasz and Frédéric Pluquet and Roel Wuyts for their help in reviewing various drafts of this paper.

FOOTNOTES

1 In the case of Javassist for Java it should be noted that it can only provide load-time structural reflection: newly loaded code can be transformed, but once it is loaded, no further change is possible In contrast to that, ByteSurgeon provides full runtime transformation capabilities: methods can be transformed even in a running system.

2 http://argouml.tigris.org/


REFERENCES

[1] Kent Beck. Instance specific behavior: Digitalk implementation and the deep meaning of it all. Smalltalk Report, 2(7), May 1993.

[2] John K. Bennett. The design and implementation of distributed Smalltalk. In Proceedings OOPSLA ’87, ACM SIGPLAN Notices, volume 22, pages 318–330, December
1987.

[3] Alexandre Bergel, Stéphane Ducasse, Oscar Nierstrasz, and Roel Wuyts. Classboxes: Controlling visibility of class extensions. Computer Languages, Systems and Structures, 31(3-4):107–126, December 2005.

[4] Andrew P. Black and Mark P. Jones. Perspectives on software. In OOPSLA 2000 Workshop on Advanced Separation of Concerns in Object-oriented Systems, 2000.

[5] D.G. Bobrow, R.P. Gabriel, and J.L. White. CLOS in context — the shape of the design. In A. Paepcke, editor, Object-Oriented Programming: the CLOS perspective, pages 29–61. MIT Press, 1993.

[6] Alan H. Borning and Daniel H.H. Ingalls. Multiple inheritance in Smalltalk-80. In Proceedings at the National Conference on AI, pages 234–237, Pittsburgh, PA, 1982.

[7] Gilad Bracha. Pluggable type systems, October 2004. OOPSLA Workshop on Revival of Dynamic Languages.

[8] Gilad Bracha and David Ungar. Mirrors: design principles for meta-level facilities of object-oriented programming languages. In Proceedings of OOPSLA ’04, ACM SIGPLAN Notices, pages 331–344, New York, NY, USA, 2004. ACM Press.

[9] John Brant, Brian Foote, Ralph Johnson, and Don Roberts. Wrappers to the rescue. In Proceedings European Conference on Object Oriented Programming (ECOOP 1998), volume 1445 of LNCS, pages 396–417. Springer-Verlag, 1998.

[10] Jean-Pierre Briot. Actalk: A testbed for classifying and designing actor languages in the Smalltalk-80 environment. In S. Cook, editor, Proceedings ECOOP ’89, pages 109–129, Nottingham, July 1989. Cambridge University Press.

[11] S. Chiba and M. Nishizawa. An easy-to-use toolkit for efficient Java bytecode translators. In Proceedings of GPCE’03, volume 2830 of LNCS, pages 364–376, 2003.

[12] Steve Dekorte. Io: a small programming language. In Ralph Johnson and Richard P. Gabriel, editors, Companion to the 20th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, OOPSLA 2005, October 16-20, 2004, San Diego, CA, USA, pages 166–167. ACM, 2005.

[13] Marcus Denker, Stéphane Ducasse, and Éric Tanter. Runtime bytecode transformation for Smalltalk. Journal of Computer Languages, Systems and Structures, 32(2-3):125–139, July 2006.

[14] Sergey Dimitriev. Language oriented programming: The next programming paradigm. onBoard Online Magazine, 1(1), November 2004.

[15] Stéphane Ducasse. Evaluating message passing control techniques in Smalltalk. Journal of Object-Oriented Programming (JOOP), 12(6):39–44, June 1999.

[16] Jonathan Edwards. Subtext: uncovering the simplicity of programming. In Ralph Johnson and Richard P. Gabriel, editors, Proceedings of the 20th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, OOPSLA 2005, October 16-20, 2004, San Diego, CA, USA, pages 505–518. ACM, 2005.

[17] Brian Foote and Ralph E. Johnson. Reflective facilities in Smalltalk-80. In Proceedings OOPSLA ’89, ACM SIGPLAN Notices, volume 24, pages 327–336, October 1989.

[18] Michael Franz and Thomas Kistler. Slim binaries. Commun. ACM, 40(12):87–94, 1997.

[19] Adele Goldberg and Dave Robson. Smalltalk-80: The Language. Addison Wesley, 1989.

[20] Niklaus Haldimann. Typeplug — pluggable type systems for smalltalk. Master’s thesis, University of Bern, April 2007.

[21] Jurgen Herczeg Heinz-Dieter Bocker. What tracers are made of. In Proceedings of OOPSLA/ECOOP ’90, pages 89–99, October 1990.

[22] Dan Ingalls, Ted Kaehler, John Maloney, Scott Wallace, and Alan Kay. Back to the future: The story of Squeak, A practical Smalltalk written in itself. In Proceedings OOPSLA ’97, ACM SIGPLAN Notices, pages 318–326. ACM Press, November 1997.

[23] Gregor Kiczales, Jim des Rivières, and Daniel G. Bobrow. The Art of the Metaobject Protocol. MIT Press, 1991.

[24] Gregor Kiczales, John Lamping, Anurag Mendhekar, Chris Maeda, Cristina Lopes, Jean-Marc Loingtier, and John Irwin. Aspect-Oriented Programming. In Mehmet Aksit and Satoshi Matsuoka, editors, Proceedings ECOOP ’97, volume 1241 of LNCS, pages 220–242, Jyvaskyla, Finland, June 1997. Springer-Verlag.

[25] Wilf R. LaLonde and Mark Van Gulik. Building a backtracking facility in Smalltalk without kernel support. In Proceedings OOPSLA ’88, ACM SIGPLAN Notices, volume 23, pages 105–122, November 1988.

[26] Bill Lewis and Mireille Ducassé. Using events to debug Java programs backwards in time. In OOPSLA Companion 2003, pages 96–97, 2003.

[27] Pattie Maes. Concepts and experiments in computational reflection. In Proceedings OOPSLA ’87, ACM SIGPLAN Notices, volume 22, pages 147–155, December 1987.

[28] Philippe Marschall. Persephone: Taking Smalltalk reflection to the sub-method level. Master’s thesis, University of Bern, December 2006.

[29] Jeff McAffer. Meta-level programming with coda. In W. Olthoff, editor, Proceedings ECOOP ’95, volume 952 of LNCS, pages 190–214, Aarhus, Denmark, August 1995. Springer-Verlag.

[30] J. McCarthy. Recursive functions of symbolic expressions and their computation by machine, part I. CACM, 3(4):184–195, April 1960.

[31] Paul L. McCullough. Transparent forwarding: First steps. In Proceedings OOPSLA’87, ACM SIGPLAN Notices, volume 22, pages 331–341, December 1987.

[32] Scott Meyers. Difficulties in integrating multiview development systems. IEEE Softw., 8(1):49–57, 1991.

[33] Geoffrey A. Pascoe. Encapsulators: A new software paradigm in Smalltalk-80. In Proceedings OOPSLA ’86, ACM SIGPLAN Notices, volume 21, pages 341–346, November 1986.

[34] Renaud Pawlak. Spoon: annotation-driven program transformation — the aop case. In AOMD ’05: Proceedings of the 1st workshop on Aspect oriented middleware development, New York, NY, USA, 2005. ACM Press.

[35] Don Roberts, John Brant, Ralph E. Johnson, and Bill Opdyke. An automated refactoring tool. In Proceedings of ICAST ’96, Chicago, IL, April 1996.

[36] David Röthlisberger, Marcus Denker, and Éric Tanter. Unanticipated partial behavioral reflection. In Advances in Smalltalk — Proceedings of 14th International Smalltalk Conference (ISC 2006), volume 4406 of LNCS, pages 47–65. Springer, 2007.

[37] Brian Cantwell Smith. Reflection and semantics in a procedural language. Technical Report TR-272, MIT, Cambridge, MA, 1982.

[38] Éric Tanter. An extensible kernel language for AOP. In Proceedings of AOSD Workshop on Open and Dynamic Aspect Languages, Bonn, Germany, 2006.

[39] Éric Tanter and Jacques Noyé. A versatile kernel for multi-language AOP. In Proceedings of the 4th ACM SIGPLAN/SIGSOFT Conference on Generative Programming and Component Engineering (GPCE 2005), volume 3676 of LNCS, Tallin, Estonia, sep 2005.

[40] Éric Tanter, Jacques Noyé, Denis Caromel, and Pierre Cointe. Partial behavioral reflection: Spatial and temporal selection of reification. In Proceedings of OOPSLA’03, ACM SIGPLAN Notices, pages 27–46, nov 2003.

[41] Daniel Vainsencher and Andrew P. Black. A pattern language for extensible program representation. In Proceedings of PLoP 2006, 2006.

[42] Yasuhiko Yokote and Mario Tokoro. Experience and evolution of ConcurrentSmalltalk. In Proceedings OOPSLA ’87, ACM SIGPLAN Notices, volume 22, pages 406–415, December 1987.

About the authors



  Marcus Denker is a PhD student at the Software Composition Group of the University of Bern. His research focuses on reflection and metaprogramming for dynamic languages. Marcus Denker is an active participant in the Squeak open source community for many years. He received a Dipl.-Inform. (MSc) from the University of Karlsruhe/Germany. Contact him at denker@iam.unibe.ch.


  Stéphane Ducasse is a Research Director at INRIA Futurs. He was full Professor at Universit de Savoie, where he led the Language and Software Evolution Group. Before that he co-directed with O. Nierstrasz the Software Composition Group of University of Bern. He is the president of the European Smalltalk User Group and has a lof of fun with Smalltalk. He is involved in the development of Squeak the open-source Smalltalk. His research interests include dynamic languages, reflective systems, reengineering and maintenance, program visualization, and metamodeling. He hold a PhD from the Universit de Nice-Sophia Antipolis in Computer Sciences. You can contact him at stephane.ducasse@inria.fr.


  Adrian Lienhard is a doctoral candidate in computer science at the University of Bern and co-founder of netstyle.ch, a startup company specialized in business Web application development. His research interests include reengineering of object-oriented systems, dynamic analysis, dynamic languages, language design, program visualization and Web applications. He received his MSc in computer science from the University of Bern. Contact him at lienhard@iam.unibe.ch.
 

Philippe Marschall is Software Engineer at Netcetera, a swiss IT services company. He received his MSc in computer science from the University of Bern. Contact him at philippe.marschall@gmail.com.


Cite this document as follows: Marcus Denker, Stéphane Ducasse, Andrian Lienhard, Philippe Marschall: "Sub-Method Reflection", in Journal of Object Technology, Special Issue TOOLS
Europe 2007, October 2007, vol. 6, no. 9, pages 231-251, http://www.jot.fm/issues/issue_2007_10/paper14/


Previous paper

Next paper