Constructing persistent object-oriented models with standard C++

Alexander Kozynchenko, Department of Information Technology and Media, Mid Sweden University, Sundsvall, Sweden

REFEREED ARTICLE


Abstract

This paper suggests an approach and a design pattern for developing object-oriented models that need to be persistent, including databases of moderate size, using only standard C++ and its file storage facilities, without specific C++ dialects or external libraries that provide persistence. Objects of the model may be of a great variety of types belonging to a complex class hierarchy, and are assumed to have a rather general structure, containing both pointers to other objects of the model and dynamically allocated arrays of various types. The main idea is that all types involved are classes derived from a single base class with a minimal common interface. Objects are allocated dynamically, and their addresses are kept in the model’s container of base-class pointers, which supports sorting, searching, and modifying the stored objects. Serialization, reading, and management of the objects are implemented with virtual functions, a list of type names, and the object factory technique.


1 INTRODUCTION

This paper treats the problem of developing object-oriented models whose objects are dynamically allocated on the free store and linked to each other with pointers. Object-oriented databases and knowledge-based systems are an important kind of such models. Once created, a model may run for some time, change, and then wait for the next run. Naturally, such a model should be kept permanently in secondary storage, ready to be loaded and used, instead of being rebuilt together with its relationships from scratch each time. When programming persistent object-oriented models in C++, it is useful to find ways of writing the programs in the standard C++ language [Stroustrup1997] rather than relying on extensions to C++ such as the ObjectStore and POET systems [Piattini2000] or on the support of class libraries such as MFC.

2 OUTLINE OF THE STRUCTURE OF THE PROPOSED OBJECT-ORIENTED MODEL AND C++ DESIGN PATTERN

One possible solution is as follows. The object-oriented model may be represented as a class, say Model, whose main purpose is to be a repository for the addresses of objects of user-defined types. All types involved in constructing the model are classes derived from a common base class, say Base, as shown in the Booch diagram in Fig. 1.

Figure 1. Generalized scheme of the C++ persistent object-oriented model

The derived classes may, in turn, be members of other inheritance hierarchies. The dynamically allocated objects of these derived classes are identified by base-class pointers, which are placed into a container, for instance an STL deque that is a member of Model. An example of a declaration of class Model that provides minimal functionality is as follows:
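A minimal sketch of such a declaration is given here. It assumes a reduced Base class, a deque named v, and function-pointer typedefs FuncPCr and FuncPLess whose exact signatures are guesses; the accessors Size and GetV and the ownership policy in the destructor are hypothetical additions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <map>
#include <string>

// Hypothetical reduced base class; only the type name is kept here.
class Base {
public:
    explicit Base(const std::string& name) : typeName(name) {}
    virtual ~Base() {}
    const std::string& GetTypeName() const { return typeName; }
private:
    std::string typeName;
};

// Assumed signatures of the per-type global helper functions.
typedef Base* (*FuncPCr)();
typedef bool (*FuncPLess)(const Base*, const Base*);

// Pair of function pointers stored for every registered type.
struct FuncPointers {
    FuncPCr create;
    FuncPLess less;
};

class Model {
public:
    Model() {}
    ~Model() {
        // In this sketch the model owns the objects whose addresses it stores.
        for (std::size_t i = 0; i < v.size(); ++i) delete v[i];
    }
    // Adds a new base-class pointer and returns its index in the container.
    std::size_t SetV(Base* p) {
        assert(p != 0);
        v.push_back(p);
        return v.size() - 1;
    }
    std::size_t Size() const { return v.size(); }
    Base* GetV(std::size_t i) const { return v.at(i); }
private:
    Model(const Model&);             // copying is disabled in this sketch
    Model& operator=(const Model&);
    std::deque<Base*> v;                           // repository of object addresses
    std::map<std::string, FuncPointers> typeList;  // type name -> helper functions
};
```

The index returned by SetV is the value that later stands in for the pointer when the model is written to a file.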

The second data member of Model, typeList, represents the list of types involved in the model. It is used in the object factory (see, e.g., [Alexandrescu2001]) when data are read from secondary storage and objects are created, as well as in the factory-like algorithm that sorts objects of a given type. It is implemented as an associative container, the STL map, whose keys of type string are the type names (or type IDs) defined in the base class. The associated values are objects of type FuncPointers, which hold two pointers, FuncPCr and FuncPLess, to global functions that provide object creation and a less-than sorting criterion for a given class (e.g., the functions CreateA and LessA described in the next section). The member typeList is filled in the default constructor of Model.

Model has two more auxiliary data members, fb and pWrite, declared as an object of class basic_filebuf and a pointer to this type; they are also set up in the default constructor of Model. This is done for the sake of data safety: a failed attempt to open the file while the Model object is being created does not risk losing any of the model’s data. So, the default constructor of Model is defined as follows:


where A, B, …, Z are the types involved in the model. The private helper function OpenToWrite keeps the file open for writing while the Model object exists. It should contain some exception handling:
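The constructor and the helper described above can be sketched together as follows. This is an illustration under stated assumptions, not the original listing: the file name is passed as a constructor argument, failure is reported with std::runtime_error, the file is opened in append mode so that existing contents are not truncated, and IsOpenForWriting is a hypothetical accessor. Registration of the types in typeList is only indicated by a comment.

```cpp
#include <cassert>
#include <fstream>
#include <ios>
#include <stdexcept>
#include <string>

class Model {
public:
    // The file name parameter is an assumption made for this sketch.
    explicit Model(const std::string& fileName) : pWrite(0) {
        assert(!fileName.empty());
        // Registration of the types A, B, ..., Z in typeList would go here.
        OpenToWrite(fileName);
    }
    bool IsOpenForWriting() const { return pWrite != 0 && fb.is_open(); }
private:
    // Opens the file for binary output; it stays open while the object exists.
    void OpenToWrite(const std::string& fileName) {
        // std::ios::app is an assumption: it avoids truncating existing data.
        pWrite = fb.open(fileName.c_str(),
                         std::ios::out | std::ios::app | std::ios::binary);
        if (pWrite == 0)
            throw std::runtime_error("cannot open " + fileName + " for writing");
    }
    std::filebuf fb;       // basic_filebuf<char>
    std::filebuf* pWrite;  // result of fb.open; null if the file could not be opened
};
```

Because the exception leaves the constructor, no Model object exists after a failed open, so no model data can be modified in that case.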

The member function SetV adds a new base-class pointer to the array. Definitions of other important member functions and nested function-like structures are given in Sections 4 and 5.

Thus, each object involved in the model is identified both by its pointer and by the index of the array element where this pointer is kept. When work with the Model is finished, all objects must be stored on the hard disk in such a way that all relationships expressed by the pointers are restored correctly when the data are retrieved later. To meet this requirement, we have to keep a one-to-one correspondence between an object’s data member, a pointer to some dynamically allocated object, and the index of the model’s array cell where this pointer is kept.

To solve problems related to model persistence and data processing, a suitable scheme of type name management is used. It relies on type names (actually type identifiers) of type string, provided by the objects of the model and kept in typeList. Duplicate type names cannot occur, because an associative array such as the STL map holds unique keys. As stated above, typeList is used in the factory algorithms.

3 DEFINITIONS OF CLASSES INVOLVED IN THE MODEL. OBJECTS’ SERIALIZATION AND READING

At this point we interrupt the description of class Model to discuss the structure of the objects held in the model. The class Base has the minimal functionality necessary for making objects persistent and for preserving the relationships between objects. It may be declared as follows:
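A possible shape of this declaration is sketched here. The member names typeName, links, indexes, PutLink, and GetLinks follow the text; ClearLinks, PutIndex, GetIndexes, and ClearIndexes are hypothetical accessors, and the virtual functions are given empty bodies only so that the sketch compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <iosfwd>
#include <string>

class Base {
public:
    explicit Base(const std::string& name) : typeName(name) {}
    virtual ~Base() {}

    const std::string& GetTypeName() const { return typeName; }

    // Association with another object of the model.
    void PutLink(Base* p) { assert(p != 0); links.push_back(p); }
    const std::deque<Base*>& GetLinks() const { return links; }
    void ClearLinks() { links.clear(); }

    // Model indexes that stand in for the pointers while the data are on disk.
    void PutIndex(std::size_t i) { indexes.push_back(i); }
    const std::deque<std::size_t>& GetIndexes() const { return indexes; }
    void ClearIndexes() { indexes.clear(); }

    // Serialization interface, to be overridden in derived classes.
    virtual void Write(std::ostream&) const {}
    virtual void Read(std::istream&) {}

private:
    std::string typeName;             // type identifier passed by the derived class
    std::deque<Base*> links;          // pointers to associated objects
    std::deque<std::size_t> indexes;  // positions of those objects in the model
};
```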

The class Base contains the data member typeName, which is initialized in the parameterized constructor of Base, as well as two dynamic arrays, links and indexes. The array links keeps base-class pointers to the other objects this object is linked with, providing the association relationships. Just before the data are written to the hard disk and the program terminates, links may be emptied, because the pointers become invalid. The indexes of the Model’s array v that correspond to these pointers are kept in the array indexes. That is, if links[j] is equal to v[i], then indexes[j] is equal to i. In contrast to links, the array indexes is used only when writing and reading data, so it can be empty for almost all of the run time. For the sake of brevity, we do not give definitions of the access functions (PutLink, GetLinks, etc.); they are implemented in the usual way.

The class Base has two non-virtual member functions for writing/reading the type name:
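One way to write these two functions is sketched here. The on-disk layout (a length prefix followed by the raw characters) and the choice to make ReadTypeName static are assumptions of this sketch; RoundTrip is a hypothetical helper that only demonstrates usage with an in-memory stream.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

class Base {
public:
    explicit Base(const std::string& name) : typeName(name) {}
    virtual ~Base() {}
    const std::string& GetTypeName() const { return typeName; }

    // Writes the name as a length prefix followed by the raw characters.
    void WriteTypeName(std::ostream& os) const {
        const std::string::size_type n = typeName.size();
        os.write(reinterpret_cast<const char*>(&n), sizeof n);
        os.write(typeName.data(), static_cast<std::streamsize>(n));
    }

    // Reads a name written by WriteTypeName. It is static here because the
    // object does not exist yet when the factory needs the name.
    static std::string ReadTypeName(std::istream& is) {
        std::string::size_type n = 0;
        if (!is.read(reinterpret_cast<char*>(&n), sizeof n)) return std::string();
        std::string name(n, '\0');
        if (n > 0) is.read(&name[0], static_cast<std::streamsize>(n));
        return name;
    }
private:
    std::string typeName;
};

// Usage: writes the name to an in-memory binary stream and reads it back.
inline std::string RoundTrip(const Base& b) {
    std::stringstream ss(std::ios::in | std::ios::out | std::ios::binary);
    b.WriteTypeName(ss);
    assert(ss.good());
    return Base::ReadTypeName(ss);
}
```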

Two virtual member functions, Write and Read, are intended respectively for correct serialization and for the subsequent creation of transient objects, and are to be overridden in derived classes. Their definitions in Base are as follows:
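A minimal sketch of these two functions follows, assuming that the base part of an object consists of the indexes array only and that it is stored as an element count followed by the raw index values. RoundTripDemo is a hypothetical usage helper.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

class Base {
public:
    explicit Base(const std::string& name) : typeName(name) {}
    virtual ~Base() {}
    void PutIndex(std::size_t i) { indexes.push_back(i); }
    const std::deque<std::size_t>& GetIndexes() const { return indexes; }

    // Writes the number of indexes, then each index as raw bytes.
    virtual void Write(std::ostream& os) const {
        const std::size_t n = indexes.size();
        os.write(reinterpret_cast<const char*>(&n), sizeof n);
        for (std::size_t j = 0; j < n; ++j) {
            const std::size_t k = indexes[j];
            os.write(reinterpret_cast<const char*>(&k), sizeof k);
        }
    }

    // Restores the indexes; the links are rebuilt later by the model.
    virtual void Read(std::istream& is) {
        indexes.clear();
        std::size_t n = 0;
        if (!is.read(reinterpret_cast<char*>(&n), sizeof n)) return;
        for (std::size_t j = 0; j < n; ++j) {
            std::size_t k = 0;
            if (!is.read(reinterpret_cast<char*>(&k), sizeof k)) return;
            indexes.push_back(k);
        }
    }
private:
    std::string typeName;
    std::deque<std::size_t> indexes;
};

// Usage: an in-memory round trip, checked with assert.
inline void RoundTripDemo() {
    Base a("A"), b("A");
    a.PutIndex(3);
    a.PutIndex(7);
    std::stringstream ss(std::ios::in | std::ios::out | std::ios::binary);
    a.Write(ss);
    b.Read(ss);
    assert(b.GetIndexes() == a.GetIndexes());
}
```

The elements are written one by one because a deque, unlike a vector, does not guarantee contiguous storage.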

The write and read member functions of the classes ostream and istream serve as reliable equivalents of the streaming operators << and >>: they transfer raw bytes and therefore need no delimiters.

As an example of an application class involved in the model, we take a class A of rather general form, namely one having a dynamic array of char (a C-style string) and an STL dynamic array, a vector of double.


Omitting the ordinary routines for access, deep copy, etc., and focusing on correct serialization, we give the definitions of the overridden virtual functions A::Write and A::Read:
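The following sketch shows how such a class and its overridden functions might look. The member names name and values, the accessors, and the binary layout (length or count followed by raw bytes) are assumptions; Base is reduced to a stand-in whose Write and Read do nothing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Reduced stand-in for Base (handling of the indexes array is omitted).
class Base {
public:
    explicit Base(const std::string& n) : typeName(n) {}
    virtual ~Base() {}
    virtual void Write(std::ostream&) const {}
    virtual void Read(std::istream&) {}
private:
    std::string typeName;
};

class A : public Base {
public:
    A() : Base("A"), name(0) { SetName(""); }
    ~A() { delete[] name; }
    void SetName(const char* s) {
        assert(s != 0);
        char* copy = new char[std::strlen(s) + 1];
        std::strcpy(copy, s);
        delete[] name;
        name = copy;
    }
    const char* GetName() const { return name; }
    std::vector<double>& Values() { return values; }

    virtual void Write(std::ostream& os) const {
        Base::Write(os);  // base part first
        const std::size_t len = std::strlen(name);
        os.write(reinterpret_cast<const char*>(&len), sizeof len);
        os.write(name, static_cast<std::streamsize>(len));
        const std::size_t n = values.size();
        os.write(reinterpret_cast<const char*>(&n), sizeof n);
        if (n > 0)  // vector storage is contiguous, so one call is enough
            os.write(reinterpret_cast<const char*>(&values[0]),
                     static_cast<std::streamsize>(n * sizeof(double)));
    }

    virtual void Read(std::istream& is) {
        Base::Read(is);
        std::size_t len = 0;
        if (!is.read(reinterpret_cast<char*>(&len), sizeof len)) return;
        std::vector<char> buf(len + 1, '\0');  // extra element terminates the string
        if (len > 0) is.read(&buf[0], static_cast<std::streamsize>(len));
        SetName(&buf[0]);
        std::size_t n = 0;
        if (!is.read(reinterpret_cast<char*>(&n), sizeof n)) return;
        values.assign(n, 0.0);
        if (n > 0)
            is.read(reinterpret_cast<char*>(&values[0]),
                    static_cast<std::streamsize>(n * sizeof(double)));
    }
private:
    A(const A&);             // deep copy is omitted in this sketch
    A& operator=(const A&);
    char* name;                  // dynamically allocated C-style string
    std::vector<double> values;  // STL dynamic array
};
```

Each derived class calls the base version first, so the file holds the base part of an object followed by the part added by the derived class.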


We should also note a global helper function CreateA responsible for creating an object of class A in the object factory and called when reading data in the model:

Another helper function that gives a sorting criterion specific for the class A is a global predicate LessA:

It performs a static downcast of the arguments and calls a global predicate Less that provides the lexicographical comparison of C-style strings.
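The two helpers described above might look as follows. The function names CreateA, LessA, and Less come from the text; the reduced classes Base and A (here holding a std::string instead of a dynamic char array) and the accessor GetName are assumptions made to keep the sketch short.

```cpp
#include <cassert>
#include <cstring>
#include <string>

class Base {
public:
    explicit Base(const std::string& n) : typeName(n) {}
    virtual ~Base() {}
    const std::string& GetTypeName() const { return typeName; }
private:
    std::string typeName;
};

// Reduced class A; a std::string replaces the dynamic char array here.
class A : public Base {
public:
    explicit A(const char* s = "") : Base("A"), name(s) {}
    const char* GetName() const { return name.c_str(); }
private:
    std::string name;
};

// Lexicographical comparison of C-style strings.
bool Less(const char* x, const char* y) { return std::strcmp(x, y) < 0; }

// Factory helper: a default-constructed A seen through a base-class pointer.
Base* CreateA() { return new A; }

// Sorting criterion for A; both arguments must really point to A objects,
// because static_cast performs no run-time check.
bool LessA(const Base* x, const Base* y) {
    assert(x != 0 && y != 0);
    const A* ax = static_cast<const A*>(x);
    const A* ay = static_cast<const A*>(y);
    return Less(ax->GetName(), ay->GetName());
}
```

Both functions share signatures that do not mention A, which is what allows their addresses to be stored in one list for all types.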

4 MODEL’S DATA SERIALIZATION AND RETRIEVING

We can now return to the member functions of Model, concentrating on the data writing and reading routines and other related functions. The first of them is a function for adding new pairs of a type name and function pointers to the associative array typeList:
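Such a function could be sketched as below. The name AddType, its signature, its boolean result, and the accessor TypeCount are hypothetical; CreateDummy and LessDummy are placeholder helpers for a single registered type.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

class Base {
public:
    virtual ~Base() {}
};

typedef Base* (*FuncPCr)();
typedef bool (*FuncPLess)(const Base*, const Base*);

struct FuncPointers {
    FuncPCr create;
    FuncPLess less;
};

class Model {
public:
    // Registers a type. Returns false when the name is already present;
    // in that case the existing entry is left unchanged.
    bool AddType(const std::string& typeName, FuncPCr create, FuncPLess less) {
        assert(create != 0 && less != 0);
        FuncPointers fp = { create, less };
        return typeList.insert(std::make_pair(typeName, fp)).second;
    }
    std::size_t TypeCount() const { return typeList.size(); }
private:
    std::map<std::string, FuncPointers> typeList;
};

// Hypothetical helpers for one registered type.
Base* CreateDummy() { return new Base; }
bool LessDummy(const Base*, const Base*) { return false; }
```

Using map::insert rather than operator[] means that a second registration under the same name cannot silently replace the first.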

Next, let us consider the member function responsible for serialization, Model::WriteToFile. Just before the model is written to secondary storage, the array links of each object kept in the Model’s array holds base-class pointers to other objects according to some relationship scheme. The elements of the array indexes, intended for keeping indexes of the Model’s array v, have not been defined so far, because at run time the objects are identified by pointers. Obviously, these pointers become invalid once the program terminates, so other object identifiers must be introduced: the indexes of the Model’s array can serve this purpose. The initial fragment of Model::WriteToFile stores the indexes k of the Model’s array elements v[k] in the indexes arrays:

The next fragment for writing the objects to a binary file follows:

It writes the size of the Model’s array first. Then, for each object, the type name is written immediately before the object itself, as shown in Fig. 1, segment “File structure”. In principle, such a serialization may be just an intermediate save, after which work with the model can continue: the filebuf object fb is not closed, and the pointer pWrite remains non-null.
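The two fragments of WriteToFile can be sketched as free functions over an output stream. This is a simplification under stated assumptions: AssignIndexes and WriteObjects are hypothetical names, the pointer-to-index lookup uses a temporary std::map, and the stream replaces the filebuf object of the model so that the sketch can be tried in memory.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

class Base {
public:
    explicit Base(const std::string& n) : typeName(n) {}
    virtual ~Base() {}
    void PutLink(Base* p) { links.push_back(p); }
    const std::deque<Base*>& GetLinks() const { return links; }
    void SetIndexes(const std::deque<std::size_t>& ix) { indexes = ix; }
    const std::deque<std::size_t>& GetIndexes() const { return indexes; }
    void WriteTypeName(std::ostream& os) const {
        const std::size_t n = typeName.size();
        os.write(reinterpret_cast<const char*>(&n), sizeof n);
        os.write(typeName.data(), static_cast<std::streamsize>(n));
    }
    virtual void Write(std::ostream& os) const {
        const std::size_t n = indexes.size();
        os.write(reinterpret_cast<const char*>(&n), sizeof n);
        for (std::size_t j = 0; j < n; ++j) {
            const std::size_t k = indexes[j];
            os.write(reinterpret_cast<const char*>(&k), sizeof k);
        }
    }
private:
    std::string typeName;
    std::deque<Base*> links;
    std::deque<std::size_t> indexes;
};

// Fragment 1: for every object, record the index in v of each linked object.
void AssignIndexes(const std::deque<Base*>& v) {
    std::map<const Base*, std::size_t> position;
    for (std::size_t k = 0; k < v.size(); ++k) position[v[k]] = k;
    for (std::size_t i = 0; i < v.size(); ++i) {
        std::deque<std::size_t> ix;
        const std::deque<Base*>& links = v[i]->GetLinks();
        for (std::size_t j = 0; j < links.size(); ++j) {
            std::map<const Base*, std::size_t>::const_iterator it =
                position.find(links[j]);
            assert(it != position.end());  // every link must point into the model
            ix.push_back(it->second);
        }
        v[i]->SetIndexes(ix);
    }
}

// Fragment 2: container size first, then type name and object for each element.
void WriteObjects(const std::deque<Base*>& v, std::ostream& os) {
    const std::size_t n = v.size();
    os.write(reinterpret_cast<const char*>(&n), sizeof n);
    for (std::size_t k = 0; k < n; ++k) {
        v[k]->WriteTypeName(os);
        v[k]->Write(os);  // polymorphic call
    }
}
```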

When we start retrieving the data from a file by calling Model::ReadFromFile, we first close the file object fb if it is open for writing and then perform the following operations:

  • The file is opened for reading, with some kind of exception handling.
  • The size of the Model’s array is read and the array v is resized.
  • The object factory then operates within a loop: the type name of the next object is read, and a default-constructed, dynamically allocated object of that type is created by calling the private helper member function CreateObject.
  • A polymorphic call of the object’s virtual function Read fills the object with its actual content.
  • The Model’s array is filled with base-class pointers to the objects in the order of reading.
  • The relationships between objects are restored using the indexes of the Model’s array kept in the indexes arrays.
  • The file object fb is closed and re-opened for possible subsequent writing.

The code example for ReadFromFile is as follows:
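The reading steps listed above can be sketched as a free function over an input stream. Opening, closing, and re-opening the file are left out, the list of types is reduced to creation functions only, links and indexes are public to shorten the code, and the names ReadObjects, TypeList, and CreateNode are hypothetical. The binary layout is the assumed one: a count or length followed by raw bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <istream>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

class Base {
public:
    explicit Base(const std::string& n) : typeName(n) {}
    virtual ~Base() {}
    std::deque<Base*> links;          // public only to keep the sketch short
    std::deque<std::size_t> indexes;
    virtual void Read(std::istream& is) {
        indexes.clear();
        std::size_t n = 0;
        if (!is.read(reinterpret_cast<char*>(&n), sizeof n)) return;
        for (std::size_t j = 0; j < n; ++j) {
            std::size_t k = 0;
            if (!is.read(reinterpret_cast<char*>(&k), sizeof k)) return;
            indexes.push_back(k);
        }
    }
private:
    std::string typeName;
};

typedef Base* (*FuncPCr)();
typedef std::map<std::string, FuncPCr> TypeList;  // reduced: creation only

std::string ReadTypeName(std::istream& is) {
    std::size_t n = 0;
    if (!is.read(reinterpret_cast<char*>(&n), sizeof n)) return std::string();
    std::string name(n, '\0');
    if (n > 0) is.read(&name[0], static_cast<std::streamsize>(n));
    return name;
}

// Reads the whole model from is into v, which must be empty on entry.
// If an exception is thrown, the caller still owns the objects already in v.
void ReadObjects(std::istream& is, const TypeList& types, std::deque<Base*>& v) {
    assert(v.empty());
    std::size_t n = 0;
    if (!is.read(reinterpret_cast<char*>(&n), sizeof n))
        throw std::runtime_error("cannot read model size");
    for (std::size_t k = 0; k < n; ++k) {
        const std::string name = ReadTypeName(is);
        TypeList::const_iterator it = types.find(name);
        if (!is || it == types.end())
            throw std::runtime_error("unknown type: " + name);
        Base* p = it->second();  // object factory: default-constructed object
        v.push_back(p);          // stored in the order of reading
        p->Read(is);             // polymorphic call fills in the content
        if (!is) throw std::runtime_error("truncated object data");
    }
    // Restore the relationships: indexes[j] == i means links[j] == v[i].
    for (std::size_t k = 0; k < n; ++k) {
        v[k]->links.clear();
        for (std::size_t j = 0; j < v[k]->indexes.size(); ++j)
            v[k]->links.push_back(v.at(v[k]->indexes[j]));
    }
}

// Hypothetical creator for a type named "Node".
Base* CreateNode() { return new Base("Node"); }
```

The links are restored in a second pass because an object may refer to another object that appears later in the file.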


A helper function CreateObject operating with the list of types finds a proper function pointer to a global function, like CreateA, that, in turn, directly creates an object of a type specified in the argument:
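A minimal sketch of this lookup is given here. The classes A and B with their creation functions are placeholders, FuncPointers is reduced to the creation pointer, returning a null pointer for an unknown type name is an assumption, and CreateObject is made public only so that it can be called from outside.

```cpp
#include <cassert>
#include <map>
#include <string>

class Base {
public:
    explicit Base(const std::string& n) : typeName(n) {}
    virtual ~Base() {}
    const std::string& GetTypeName() const { return typeName; }
private:
    std::string typeName;
};

// Placeholder application classes and their global creation functions.
class A : public Base { public: A() : Base("A") {} };
class B : public Base { public: B() : Base("B") {} };
Base* CreateA() { return new A; }
Base* CreateB() { return new B; }

typedef Base* (*FuncPCr)();
struct FuncPointers { FuncPCr create; };  // the sorting predicate is omitted here

class Model {
public:
    Model() {
        FuncPointers a = { CreateA };
        FuncPointers b = { CreateB };
        typeList["A"] = a;
        typeList["B"] = b;
    }
    // Private in the text; public here for the sake of the example.
    // Returns a null pointer when the type name is not registered.
    Base* CreateObject(const std::string& typeName) const {
        std::map<std::string, FuncPointers>::const_iterator it =
            typeList.find(typeName);
        if (it == typeList.end()) return 0;
        assert(it->second.create != 0);
        return it->second.create();
    }
private:
    std::map<std::string, FuncPointers> typeList;
};
```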

5 MODEL’S DATA MANAGEMENT: SORTING OBJECTS

Objects being kept in the Model’s container should be sorted somehow in order to ensure effective data management. Sorting may be implemented in two stages:

first, sorting objects according to their type names held in the Base’s data member typeName;

second, sorting objects within each type using some keys specified for a given type.

The first step can be done by lexicographical sorting of the type names with a predicate provided by a nested function-like class LessTypeNames. The corresponding function object is used in the Model’s sorting member function SortByTypes. The code example is shown below:
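The first stage might be written as in the sketch below. The names LessTypeNames and SortByTypes follow the text; the reduced Model (which does not own its objects here), the accessor GetV, and the use of std::stable_sort, which keeps the relative order of objects of the same type, are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>
#include <string>

class Base {
public:
    explicit Base(const std::string& n) : typeName(n) {}
    virtual ~Base() {}
    const std::string& GetTypeName() const { return typeName; }
private:
    std::string typeName;
};

class Model {
public:
    void SetV(Base* p) { assert(p != 0); v.push_back(p); }
    const std::deque<Base*>& GetV() const { return v; }

    // Nested function-like class: orders pointers by the type name of the pointee.
    struct LessTypeNames {
        bool operator()(const Base* x, const Base* y) const {
            return x->GetTypeName() < y->GetTypeName();
        }
    };

    // Groups the objects so that equal type names become adjacent.
    void SortByTypes() {
        std::stable_sort(v.begin(), v.end(), LessTypeNames());
    }
private:
    std::deque<Base*> v;  // non-owning in this sketch
};
```

Sorting changes the positions of the pointers in v, which is harmless because the indexes arrays are recomputed each time the model is written.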

In the second stage, the subsequences of objects of the same type must first be delimited, which is done by a member function EqualRange:
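One possible form of EqualRange is sketched here as a free function built on std::equal_range. It assumes that the container is already sorted by type names and uses a temporary Base object as the search key, which is a choice of this sketch; the typedef Iter is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>
#include <string>
#include <utility>

class Base {
public:
    explicit Base(const std::string& n) : typeName(n) {}
    virtual ~Base() {}
    const std::string& GetTypeName() const { return typeName; }
private:
    std::string typeName;
};

struct LessTypeNames {
    bool operator()(const Base* x, const Base* y) const {
        assert(x != 0 && y != 0);
        return x->GetTypeName() < y->GetTypeName();
    }
};

typedef std::deque<Base*>::iterator Iter;

// Returns the half-open range of elements whose type name equals typeName.
// Precondition: v is sorted with LessTypeNames.
std::pair<Iter, Iter> EqualRange(std::deque<Base*>& v, const std::string& typeName) {
    Base probe(typeName);  // temporary key object that carries only the name
    return std::equal_range(v.begin(), v.end(),
                            static_cast<Base*>(&probe), LessTypeNames());
}
```

For a type name that does not occur in the container, the two returned iterators are equal, so the range is empty.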


Knowing the delimiters obtained from EqualRange, we can sort objects of the same type in the member function SortInType:

The sorting predicate is provided by a nested function-like class LessInType:

The function-call operator uses the list of types and function pointers to choose the proper global predicate (e.g., LessA) for sorting objects of the specified type; this is, in effect, based on the object factory technique.
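The second stage can be sketched as follows, with SortInType written as a free function. The list of types is reduced to the sorting predicate, the class Item with its integer key and the predicate LessItem are hypothetical, and treating objects of an unregistered type as equivalent is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>
#include <map>
#include <string>
#include <utility>

class Base {
public:
    explicit Base(const std::string& n) : typeName(n) {}
    virtual ~Base() {}
    const std::string& GetTypeName() const { return typeName; }
private:
    std::string typeName;
};

typedef bool (*FuncPLess)(const Base*, const Base*);
typedef std::map<std::string, FuncPLess> TypeList;  // reduced: predicate only

struct LessTypeNames {
    bool operator()(const Base* x, const Base* y) const {
        return x->GetTypeName() < y->GetTypeName();
    }
};

// Chooses the per-type global predicate through the list of types.
class LessInType {
public:
    explicit LessInType(const TypeList& t) : types(&t) {}
    bool operator()(const Base* x, const Base* y) const {
        assert(x->GetTypeName() == y->GetTypeName());
        TypeList::const_iterator it = types->find(x->GetTypeName());
        // Objects of an unregistered type compare as equivalent.
        return it != types->end() && it->second(x, y);
    }
private:
    const TypeList* types;  // a pointer keeps the function object copyable
};

// Sorts only the objects of the given type; v must be sorted by type names.
void SortInType(std::deque<Base*>& v, const TypeList& types,
                const std::string& typeName) {
    typedef std::deque<Base*>::iterator Iter;
    Base probe(typeName);
    std::pair<Iter, Iter> r = std::equal_range(
        v.begin(), v.end(), static_cast<Base*>(&probe), LessTypeNames());
    std::sort(r.first, r.second, LessInType(types));
}

// Hypothetical application type with an integer key and its predicate.
class Item : public Base {
public:
    explicit Item(int k) : Base("Item"), key(k) {}
    int key;
};
bool LessItem(const Base* x, const Base* y) {
    return static_cast<const Item*>(x)->key < static_cast<const Item*>(y)->key;
}
```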

6 SUMMARY

The described object-oriented model and design pattern can serve as a basis for developing persistent research models, object-oriented databases, and knowledge-based systems using standard C++. The model can keep objects of various types, including those involved in inheritance hierarchies. Objects may form a complex network of association relationships provided by base-class pointers; this network remains persistent as well. Classes involved in the model must satisfy the following requirements: they must be derived from a common base class that has facilities for data serialization and management; contain a string data member as a type name identifier and pass it to the base class; define overridden virtual functions for writing data to and reading data from secondary storage; and be accompanied by global functions for object creation and for use as a sorting predicate. The corresponding function pointers, together with the list of type names, are used in the model’s object factories when retrieving and sorting data. Using std::deque instead of std::vector for the base class arrays is a reliable solution. The design pattern provides only restricted data management facilities specific to the model in question, namely two-stage internal sorting, “by types” and then “within a type”. Minimal exception handling necessary for correct program termination is included. The model allows repeated writing and reading at run time. The size of the model is naturally restricted by the limits of virtual memory. However, it seems possible to use this approach as a basis for developing models of much greater size.


7 ACKNOWLEDGEMENTS

The author wishes to thank Mr. Mogens Hansen for very valuable comments and suggestions, which greatly helped the author to improve the original idea.

REFERENCES

[Stroustrup1997] Bjarne Stroustrup: The C++ Programming Language, 3rd edition, Addison-Wesley, 1997

[Piattini2000] Mario G. Piattini, Oscar Díaz (editors): Advanced Database Technology and Design, Artech House, 2000

[Alexandrescu2001] Andrei Alexandrescu: Modern C++ Design: Generic Programming and Design Patterns Applied, Addison-Wesley, 2001

About the author



Alexander Kozynchenko is a lecturer in computer science at Mid Sweden University. He obtained an honours diploma in electrical and mechanical engineering in 1974, the degree of Candidate of Science in aircraft control systems in 1982, and the rank of senior researcher in 1987. E-Mail: alexander.kozynchenko@miun.se

Cite this document as follows: Alexander Kozynchenko: "Constructing persistent object-oriented models with standard C++", in Journal of Object Technology, vol. 5, no. 1, January–February 2006, pages 69-81, http://www.jot.fm/issues/issues 2006 1/article2

