Obfuscation and .NET

Dr. Richard Wiener, Editor-in-Chief, JOT, Associate Professor of Computer
Science, University of Colorado at Colorado Springs

COLUMN

Assemblies generated under .NET may be decompiled into easily recognized source code that is either identical or similar to the original source code. Individuals or companies deploying .NET generated assemblies that are targeted for client machines may unwittingly be distributing their source code – in most cases an unintended consequence. Since source code is generally considered a valuable intellectual asset, measures should be taken to prevent decompilation into easily recognized source code.

Obfuscator software aims to make decompilation into easily recognized source code very difficult. This article examines the issue of obfuscation under .NET and reviews two obfuscator products. The test suite used to evaluate the two products consists of applications taken from the forthcoming book by Richard Wiener entitled Modern Software Development Using C#/.NET (to be published by Course Technology in late 2005).

The Nature of .NET Assemblies

Assemblies in .NET consist of two major components: metadata and intermediate language code. The metadata describes the assembly's classes and their method signatures, fields, properties and events. Together with the reflection capabilities of .NET languages, it makes it possible to reverse engineer and retrieve this intellectual property. The intermediate language code provides the details of each method implementation. This enables well-constructed decompilers to reveal the details of proprietary algorithms that you would typically not want users to have access to.
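As a small illustration of how much the metadata alone gives away, the following sketch uses standard reflection calls to list the methods a type declares. The class name MetadataDump and its helper are hypothetical names chosen for this example; a real decompiler goes much further by also reading the intermediate language of each method body.

```csharp
using System;
using System.Reflection;

public static class MetadataDump
{
    // Returns one signature string per method declared directly on the type,
    // as produced by MethodInfo.ToString().
    public static string[] DescribeMethods(Type type)
    {
        MethodInfo[] methods = type.GetMethods(
            BindingFlags.Public | BindingFlags.NonPublic |
            BindingFlags.Static | BindingFlags.Instance |
            BindingFlags.DeclaredOnly);

        string[] lines = new string[methods.Length];
        for (int i = 0; i < methods.Length; i++)
        {
            lines[i] = methods[i].ToString();
        }
        return lines;
    }

    public static void Main()
    {
        // Walk every type in the running assembly and print its name and methods.
        foreach (Type type in Assembly.GetExecutingAssembly().GetTypes())
        {
            Console.WriteLine(type.FullName);
            foreach (string line in DescribeMethods(type))
            {
                Console.WriteLine("    " + line);
            }
        }
    }
}
```

Nothing in this sketch needs the original source code; the names it prints come straight from the assembly's metadata.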

To illustrate this, consider the following class, in Listing 1, designed to perform generic sorting using the simple and relatively inefficient selection-sort algorithm.

Listing 2 contains the decompiled source code obtained using the respected and widely used Lutz Roeder’s C# .NET Decompiler (http://www.aisto.com/roeder/dotnet/). The decompiled language chosen is C#, the same as the originating language.

Listing 3 contains the decompiled source code in Visual Basic .NET obtained from the assembly produced by the original C# source code.

Although not perfect, the Roeder .NET Decompiler serves as a simple language translator since it effectively converts C# source code to Visual Basic source code (something that the author of this article would not dare attempt because of his lack of familiarity with Visual Basic).

Listing 1 – Original C# source code for generic sorting

Listing 2 – Decompiled source using Lutz Roeder’s Reflector Decompiler

Listing 3 – Decompilation into Visual Basic .NET

Although the decompiled C# source code is not identical to the original it is close enough to reveal all its essential details. The namespace, class names and method names and their signatures were decompiled perfectly. These are the features that define the architecture of the software system. The details of method SelectionSort reveal the algorithm used.
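For readers without the listings at hand, a generic selection sort of the kind described might look like the sketch below. It is not the author's Listing 1: the namespace SortingSketch, the class name Sorter and the use of a C# generic method constrained to IComparable<T> are all assumptions made for illustration.

```csharp
using System;

namespace SortingSketch
{
    public static class Sorter
    {
        // Sorts the array in place, in ascending order, using selection sort.
        // Elements are assumed to be non-null.
        public static void SelectionSort<T>(T[] data) where T : IComparable<T>
        {
            for (int i = 0; i < data.Length - 1; i++)
            {
                // Find the smallest element in the unsorted tail data[i..].
                int smallest = i;
                for (int j = i + 1; j < data.Length; j++)
                {
                    if (data[j].CompareTo(data[smallest]) < 0)
                    {
                        smallest = j;
                    }
                }

                // Swap it into position i.
                T temp = data[i];
                data[i] = data[smallest];
                data[smallest] = temp;
            }
        }

        public static void Main()
        {
            string[] names = { "pear", "apple", "fig" };
            SelectionSort(names);
            Console.WriteLine(string.Join(", ", names));
        }
    }
}
```

Every identifier in such a class (namespace, class, method and parameter names) is stored in the assembly, which is why a decompiler can reproduce them exactly.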

For users who prefer Delphi Pascal, the output of the Roeder decompiler in Listing 4 provides the details of the generic SelectionSort procedure in Delphi Pascal.

Listing 4 – Decompiled SelectionSort in Delphi Pascal

Moderate-size applications containing dozens of classes and thousands of lines of source code have been decompiled with equal success using the Roeder .NET decompiler.

As a teacher of computer science, I often post my solutions (as .NET assemblies) on my university website for major projects that are assigned to my students. These solutions provide an additional specification to the students about how the system they are to design and implement should function. A major part of these projects is defining the architecture of the solution (class names, their features and interrelationships). Because students can decompile a posted assembly and reverse engineer not only the architecture of my solution but even its fine details, I must run each assembly through a competent obfuscator before posting it to my website. This obfuscator should make the architecture and details of my solution very difficult to decipher.

Clearly the same need exists for the commercial deployment of .NET assemblies.

The Nature of Obfuscation

An ideal obfuscator mangles the large features (namespaces, class names, method signatures and fields) and small features (method details and in particular string values defined as fields and within a method) of your assembly without changing its functionality. The more aggressive the mangling, the more likely that the obfuscated assembly will not run the same way as the original assembly. It is essential that an obfuscator keep the functionality of the software totally intact while making the original source code unrecognizable if the obfuscated assembly is decompiled.

There are several well known problem areas that a well designed obfuscator must allow the user to deal with. When runtime type identification is used in an application to determine whether an object is of a particular type, the class name of the type being sought is used. If the obfuscator has mangled this class name, the dynamic type identification will fail in the obfuscated assembly. The user of the obfuscator must be given the option of selecting features that are not to be mangled. These include class, field and method names. The same issue exists when dynamic class loading is performed within the assembly or serialization or remoting is used.
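The runtime type identification problem can be sketched as follows. The namespace Demo and the class Widget are hypothetical. The name-based lookup depends on a string literal that a renaming obfuscator does not update, while the check compiled against the type itself would be expected to survive renaming, since it refers to the type through metadata rather than through a string.

```csharp
using System;

namespace Demo
{
    public class Widget { }

    public static class TypeLookup
    {
        // Name-based check: works only while the type is still called "Demo.Widget".
        public static bool IsWidgetByName(object candidate)
        {
            Type expected = Type.GetType("Demo.Widget");
            return expected != null && expected.IsInstanceOfType(candidate);
        }

        // Type-based check: no string literal is involved.
        public static bool IsWidgetByType(object candidate)
        {
            return candidate is Widget;
        }

        public static void Main()
        {
            object w = new Widget();
            Console.WriteLine(IsWidgetByName(w));
            Console.WriteLine(IsWidgetByType(w));

            // Looking up a name that no longer exists yields null; this is what
            // the name-based check would see if the class had been renamed.
            Console.WriteLine(Type.GetType("Demo.RenamedAway") == null);
        }
    }
}
```

This is why the class named in such a lookup must be excluded from renaming, or the lookup rewritten to avoid the string.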

One of the side effects of obfuscation is that obfuscated code is difficult to debug. An exception that is generated and reported by a user will typically include mangled method and class names, making it almost impossible to identify the root cause. A clearly labeled map file produced by the obfuscation tool, recording which original name corresponds to each mangled name, is essential for interpreting debugger output from the obfuscated assembly.
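A minimal sketch of how a map file might be used to decode a reported stack frame is given below. The in-memory dictionary stands in for a parsed map file, and the frame format ("at name(parameters)") is an assumption; neither reflects the actual file format of any particular product.

```csharp
using System;
using System.Collections.Generic;

public static class StackTraceDecoder
{
    // Replaces the mangled method name in a frame such as "   at a.b(Int32[] A_0)"
    // with its original name, if the map contains it. Other frames are returned unchanged.
    public static string DecodeFrame(string frame, IDictionary<string, string> map)
    {
        const string prefix = "at ";
        int start = frame.IndexOf(prefix, StringComparison.Ordinal);
        int paren = frame.IndexOf('(');
        if (start < 0 || paren < start)
        {
            return frame;
        }

        start += prefix.Length;
        string mangled = frame.Substring(start, paren - start);

        string original;
        if (!map.TryGetValue(mangled, out original))
        {
            return frame;
        }
        return frame.Substring(0, start) + original + frame.Substring(paren);
    }

    public static void Main()
    {
        // Hypothetical mapping from mangled names back to original names.
        Dictionary<string, string> map = new Dictionary<string, string>();
        map["a.b"] = "Sorting.Sorter.SelectionSort";

        Console.WriteLine(DecodeFrame("   at a.b(Int32[] A_0)", map));
    }
}
```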

Hackers often search deployed assemblies for strings that contain keywords such as “Password” or “Enter password”. By locating such strings, hackers attempt to circumvent the password protection embedded in the product that they are hacking. Some obfuscators provide the option of string encryption. Although this may introduce a small performance penalty because of the need to decrypt strings at runtime, the overhead associated with such string encryption and decryption is often negligible.
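The idea behind string encryption can be sketched in a few lines. The single-byte XOR key and the names StringHiding, Encode and Decode are assumptions chosen only for illustration; this is not the scheme used by either product reviewed here, and it should not be mistaken for real cryptographic protection.

```csharp
using System;
using System.Text;

public static class StringHiding
{
    // Hypothetical key, chosen only for illustration.
    private const byte Key = 0x5A;

    // What a build-time tool would do: replace a string literal with opaque bytes.
    public static byte[] Encode(string plain)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(plain);
        for (int i = 0; i < bytes.Length; i++)
        {
            bytes[i] ^= Key;
        }
        return bytes;
    }

    // What the obfuscated program does at runtime, just before the string is needed.
    public static string Decode(byte[] hidden)
    {
        byte[] bytes = (byte[])hidden.Clone();
        for (int i = 0; i < bytes.Length; i++)
        {
            bytes[i] ^= Key;
        }
        return Encoding.UTF8.GetString(bytes);
    }

    public static void Main()
    {
        byte[] hidden = Encode("Enter password");
        Console.WriteLine(BitConverter.ToString(hidden)); // what a search of the assembly would find
        Console.WriteLine(Decode(hidden));                // what the user sees at runtime
    }
}
```

The runtime cost is one pass over the bytes each time a string is decoded, which is consistent with the overhead usually being small.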

Some obfuscator products include the ability to statically analyze the application and determine the parts that are not being used. This includes unused types, unused methods, and unused fields. This could be of great benefit if memory footprint is a concern.

Some obfuscators provide control flow obfuscation. Control flow obfuscation typically introduces false conditional statements and other misleading constructs in order to confuse decompilers. Some obfuscators destroy the code patterns that decompilers use to recreate source code. The trick is to confuse the decompiler without changing the functionality of the obfuscated assembly.
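A source-level sketch of a false conditional follows. Real obfuscators work on the intermediate language rather than on C# source, so this only illustrates the principle; the class and method names are hypothetical. The condition tested in SumObscured is always true, because the product of two consecutive integers is always even, so the else branch never runs and both methods return the same result.

```csharp
using System;

public static class ControlFlowSketch
{
    // Straightforward version.
    public static int SumPlain(int[] values)
    {
        int total = 0;
        for (int i = 0; i < values.Length; i++)
        {
            total += values[i];
        }
        return total;
    }

    // Same behavior, with a misleading branch added.
    public static int SumObscured(int[] values)
    {
        int total = 0;
        int i = 0;
        while (i < values.Length)
        {
            // i * (i + 1) is even for every index i, so this test always succeeds.
            if ((i * (i + 1)) % 2 == 0)
            {
                total += values[i];
            }
            else
            {
                total -= values[i] * 31; // never executed; present only to mislead
            }
            i++;
        }
        return total;
    }

    public static void Main()
    {
        int[] data = { 4, 8, 15, 16, 23, 42 };
        Console.WriteLine(SumPlain(data) == SumObscured(data));
    }
}
```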

Incremental obfuscation allows the developer to make changes to the original sources after releasing an obfuscated assembly and then provide a patch to the user that reflects the changes to the original application while preserving the name-mapping used in the original release. In order to accomplish this, a map file must be saved and later used to ensure that the renaming is preserved when making changes and re-releasing the obfuscated assembly. Some obfuscator products support this useful capability.
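The role of the saved map can be sketched as follows. The renamer below is a hypothetical, much simplified stand-in for a real tool: it gives every identifier a distinct short name and ignores overloading. Identifiers found in the previous release's map keep their old mangled names, and only new identifiers receive fresh ones.

```csharp
using System;
using System.Collections.Generic;

public static class IncrementalRenamer
{
    // Builds an original-to-mangled map for this release, reusing previousMap where possible.
    public static Dictionary<string, string> Rename(
        IEnumerable<string> identifiers, IDictionary<string, string> previousMap)
    {
        Dictionary<string, string> result = new Dictionary<string, string>();
        HashSet<string> used = new HashSet<string>(previousMap.Values);
        int next = 0;

        foreach (string id in identifiers)
        {
            string mangled;
            if (!previousMap.TryGetValue(id, out mangled))
            {
                // New identifier: take the first short name not already in use.
                do
                {
                    mangled = NameFor(next++);
                } while (used.Contains(mangled));
                used.Add(mangled);
            }
            result[id] = mangled;
        }
        return result;
    }

    // 0 -> "a", 1 -> "b", ..., 25 -> "z", 26 -> "aa", 27 -> "ab", ...
    private static string NameFor(int index)
    {
        string name = "";
        do
        {
            name = (char)('a' + index % 26) + name;
            index = index / 26 - 1;
        } while (index >= 0);
        return name;
    }

    public static void Main()
    {
        Dictionary<string, string> release1 = Rename(
            new string[] { "SelectionSort", "Swap" }, new Dictionary<string, string>());
        Dictionary<string, string> release2 = Rename(
            new string[] { "Swap", "FindMin", "SelectionSort" }, release1);

        foreach (KeyValuePair<string, string> pair in release2)
        {
            Console.WriteLine(pair.Key + " -> " + pair.Value);
        }
    }
}
```

Without the saved map, the second run could assign different mangled names to existing identifiers, and a patch built from it could no longer be relied on to match the assemblies already deployed.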

Finally, some obfuscators enable the user to embed watermarks such as user names and registration codes into the internal binary structures within the assembly. Watermarking can assist in tracking distribution of the product on a per-executable basis if it is illicitly distributed or obtained.

Several well known obfuscator products were sought and obtained for this review. Only two survived the rigorous tests that each obfuscator was subjected to. The obfuscators that failed generally produced assemblies that would not run or did not provide sufficient features to be of use for deploying commercial obfuscated assemblies.

Only .NET 2005 (beta 1) assemblies were used in all the tests.

The two products that survived these tests and will be the subject of this review are:

Dotfuscator Professional Edition, Version 3 (RC)
http://www.preemptive.com/
PreEmptive Solutions
26250 Euclid Avenue
Suite 503
Cleveland, Ohio 44132
Voice: 216.732.5895
General Email: information@preemptive.com

Salamander, RemoteSoft .NET Explorer
http://www.remotesoft.com/
Dr. Huihong Luo
Tel: 1-510-579-2752
Sales information: sales@remotesoft.com

Dotfuscator – Professional Edition, Version 3 (RC)

This is an outstanding product in every respect. It is obvious why Microsoft has chosen to bundle this product (a light version) with .NET 2003 and more recently .NET 2005 (beta). Assemblies can be obfuscated within Visual Studio using the Tools menu or using the stand-alone GUI version. Both provide equivalent functionality. I used the stand-alone GUI version for most of my testing.

An on-line user’s guide provides detailed information about the use of the product. It is well written and useful. The only item that I found missing was information about how to detect the watermark that the user may optionally embed in the obfuscated assembly. An e-mail to PreEmptive Solutions got me an answer within 10 minutes.

Dotfuscator is a full-featured obfuscator product. It supports all of the facilities described in the previous section. Of particular interest is the “overload induction engine”. Working through consecutive letters of the alphabet, Dotfuscator reuses the same short name for as many identifiers as the overloading rules legally allow when transforming identifier names from the original to the obfuscated assembly. Clearly this obfuscated assembly deserves a grade of “A”!
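The effect of heavy overloading can be imitated by hand. The sketch below is not Dotfuscator output; it is an illustration written for this purpose, with hypothetical original names given in the comments. Three unrelated methods all become a, which is legal because their parameter lists differ.

```csharp
using System;

public static class b
{
    // Originally something like Sort(int[] data).
    public static void a(int[] A_0)
    {
        for (int i = 0; i < A_0.Length - 1; i++)
        {
            int m = i;
            for (int j = i + 1; j < A_0.Length; j++)
            {
                if (A_0[j] < A_0[m])
                {
                    m = j;
                }
            }
            a(A_0, i, m);
        }
    }

    // Originally something like Swap(int[] data, int first, int second).
    public static void a(int[] A_0, int A_1, int A_2)
    {
        int t = A_0[A_1];
        A_0[A_1] = A_0[A_2];
        A_0[A_2] = t;
    }

    // Originally something like Join(int[] data, string separator).
    public static string a(int[] A_0, string A_1)
    {
        return string.Join(A_1, Array.ConvertAll(A_0, x => x.ToString()));
    }

    public static void Main()
    {
        int[] values = { 3, 1, 2 };
        a(values);
        Console.WriteLine(a(values, ","));
    }
}
```

A reader of the decompiled code must work out from the argument types alone which a is being called at each call site.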

Consider its work on obfuscating the assembly produced by Listing 1 and observe the “a’s”. A portion of the decompiled obfuscated assembly is given in Listing 5.

Listing 5 – Decompiled Dotfuscated Assembly from Listing 1 (Generic Sorting)

Of particular benefit is the Dotfuscator option to encrypt string values within an assembly.

Consider the simple C# application given in Listing 6 that prompts the user for a password. The password itself is stored as a field of class PassWordApplication (not a good practice but done here to dramatize the value of string encryption).

Listing 6 – C# Application that Prompts for Password

Listing 7 presents the decompiled source listing, again using Reflector, after the source in Listing 6 is obfuscated without string encryption.

Listing 7 – Decompiled Version of Listing 6

Clearly the decompiled source code reveals not only the section of code that requests a password but, in this case, the password itself. Neither the password protection nor the obfuscation is of much value in this case.

As mentioned before, Dotfuscator includes an option for encrypting strings within the assembly. After exercising this option, the new decompiled source code is shown in Listing 8.

Listing 8 – Decompiled Version of Listing 6 Using String Encryption

Clearly the string-encrypted version is much better protected. It is not at all obvious that the decompiled code relates in any way to password solicitation. The string encryption feature of Dotfuscator is invaluable.

Preserving features prior to obfuscation is of fundamental importance. As indicated earlier, this is essential before obfuscating assemblies that perform dynamic class loading or utilize runtime type identification. There are other motivations for preserving features, such as the use of serialization and remoting. Dotfuscator provides an outstanding graphical user interface that allows the user to select for preservation a class name or individual fields or methods within a class. Each of these features is presented in a tree that shows the fine-grained features of the class (class name, method names and field names). Associated with each feature is a checkbox that, when selected, prevents that particular feature’s name from being mangled. After extensive testing I am pleased to report that this important feature works exactly as advertised.

As a useful security measure, Dotfuscator allows the user to embed watermarking within the obfuscated assembly. Using a command-line utility, Premark, the watermark, if present, can be retrieved from the obfuscated assembly.

Dotfuscator produces map files in XML format. A sample map file, in this case associated with the generic sorting application presented in Listing 1, is shown in Listing 9.

Listing 9 – Map file associated with Obfuscation of Listing 1

After spending several days testing and using PreEmptive Solutions Dotfuscator (http://www.preemptive.com/), I can report that this product is an outstanding model of product design and implementation. Its user interface is simple and clean, providing a high degree of usability. Although I did not choose to do so, many users will appreciate its integration into Visual Studio. Its most important features are its overload induction engine (heavy overloading of single-character identifiers), its string encryption option, its fine-grained control of the features that are to be preserved, its ability to provide the customer with incremental patches, its simple and useful map file and its ability to embed watermarks. Together these make this full-featured product first in its class.

Salamander RemoteSoft .NET Explorer, Version 2.0

Dr. Huihong Luo, the architect of the Salamander .NET Explorer, has been extremely helpful and responsive to questions sent during this review. I would expect customers to enjoy a high degree of support and satisfaction if they were to run into any problems while using the .NET Explorer.

The .NET Explorer product is both a capable decompiler and obfuscator. The focus in this review is on its obfuscation features.

Listing 10 shows the decompiled source code after obfuscating Listing 1 (the generic sort) using the Salamander .NET Explorer obfuscator (henceforth referred to as the Salamander obfuscator).

Listing 10 – Decompiled Source Code After Using the Salamander Obfuscator

Here, C# names were chosen as the base-set for name mangling (the identifiers in the original C# source file converted to standard C# names). You may wish to compare Listing 10 with Listing 5.

The Salamander obfuscator does not support string encryption. Listing 11 shows the source code obtained by obfuscating and then decompiling Listing 6.

Listing 11 – Decompiled Version of Listing 6 Using Salamander Obfuscator

This example once again highlights the importance of being able to achieve string encryption.

Salamander does not provide the same degree of control as Dotfuscator in determining which features are not to be transformed (mangled) in building the obfuscated assembly. Class names or method names may be chosen for preservation, or all public members and all fields may be selected. For many purposes this is adequate. Identifier preservation is achieved using a context menu that appears when right-clicking the feature that you wish to preserve.

The on-line documentation, although adequate, is less detailed than the documentation provided with Dotfuscator. In part, this is because Salamander contains fewer features.

The log file (Salamander’s equivalent of Dotfuscator’s map file) produced in connection with the generic sort is shown in Listing 12. It provides important and useful information to the user.

Listing 12 – Salamander Log File after Obfuscating Listing 1

Salamander does not currently support the embedding of watermarks in its obfuscated assemblies.

One of Salamander’s major features, Protection, could not be evaluated since it does not currently work with .NET 2005 assemblies. Dr. Luo has indicated that this important capability will be working under .NET 2005 in several months.

The following claim is made on RemoteSoft’s website (http://www.remotesoft.com/salamander/protector.html):

“Our protector is not an obfuscator, rather it converts the decompilable Microsoft Intermediate Language code (MSIL or CIL) of your assemblies into native format while keeping all .NET metadata intact, and thus it provides the same level of protection as native C/C++ code. Further more, it offers code, string and resource encryption, and therefore, it provides even better protection than native C/C++ code. ”

I look forward to testing and later reporting more about this capability using the suite of .NET 2005 assemblies that were used in this review.

A table that summarizes the major features of Dotfuscator and Salamander .NET Explorer is shown below.

Major Features of Dotfuscator and Salamander .NET Explorer Obfuscators

Feature | Map file | Incremental obfuscation | Preservation of user-selected features | String encryption | Watermarks | Integration with Visual Studio

Dotfuscator x x x x x x

Salamander x x x

About the author

Richard Wiener is Associate Professor of Computer Science at the University of Colorado at Colorado Springs. He is also the Editor-in-Chief of JOT and former Editor-in-Chief of the Journal of Object Oriented Programming. In addition to University work, Dr. Wiener has authored or co-authored 21 books and works actively as a consultant and software contractor whenever the possibility arises. His latest book, to be published by Course Technology in late 2005, is entitled Modern Software Development Using C#/.NET.

Cite this column as follows: Richard Wiener: “Obfuscation and .NET”, in Journal of Object Technology, vol. 4, no. 4, May-June 2005, pp. 73-92, http://www.jot.fm/issues/issue_2005_05/column6
