niceideas.ch
Technological Thoughts by Jerome Kehrli

Bytecode manipulation with Javassist for fun and profit part I: Implementing a lightweight IoC container in 300 lines of code

by Jerome Kehrli


Posted on Monday Feb 13, 2017 at 09:30PM in Java


Java bytecode is the form of instructions that the JVM executes.
A Java programmer, normally, does not need to be aware of how Java bytecode works.

Understanding the bytecode, however, is essential to the areas of tooling and program analysis, where applications modify the bytecode to adjust behavior to their own domain. Profilers, mocking tools, AOP and ORM frameworks, IoC containers, boilerplate code generators, etc. all need to understand Java bytecode thoroughly and to come up with means of manipulating it at runtime.
Every one of these advanced features, which have become standard approaches in Java programming, requires a sound understanding of Java bytecode, not to mention the completely new languages running on the JVM, such as Scala or Clojure.

Bytecode manipulation is not easy though ... except with Javassist.
Of all the libraries and tools providing advanced bytecode manipulation features, Javassist is the easiest to use and the quickest to master. It takes an experienced Java developer only a few minutes to understand Javassist and use it efficiently. And mastering bytecode manipulation opens up a whole new world of approaches and possibilities.

The goal of this article is to present Javassist in the light of a concrete use case: the implementation in a little more than 300 lines of code of a lightweight, simple but cute IoC Container: SCIF - Simple and Cute IoC Framework.

A new version of comet-tennis demo app with the SCIF framework integrated is available here.

Part of this article is available as a slideshare presentation here: https://www.slideshare.net/JrmeKehrli/bytecode-manipulation-with-javassist-for-fun-and-profit.

You might also want to have a look at the second article in this series, available here: Bytecode manipulation with Javassist for fun and profit part II: Generating toString and getter/setters using bytecode manipulation.

1. Introduction / Purpose

Bytecode manipulation consists in modifying, at runtime, the classes - represented by bytecode - produced by the Java compiler. It is used extensively, for instance, by frameworks such as Spring (IoC) and Hibernate (ORM) to inject dynamic behaviour into Java objects at runtime.
But first, let's look at a very condensed view of the Java toolchain to recall a few concepts:


Java source files are compiled to Java class files by the Java Compiler. These Java classes take the form of bytecode. This bytecode is loaded by the JVM to execute the Java program.
In principle, the bytecode is read-only and cannot be changed once loaded. That is true, but:

  • The bytecode of a class can be modified before it is loaded by the classloader, through the usage of an agent.
  • The bytecode of a class can be modified at runtime without an agent, as long as the class has not yet been loaded by a classloader.
  • Classes can be generated entirely dynamically at runtime using bytecode manipulation techniques.
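To make the first mechanism more concrete, here is a minimal sketch of a Java agent relying only on the standard java.lang.instrument API. The class names LoggingAgent and LoggingTransformer are made up for this illustration, and the transformer merely records the class names it sees instead of rewriting anything:

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical agent class, for illustration only
class LoggingAgent {

    // Names of the classes seen by the transformer
    static final List<String> SEEN = new CopyOnWriteArrayList<>();

    // Called by the JVM for each class before that class is defined
    static class LoggingTransformer implements ClassFileTransformer {
        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            SEEN.add(className); // internal form, e.g. "com/example/Foo"
            // A real agent would return the modified bytecode here.
            // Returning null tells the JVM to keep the class unchanged.
            return null;
        }
    }

    // Entry point invoked by the JVM before the application's main() method
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new LoggingTransformer());
    }
}
```

To be picked up by the JVM, such a class has to be packaged in a jar whose manifest declares it with the Premain-Class attribute, and the jar has to be passed on the command line with the -javaagent option.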

In this article we'll dig into Javassist, a bytecode manipulation library that helps with all of the above mechanisms.
But before that, let's describe three distinct but complementary techniques: Introspection, Reflection and Bytecode Manipulation.

1.1 Runtime Reflection

Runtime reflection is the ability of a computer program to examine, introspect, and modify its own structure and behavior at runtime.

Reflection is commonly used by programs which require the ability to examine or modify the runtime behavior of applications running in the Java virtual machine.
Reflection is a powerful technique and can enable applications to perform operations which would otherwise be impossible.

The ability to examine and manipulate a Java class from within itself may not sound like very much, but in many other programming languages this feature simply doesn't exist. For example, there is no standard way in a Pascal, C, or C++ program to obtain detailed information about the functions defined within that program.

In Java there is no dedicated introspection API. Introspection is performed using the Java Reflection API as well.

Still, conceptually, introspection and reflection are different things:


Type introspection is the ability of a program to examine the type or properties of an object at runtime.
This is a short example of Type Introspection in Java, where we discover the fields of an object and show their values dynamically. Again, Introspection in Java is really done using the Reflection API.

import java.lang.reflect.Field;

public class TestIntrospection {

    public static class TestData {
        private int i = 0;
        private String myString = "abc";
        private long value = -1;
    }

    // test Introspection
    public static void main (String[] args) {
        try {
            // Using Introspection, we really don't care about the actual type
            Object td = new TestData();

            // List fields of TestData and get their values
            for (Field field : td.getClass().getDeclaredFields()) {
                field.setAccessible(true); // just make private fields accessible

                System.out.println (field.getName() + "=" + field.get(td));
            }
        } catch (IllegalAccessException e) {
            e.printStackTrace();
        }
    }
}

Runtime Reflection is a native feature in the Java programming language. It allows an executing Java program to examine or "introspect" upon itself, and manipulate internal properties of the program.
A short example could be as follows, where we change the value of a field using reflection:

import java.lang.reflect.Field;

public class TestReflection {

    public static class TestData {
        private int i = 0;
        private String myString = "abc";
        private long value = -1;
    }

    // test Reflection
    public static void main (String[] args) {
        try {
            // Using Reflection, we really don't care about the actual type
            Object td = new TestData();

            // Change the value of the field myString
            Field myStringField =  td.getClass().getDeclaredField("myString");
            myStringField.setAccessible(true); // just make private fields accessible
            myStringField.set(td, "xyz");

            System.out.println (myStringField.getName() + "=" + myStringField.get(td));

        } catch (NoSuchFieldException | IllegalAccessException e) {
            e.printStackTrace();
        }
    }
}

Runtime Reflection is a very powerful feature of the JVM.
I wrote a previous article on this very blog showing how to dynamically add values to an Enum Type in Java using Runtime Reflection.

Why are those important in the scope of bytecode manipulation and Javassist?

Runtime Reflection is important in our context for two reasons:

  1. First, because Javassist attempts to keep an API as close as possible to the Java Runtime Reflection API as a way to appear as natural as possible to Java developers.
  2. Second, and this is maybe more important, because behaviour injected in Java Classes using bytecode manipulation is not known by the compiler. Thus, it is sometimes only available through runtime reflection.

1.2 Bytecode manipulation

Bytecode manipulation allows the developer to express instructions in a format that is directly understood by the Java Virtual Machine, without going from source code to bytecode through a compiler.
Bytecode is somewhat similar to the assembly code directly interpretable by a CPU. But Java bytecode is, first, interpreted by a virtual machine, the JVM, and second, much easier to understand than assembly code.
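To give an idea of how readable bytecode can be, here is a tiny method along with the four instructions javac produces for its body, shown as comments. The class name BytecodeSample is made up for this illustration; the listing can be reproduced by compiling the class and running javap -c on it:

```java
class BytecodeSample {

    // Bytecode generated for the body of this static method:
    //   iload_0   // push the first int argument (a) on the operand stack
    //   iload_1   // push the second int argument (b) on the operand stack
    //   iadd      // pop both values and push their sum
    //   ireturn   // return the int on top of the stack
    static int add(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3)); // prints 5
    }
}
```

The JVM is a stack machine: instructions push operands on a stack and pop them to compute results, which is a large part of why short methods translate into such compact listings.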

One might wonder why one would want to get interested in bytecode manipulation and generation. As a matter of fact, every java developer has likely already been using bytecode manipulation all over the place without knowing it.
Since the JVM can modify bytecode and use new bytecode while it is running, this generates a whole new universe of languages and tools that by far surpasses the initial intent of the Java language.


Bytecode manipulation use cases

Some examples are:

  • ORM frameworks such as Hibernate use bytecode manipulation to inject, for instance, relationship management code (lazy loading, etc.) inside mapped entities.
  • FindBugs inspects bytecode for static code analysis
  • Languages like Groovy, Scala, Clojure generate bytecode from different source code.
  • IoC frameworks such as Spring use it to seamlessly weave your application lifecycle together
  • language extensions like AspectJ can augment the capabilities of Java by modifying the classes that the Java compiler generated
  • etc.

The Java platform provides you with many ways to work with bytecode, for instance:

  • One can write their own compiler for any kind of new and crazy language
  • One can generate on the fly sub-classes of already loaded classes and use them instead of original classes to get additional behaviour
  • One can write an instrumentation agent that plugs right into the JVM and modifies behaviour of classes before they are loaded by the classloader
  • etc.
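As a small taste of runtime class generation that needs no third-party library at all, here is a sketch using the JDK's own dynamic proxies (java.lang.reflect.Proxy). This is not Javassist, and it generates a class implementing an interface rather than a true sub-class, but the idea of adding behaviour through a class created at runtime is the same. The names ProxySketch, Greeter and SimpleGreeter are made up for this illustration:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

class ProxySketch {

    // Hypothetical service interface, for illustration only
    interface Greeter {
        String greet(String name);
    }

    static class SimpleGreeter implements Greeter {
        public String greet(String name) {
            return "Hello " + name;
        }
    }

    // Wraps a Greeter in an instance of a class generated by the JVM at runtime
    static Greeter decorate(Greeter target) {
        InvocationHandler handler = (proxy, method, args) -> {
            Object result = method.invoke(target, args);
            // Only decorate the methods declared by Greeter, and leave
            // toString(), hashCode() and equals() untouched
            if (method.getDeclaringClass() == Greeter.class) {
                return result + "!";
            }
            return result;
        };
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                handler);
    }

    public static void main(String[] args) {
        Greeter greeter = decorate(new SimpleGreeter());
        System.out.println(greeter.greet("World"));                 // prints Hello World!
        System.out.println(Proxy.isProxyClass(greeter.getClass())); // prints true
    }
}
```

Dynamic proxies only work against interfaces. Sub-classing a concrete class at runtime is precisely where a bytecode manipulation library becomes necessary.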

With so many options available, one of these will certainly fit any experiment that one wants to play around with. With bytecode manipulation, one really gets the whole power of the JVM for free and the capacity to slot in any idea exactly where it's needed while reusing the rest of the Java platform.

From my perspective, this is what excites me the most: as a developer, I can really focus on my crazy idea that is not supported by the Java language, and I don't have to write an entire platform to make it come to life.
This has certainly been one of the key reasons why the Java community has constantly been experimenting with new ways to push the programming toolset further.

This article won't present the details of Java bytecode any further. We'll focus instead on high-level libraries aimed at manipulating Java bytecode.
Should you be interested in the low-level details, I can only recommend that you read this excellent paper from ZeroTurnaround, the guys behind JRebel.

Most common bytecode manipulation libraries

As a matter of fact, while Runtime Reflection is supported natively by the JVM, bytecode manipulation, on the other hand, is fairly difficult to achieve without the usage of a specific library.

The most common bytecode manipulation libraries in Java are as follows:

  • ASM is a project of the OW2 Consortium. It provides a simple API for decomposing, modifying, and recomposing binary Java classes. ASM exposes the internal aggregate components of a given Java class through its visitor-oriented API. ASM also provides, on top of this visitor API, a tree API that represents classes as object constructs. Both APIs can be used for modifying the binary bytecode, as well as generating new bytecode.
  • BCEL provides a simple library that exposes the internal aggregate components of a given Java class through its API as object constructs (as opposed to the disassembly of the lower-level opcodes). These objects also expose operations for modifying the binary bytecode, as well as generating new bytecode (via injection of new code into the existing code, or through generation of new classes altogether).
  • CGLIB is a powerful, high-performance Code Generation Library. It is used to extend Java classes and implement interfaces at runtime. CGLIB is really oriented towards implementing new classes at runtime, as opposed to modifying existing bytecode as other libraries do.
  • Javassist is a Java library providing a means to manipulate the Java bytecode of an application. In this sense Javassist provides the support for structural reflection, i.e. the ability to change the implementation of a class at run time.

Javassist is much easier to use than lower-level libraries such as BCEL or ASM. It is also less limited and more powerful than CGLIB.

This chart shows in addition the AspectJ framework, as a way to have the user get an understanding of the level of abstraction provided by these tools:


Now, as you might have guessed from its title, the rest of this article will focus on Javassist.

2. Javassist

From the Javassist web site:

"Javassist (Java Programming Assistant) makes Java bytecode manipulation simple. It is a class library for editing bytecode in Java; it enables Java programs to define a new class at runtime and to modify a class file when the JVM loads it.
Unlike other similar bytecode editors, Javassist provides two levels of API: source level and bytecode level. If the users use the source-level API, they can edit a class file without knowledge of the specifications of the Java bytecode.
The whole API is designed with only the vocabulary of the Java language. You can even specify inserted bytecode in the form of source text; Javassist compiles it on the fly. On the other hand, the bytecode-level API allows the users to directly edit a class file as other editors."

The fact that Javassist is presented above as modifying classes at load time is not a limitation of the Javassist framework itself, but rather a consequence of the linking system of the JVM. Once a class has been loaded, changing it would result in a LinkageError (unless the JVM is launched with the JPDA [Java Platform Debugger Architecture] enabled, which would make a class dynamically reloadable, but that is another story).
Interestingly, Javassist is perfectly able to modify a class long after the application has started, as long as that specific class has not been loaded yet.
This is just to emphasize that Javassist can perfectly be used to modify classes at runtime and not only at "pre-main" time by the usage of a JVM agent, even though this suffers from great constraints.

In my opinion, the great strength of Javassist over its competitors is that it enables the user to generate bytecode on the fly from actual Java code given to it in the form of a string, which Javassist compiles on the fly with its own embedded compiler.
And that is freaking awesome.

2.1 Javassist purpose and behaviour

Javassist provides the developer with a high level API around classes, methods, fields, etc. aimed at making it as easy as possible to change the implementation of existing classes or even implement completely new classes, dynamically, at runtime, using bytecode manipulation.

(The following is explained in more detail in the official Javassist tutorial.)

The most important elements of the Javassist API are presented on the schema below:


The class javassist.CtClass is an abstract representation of a class file. A CtClass (compile-time class) object is a handle for dealing with a class file. The following program is a very simple example:

(In all examples of code from now on, I will be coloring relevant Javassist API calls in dark red)

ClassPool pool = ClassPool.getDefault();
CtClass cc = pool.get("test.Rectangle");
cc.setSuperclass(pool.get("test.Point"));
cc.writeFile();

ClassPool

This program first obtains a ClassPool object, which controls bytecode modification with Javassist. The ClassPool object is a container of CtClass objects, each representing a class file. It reads a class file on demand to construct a CtClass object and records the constructed object to respond to later accesses.

To modify the definition of a class, the users must first obtain from a ClassPool object a reference to a CtClass object representing that class. get() in ClassPool is used for this purpose.
In the case of the program shown above, the CtClass object representing a class test.Rectangle is obtained from the ClassPool object and it is assigned to a variable cc. The ClassPool object returned by getDefault() searches the default system search path.

CtClass

The CtClass object obtained from a ClassPool object can be modified.
In the example above, it is modified so that the superclass of test.Rectangle is changed into a class test.Point. This change is reflected on the original class file when writeFile() in CtClass is finally called.

writeFile() translates the CtClass object into a class file and writes it on a local disk. Javassist also provides a method for directly obtaining the modified bytecode. To obtain the bytecode, call toBytecode():

byte[] b = cc.toBytecode();

(Bear in mind that this is especially useful when implementing a Java agent)

You can directly load the CtClass as well:

Class clazz = cc.toClass();

A class can also be loaded into a specific classloader, making it available to that classloader and hence to the whole application:

pool.toClass(cc, Thread.currentThread().getContextClassLoader(), null);

Defining a new class

To define a new class from scratch, makeClass() must be called on a ClassPool.

ClassPool pool = ClassPool.getDefault();
CtClass cc = pool.makeClass("Circle");
cc.setSuperclass(pool.get("test.Point"));

This program defines a class Circle including no members except those inherited from the parent class Point. Member methods of Circle can afterwards be created with factory methods declared in CtNewMethod and appended to Circle with addMethod() in CtClass.
makeClass() cannot create a new interface; makeInterface() in ClassPool can. Member methods in an interface can be created with abstractMethod() in CtNewMethod. Note that an interface method is an abstract method.

Implementing / Modifying a class

Methods are represented by CtMethod objects. CtMethod provides several methods for modifying the definition of the method. Note that if a method is inherited from a super class, then the same CtMethod object that represents the inherited method represents the method declared in that super class. A CtMethod object corresponds to every method declaration.
Constructors are represented by their very own type in Javassist: CtConstructor. Both CtMethod and CtConstructor extend the same base class and have a lot of their API in common.

Javassist does not allow removing a method or field, but it allows changing its name. So if a method is not necessary any more, it should be renamed and made private by calling setName() and setModifiers() declared in CtMethod, for instance to hide it. But beware of linkage errors at runtime if you mess with a method used by another class.

CtMethod and CtConstructor can be used to completely implement / rewrite a constructor or a method from scratch. They also provide methods insertBefore(), insertAfter(), and addCatch(). They are used for inserting a code fragment into the body of an existing method.

When implementing or rewriting completely a method from scratch, using CtNewMethod.make() is in my opinion the most convenient approach. It enables the developer to implement a method by providing Java Source Code syntax in a simple string.
For instance:

CtClass point = ClassPool.getDefault().get("Point");
CtMethod m = CtNewMethod.make(
        "public void xmove(int dx) { x += dx; }",
        point);
point.addMethod(m);

CtNewMethod provides a lot of high-level methods for implementing getters, setters and other convenience methods directly, sometimes even without having to bother providing an implementation on your own.

Some pretty complete information in this regard is available in the second official Javassist tutorial.

2.2 A gentle example with Javassist

We'll now see a simple and yet complete example using Javassist. We'll implement the getters and setters for the fields of the class TestData introduced in section 1.1 using bytecode manipulation.
Then we'll test the getter and setter for the field myString. Since these getters and setters are injected using bytecode manipulation at runtime, we'll have to use runtime reflection to call them:

(Reminder, I am coloring relevant Javassist API calls in dark red)

import javassist.*;

import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

public class TestJavassist {

    public static class TestData {
        private int i = 0;
        private String myString = "abc";
        private long value = -1;
    }

    // test Javassist
    public static void main (String[] args) {
        try {

            ClassPool cp = ClassPool.getDefault();
            CtClass clazz = cp.get("ch.niceideas.common.utils.TestJavassist$TestData");

            for (CtField field : clazz.getDeclaredFields()) {
                String camelCaseField = field.getName().substring(0, 1).toUpperCase()
                        + field.getName().substring(1);

                // We don't need to mess with the implementation here. CtNewMethod has a
                // convenience method to implement a getter directly
                CtMethod fieldGetter = CtNewMethod.getter("get" + camelCaseField, field);
                clazz.addMethod(fieldGetter);

                // Just for the sake of an example, we'll define the setter by actually
                // providing the implementation, not using the convenience method offered
                // by CtNewMethod
                CtMethod fieldSetter = CtNewMethod.make(
                            "public void set" + camelCaseField + " \n" + 
                            "    (" + field.getType().getName() + " param) { \n" +
                            "    this." + field.getName() + " = param; \n" +
                            "}",
                        clazz);
                clazz.addMethod(fieldSetter);
            }

            // Save class and make it available
            cp.toClass(clazz, Thread.currentThread().getContextClassLoader(), null);

            // Now instantiate a new TestData
            TestData td = new TestData();

            // Get the value of the field 'myString' using the newly defined getter
            Method getter =  td.getClass().getDeclaredMethod("getMyString");
            System.out.println (getter.invoke(td));

            // Change the value of field 'myString' using newly defined setter
            Method setter =  td.getClass().getDeclaredMethod("setMyString", String.class);
            setter.invoke(td, "xyz");

            // Get the value again
            System.out.println (getter.invoke(td));

        } catch (  NotFoundException | CannotCompileException | NoSuchMethodException
                 | IllegalAccessException | InvocationTargetException e) {
            e.printStackTrace();
        }
    }
}

3. IoC

OK. Since the example code I want to implement below with Javassist is a simple IoC container, I guess I should present what IoC actually is beforehand.

Inversion of Control is a design pattern related to lifecycle management of components in an application benefiting from a services architecture.
In such an application, business components are usually implemented in the form of various services, such as business services, business managers, DAOs, etc. The main class delegates specific business concerns to business services, which in turn delegate finer aspects to managers, which further delegate various business or technical aspects to smaller managers, DAOs, adapters, etc.

These various services need to know about each other to be able to call each other. Managing the construction and instantiation of these services is called component lifecycle management.

Very often, business services are stateless components, not keeping any state in instance variables or elsewhere. Traditionally, these stateless services have been implemented as singletons. For a very long time this was a very convenient approach, since the main singleton simply needed to get the other singletons it was using, which in turn simply needed to get the other singletons they were using, and so on.
By separating the instantiation of singletons and their initialization in two stages, cycles could be handled easily and everyone was happy.
But with the rise of XP and unit testing, singleton-based applications were suffering from a very critical drawback:

Singletons were enforcing strict dependencies on other service implementations at compile time, making it nearly impossible to replace the dependencies with mock objects or stubs, as required for efficiently unit testing a specific service.
Using singletons, testing a specific service often meant having to completely build and initialize the whole application, which can well turn into a nightmare.

Java EE was of course not an answer to this problem since it required a Java EE container, which is even more of a nightmare (well, let's not get me started on Java EE, shall we?).

Inversion of Control

Inversion of control was initially mostly an answer to this problem: a way to increase the modularity of the application and make it more extensible and, more importantly, easier to test, by removing the strict dependencies between components.
The key idea is to delegate the management of the lifecycle of the components and the injection of each component's dependencies to a framework, or rather a container, borrowing the term from Java EE, called here a lightweight container.

Instead of every component obtaining references to the singletons or specific instances it needs by building or fetching them itself, the container takes care of instantiating the components, managing their lifecycle in the required scope, and injecting their dependencies at runtime.
Injecting the dependencies at runtime, with a configurable approach, using a configuration file, annotations or even a dedicated API, opens the possibility to inject a different implementation of a service depending on the context, as long as it respects the required interface.
For instance, injecting a mock object instead of the real deal for unit testing becomes straightforward.
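The following minimal sketch illustrates the idea with plain Java and no container at all: the service depends only on an interface and receives the implementation from the outside, which is exactly what a container automates. All the names used here (InjectionSketch, CustomerDao, GreetingService) are hypothetical and only serve the illustration:

```java
class InjectionSketch {

    // Hypothetical service contract; a real implementation would hit a database
    interface CustomerDao {
        String findName(int id);
    }

    // The service declares its dependency against the interface only
    // and never instantiates or looks up a DAO by itself
    static class GreetingService {
        private final CustomerDao dao;

        GreetingService(CustomerDao dao) {
            this.dao = dao;
        }

        String greet(int id) {
            return "Hello " + dao.findName(id);
        }
    }

    public static void main(String[] args) {
        // In a unit test, a stub is injected instead of the real DAO
        CustomerDao stub = id -> "customer-" + id;
        GreetingService service = new GreetingService(stub);
        System.out.println(service.greet(42)); // prints Hello customer-42
    }
}
```

Here the wiring is done by hand in main(). With an IoC container, that wiring moves to the container and is driven by configuration or annotations.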

Inversion of Control and Dependency Injection are two different things - and yet strongly related to each other - often confused in some documentation:

  • Inversion of Control - IoC : is the name of the design pattern, the approach. It is considered a design pattern, which, in my opinion, is wrong! IoC is an architecture pattern. But yeah that is really no big deal.
  • Dependency Injection - DI : is the name of a technique, a mechanism on which IoC often relies to take place. It consists in injecting the components required by a specific component at runtime, based on some configuration rules. DI is really just one aspect of IoC.

3.1 IoC history

Inversion of Control, as a term, was popularized in 1998 by the Apache Avalon team trying to engineer a "Java Apache Server Framework" for the growing set of server side Java components and tools at Apache.
To the Avalon team, it was clear that having components receive the various aspects of component assembly, configuration and lifecycle was a superior design to having those components go and get the same themselves.

Later, the authors of the "Java Open Source Programming" book wrote XWork and WebWork2 to support their forthcoming book. Their concepts were very much like those from IoC/Avalon, but dependencies were passed into the component via setters. The need for those dependencies was declared in some accompanying XML.
That was actually the first IoC framework close to the form they have today.

In 2002, Rod Johnson, leader of the Spring Framework wrote the book "Expert One-on-One : J2EE Design and Development" that also discussed the concepts of setter injection, and introduced the codeline that ultimately became the Spring Framework at SourceForge in February of 2003.
I myself discovered the concept in 2003 when reading that book, following the advice of Mr. Parick Gras, whom I take the opportunity to thank a lot here.

This whole history is presented in detail on PicoContainer / Inversion of Control History and can be represented as follows:


3.2 IoC Principle

As stated above, in a usual application, the lifecycle of components starts with a main component (or class) that either creates the other services it requires or gets their singletons.
These other components, in turn, create or get references to their own dependencies, and so on.


With IoC, a container, called lightweight container - as opposed to Java EE craps that are very heavy (and very bad) containers - takes care of instantiating and managing the lifecycle of the components as well as, more importantly, injecting the dependencies in every component.


3.3 Various frameworks

The most important IoC containers today are the following:

  • The Spring Framework is an application framework and inversion of control container for the Java platform. The core of Spring is really about IoC and component management, but nowadays there is a complete ecosystem of tools and side frameworks around the Spring core aimed at developing web applications, handling ORM concerns, etc.
  • PicoContainer is a very lightweight IoC container and only that. Unlike Spring, it is designed to remain small and simple and targets only IoC concerns, nothing else. It is not heavily maintained.
  • Apache Tapestry is an open-source component-oriented Java web application framework conceptually similar to JavaServer Faces and Apache Wicket. It provides IoC concerns in addition to the web application framework.
  • Google Guice is an open source software framework for the Java platform released by Google. It provides support for dependency injection using annotations to configure Java objects.

4. SCIF : Simple and Cute IoC Framework

The rest of this article is dedicated to presenting the implementation of a very simple IoC framework using Javassist, mostly as a way to illustrate how easy and straightforward that is with Javassist rather than for any other reason :-)
Implementing Dependency Injection is actually a textbook use case for Javassist and a nice way to present the possibilities and subtleties of bytecode manipulation.
We'll see now how to use Javassist in the light of a concrete use case: the implementation in a little more than 300 lines of code of a lightweight, simple but cute IoC Container: SCIF - Simple and Cute IoC Framework.

4.1 Principle

SCIF - the system we want to build - is an MVP (Minimum Viable Product). We want it to implement Dependency Injection in its simplest form:

  • Services are managed by the framework and stored in a Service Registry.
  • Services should declare the annotation @Service to be discovered by the framework. The framework searches the classpath for services declaring this annotation.
  • Dependencies are identified in services using the annotation @Resource. The framework analyzes services at runtime to discover their dependencies.
  • If @Resource is declared on a field, the framework injects the dependency directly, at build time.
  • If @Resource is declared on a getter, the framework uses bytecode manipulation to override the getter in a subclass and implement lazy loading of the dependency.

In case of getter (property) injection instead of field injection, SCIF is forced to generate a sub-class of the initial class and override the getter in that sub-class to implement lazy-loading.
This is a consequence of the usage of an annotation to identify the services to be enhanced: in order for the framework to be able to query the annotation on a class, that class unfortunately needs to be loaded by the classloader. Javassist is not able to change a class once that class has been loaded (well, at least not easily).
Changing a class that is already loaded usually leads to a linkage error and is forbidden without the usage of very advanced techniques, too difficult for this simple framework.
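Field injection, the simpler of the two cases, needs nothing more than runtime reflection. The sketch below is not the SCIF code: it uses a hypothetical @Inject annotation as a stand-in for @Resource, and a plain map as a stand-in for the service registry, just to show the mechanism:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

class FieldInjectionSketch {

    // Hypothetical stand-in for the @Resource annotation
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Inject { }

    static class ServiceB {
        String hello() { return "hello from B"; }
    }

    static class ServiceA {
        @Inject
        private ServiceB serviceB; // set from the outside, never by ServiceA itself

        String callB() { return serviceB.hello(); }
    }

    // Hypothetical minimal registry: services stored by their class
    static final Map<Class<?>, Object> REGISTRY = new HashMap<>();

    // Sets every field annotated with @Inject from the registry
    static void injectFields(Object service) {
        for (Field field : service.getClass().getDeclaredFields()) {
            if (!field.isAnnotationPresent(Inject.class)) {
                continue;
            }
            Object dependency = REGISTRY.get(field.getType());
            if (dependency == null) {
                throw new IllegalStateException("No service for " + field.getType());
            }
            try {
                field.setAccessible(true); // the field is private
                field.set(service, dependency);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        REGISTRY.put(ServiceB.class, new ServiceB());
        ServiceA serviceA = new ServiceA();
        injectFields(serviceA);
        System.out.println(serviceA.callB()); // prints hello from B
    }
}
```

The annotation must have runtime retention, otherwise it would not be visible through reflection. Getter injection with lazy loading cannot be done this way, since it requires overriding a method, hence the generated sub-class.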

Example

The code below presents a ServiceA having two dependencies: ServiceB and ServiceC.
The first dependency, ServiceB, is declared by ServiceA on the field itself, using the annotation @Resource.
The second dependency, ServiceC, is declared on the getter, indicating that the developer wants to benefit from lazy loading.

The example below illustrates the code or behaviour that should be injected at runtime by SCIF (highlighted in red in the original listing): the placeholder values service_injected_by_reflection and injected_by_reflection, and the nested javassist_sub class.

import ch.niceideas.common.service.Registry;
import ch.niceideas.common.service.Service;
import javax.annotation.Resource;

/** A Business Service  */
@Service
public class ServiceA {

    @Resource // here we inject the dependency on the field
    private ServiceB serviceB = service_injected_by_reflection;

    private ServiceC serviceC;

    public ServiceB getServiceB() { return serviceB; }
    public void setServiceB(ServiceB serviceB) { this.serviceB = serviceB; }

    @Resource // here we inject the dependency on the property (getter)
    public ServiceC getServiceC() { return serviceC; }
    public void setServiceC(ServiceC serviceC) { this.serviceC = serviceC; }

    // we want to use javassist to generate a sub-class on the fly, at runtime, to handle
    // the lazy loading of ServiceC at runtime
    public static class javassist_sub extends ServiceA {

        private Registry registry = injected_by_reflection;

        public ServiceC getServiceC() {
            ServiceC retObject = super.getServiceC();
            if (retObject == null) {
                retObject = (ServiceC) registry.getService(
                        ServiceC.class.getCanonicalName());
                super.setServiceC(retObject);
            }
            return retObject;
        }
    }
}
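
The listing above is deliberately pseudocode. As a rough, runnable counterpart, the sketch below writes the subclass by hand instead of generating it, so the lazy-loading behaviour can be observed with the JDK alone. `CountingRegistry` and `ServiceASub` are hypothetical names introduced for this illustration; SCIF's real registry and generated class differ.

```java
import java.util.HashMap;
import java.util.Map;

final class LazyGetterSketch {

    static class ServiceC {}

    /** Hypothetical minimal registry that counts lookups so the caching is observable. */
    static class CountingRegistry {
        private final Map<String, Object> services = new HashMap<>();
        int lookups = 0;

        void storeService(String name, Object service) {
            services.put(name, service);
        }

        Object getService(String name) {
            lookups++;
            return services.get(name);
        }
    }

    static class ServiceA {
        private ServiceC serviceC;

        public ServiceC getServiceC() { return serviceC; }
        public void setServiceC(ServiceC serviceC) { this.serviceC = serviceC; }
    }

    /** Hand-written equivalent of the subclass that SCIF would generate at runtime. */
    static class ServiceASub extends ServiceA {
        private final CountingRegistry registry;

        ServiceASub(CountingRegistry registry) {
            this.registry = registry;
        }

        @Override
        public ServiceC getServiceC() {
            ServiceC retObject = super.getServiceC();
            if (retObject == null) {
                // first call: fetch from the registry and cache through the setter
                retObject = (ServiceC) registry.getService(ServiceC.class.getCanonicalName());
                super.setServiceC(retObject);
            }
            return retObject;
        }
    }

    public static void main(String[] args) {
        CountingRegistry registry = new CountingRegistry();
        registry.storeService(ServiceC.class.getCanonicalName(), new ServiceC());

        ServiceA a = new ServiceASub(registry);
        a.getServiceC();
        a.getServiceC();
        System.out.println("registry lookups: " + registry.lookups); // prints "registry lookups: 1"
    }
}
```

Only the first call to the getter reaches the registry; later calls are served from the field of the parent class.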

Now let's look at the design of the SCIF framework that enables this behaviour.

4.2 Design

SCIF is implemented by the following fundamental classes:

  1. Registry : a Registry is a service manager. It stores each service under the scope passed to the storeService method.
    Stored services can be retrieved regardless of the scope. They are searched for in the smallest scope first and then in larger scopes.
  2. StaticRegistry : a static registry stores services in a static map.
    Only the APPLICATION scope is supported. Attempting to store a service in another scope results in an exception.
  3. RegistryInitializer : the RegistryInitializer is the most important component of SCIF.
    It is responsible for:
    • Searching the classpath for classes declaring the @Service annotation
    • Injecting dependencies in the various forms supported by the IoC Framework:
      • Field injection. This is done simply using runtime reflection.
      • Method (getter) injection. This is done by generating dynamically a subclass that takes care of the Lazy Loading of the dependency.
  4. @Service : this annotation identifies services to be searched for in the classpath.
  5. @Resource : this annotation identifies dependencies to be injected either at field level or getter (property) level.
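
To make the Registry / StaticRegistry contract more concrete, here is a minimal sketch under stated assumptions. Only the method name storeService, the lookup by service name and the APPLICATION scope come from this article; the `Scope` enum values, the exact signatures and the exception type are guesses made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class RegistrySketch {

    /** Assumed scope names; the article only names APPLICATION. */
    enum Scope { REQUEST, SESSION, APPLICATION }

    /** Assumed shape of the Registry contract. */
    interface Registry {
        void storeService(Scope scope, String name, Object service);
        Object getService(String name);
    }

    /** Stores services in a static map, so every instance sees the same services. */
    static class StaticRegistry implements Registry {
        private static final Map<String, Object> SERVICES = new ConcurrentHashMap<>();

        @Override
        public void storeService(Scope scope, String name, Object service) {
            if (scope != Scope.APPLICATION) {
                // the exception type is an assumption; the article only says "an exception"
                throw new UnsupportedOperationException("Unsupported scope: " + scope);
            }
            SERVICES.put(name, service);
        }

        @Override
        public Object getService(String name) {
            // a single scope, so there is no "smallest scope first" search here
            return SERVICES.get(name);
        }
    }

    public static void main(String[] args) {
        Registry registry = new StaticRegistry();
        registry.storeService(Scope.APPLICATION, "greeting", "hello");
        System.out.println(new StaticRegistry().getService("greeting"));
    }
}
```

A registry supporting several scopes would keep one map per scope and walk them from the smallest to the largest in `getService`.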

The design is as follows:


Rules regarding @Resource handling

The RegistryInitializer analyzes the classes annotated with @Service and handles @Resource annotations the following way:

  • If @Resource is declared on a field: in this case there is actually no need for bytecode manipulation. When the RegistryInitializer analyzes the services, it simply injects the reference into the field.
  • If @Resource is declared on a getter, and a corresponding setter is found: in this case, the system assumes it can use the setter to implement a cache for lazy loading. A subclass is created which overrides the getter to implement lazy loading. When the overridden getter is called, it first calls the original getter to see whether the target service is already available. If that is the case, that service is returned. If it is not, the system gets the target service from the registry and uses the corresponding setter to store it before returning it.
  • If @Resource is declared on an arbitrary method or on a getter without a corresponding setter: in this case the system cannot assume it can use the underlying field (through the setter) as a cache, and the overriding method simply returns the target service from the registry at every call.
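
The last two rules boil down to a small decision that can be expressed with plain reflection. The sketch below is an illustration only: the `Strategy` enum and the helpers `strategyFor` and `method` are hypothetical names, and the setter lookup is a simplified guess at what a helper such as ReflectionUtils.getSetter does.

```java
import java.lang.reflect.Method;

final class GetterRulesSketch {

    enum Strategy { LAZY_WITH_CACHE, REGISTRY_LOOKUP_ON_EVERY_CALL }

    static class ServiceC {}

    static class Sample {
        private ServiceC cached;

        public ServiceC getCached() { return cached; }
        public void setCached(ServiceC cached) { this.cached = cached; }

        public ServiceC getUncached() { return null; }     // getter without a setter
        public ServiceC lookupSomething() { return null; } // not a getter at all
    }

    /** Picks the override strategy for a method annotated with @Resource. */
    static Strategy strategyFor(Class<?> owner, Method method) {
        String name = method.getName();
        if (!name.startsWith("get") || name.length() == 3) {
            return Strategy.REGISTRY_LOOKUP_ON_EVERY_CALL;
        }
        String setterName = "set" + name.substring(3);
        try {
            // a public setter taking the getter's return type makes the field usable as a cache
            owner.getMethod(setterName, method.getReturnType());
            return Strategy.LAZY_WITH_CACHE;
        } catch (NoSuchMethodException e) {
            return Strategy.REGISTRY_LOOKUP_ON_EVERY_CALL;
        }
    }

    /** Convenience for the demo: looks up a public no-argument method by name. */
    static Method method(Class<?> owner, String name) {
        try {
            return owner.getMethod(name);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        for (String name : new String[] {"getCached", "getUncached", "lookupSomething"}) {
            System.out.println(name + " -> " + strategyFor(Sample.class, method(Sample.class, name)));
        }
    }
}
```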

The RegistryInitializer uses the following tools / libraries:

  • ReflectionUtils : a little helper class simplifying some operations related to Runtime Reflection.
  • Reflections (org.reflections.Reflections) : an indispensable library when it comes to analyzing the classpath looking for classes declaring a certain annotation.
  • Javassist : the bytecode manipulation library.

4.3 Some focus on code

We'll see below the most important pieces of code of the SCIF Framework.

(Reminder, I am coloring relevant Javassist API calls in dark red)

It all starts with the method RegistryInitializer.init() that takes care of the whole shebang:

/**
 * Initialize a registry from the root package (prefix) given as argument.
 * <br />
 * The RegistryInitializer will search for classes annotated with @Service and add them 
 * to the returned registry.
 *
 * @param rootPackage
 * @return A Registry containing all discovered services
 * @throws RegistryInitializationException in case of any error
 */
public static Registry init(String rootPackage) throws RegistryInitializationException {

    try {
        Reflections reflections = new Reflections(rootPackage);

        Set<Class<?>> annotated = reflections.getTypesAnnotatedWith(Service.class);

        // Create service wrappers
        Map<Class<?>, ServiceWrapper> wrappers = initServiceWrappers(annotated);

        // Build set of methods and fields to be analyzed
        analyzeWrappers(wrappers);

        // Now the complicated part, overwrite getters / methods using Javassist,
        // dynamically creating a subclass with bytecode manipulation
        enhancePropertyGetters(wrappers);

        StaticRegistry registry = new StaticRegistry();

        // Instantiate all the services and store them in Registry 
        // with class name as service name
        initializeRegistry(wrappers, registry);

        // Then do the easy thing : inject service on dependencies expressed on fields
        proceedFieldInjection(wrappers, registry);

        return registry;

    } catch (ReflectiveOperationException | NotFoundException | CannotCompileException e) {
        logger.error (e, e);
        throw new RegistryInitializationException (e.getMessage(), e);
    }
}

The really interesting call here is enhancePropertyGetters(wrappers);. This is where we use bytecode manipulation to generate the subclass dynamically and override the getters (or other methods) declaring the @Resource annotation.
We won't present the other methods, but here is the listing of this enhancePropertyGetters() method:

private static void enhancePropertyGetters(Map<Class<?>, ServiceWrapper> wrappers) 
        throws NotFoundException, CannotCompileException, ClassNotFoundException {
    ClassPool.doPruning = true;

    ClassPool pool = ClassPool.getDefault();
    pool.appendClassPath(new LoaderClassPath(
            Thread.currentThread().getContextClassLoader()));

    for (ServiceWrapper wrapper : wrappers.values()) {

        if (wrapper.getMethodsToEnhance().size() > 0) {

            CtClass superClazz = pool.get(wrapper.getServiceName());

            // Unfortunately I need to go with the sub-class approach
            // I cannot change the original class since it has already been loaded and
            // javassist cannot change a class that is already loaded (that would require
            // changing linking and javassist cannot do that)

            CtClass clazz = pool.makeClass(wrapper.getServiceName() + "$javassist_sub");
            clazz.stopPruning(true);

            clazz.setSuperclass(superClazz);
            clazz.setModifiers(Modifier.PUBLIC);

            ...

            // Add registry on class if not already one. The field might already have been
            // added on a parent class. If this is the case, don't add it again
            injectRegistryField(pool, clazz);

            // Proceed with method modification
            for (Method method : wrapper.getMethodsToEnhance()) {

                // Various cases :

                // 1. Method doesn't have the form of a getter
                if (!method.getName().startsWith("get")) {

                    // => Just override method so it returns the service from registry
                    overrideMethod(clazz, method);
                }

                // 2. Method is a getter
                else {

                    try {
                        Method setter = ReflectionUtils.getSetter(wrapper.getServiceClass(),
                                ReflectionUtils.getPropertyName(method),
                                method.getReturnType());

                        // 2.2 setter is found
                        // => Lazy loading : use underlying field as cache.
                        // If it is set, do nothing.
                        // If it is null, look for service and attach it
                        // Needs to override the getter for doing my business and
                        // delegate to the setter for setting the field
                        overrideGetter(clazz, method, setter);

                    } catch (NoSuchMethodException e) {

                        logger.debug (e, e);

                        // 2.1 No setter could be found
                        // => Just rewrite method so it returns the service 
                        // from the registry
                        overrideMethod(clazz, method);
                    }
                }
            }
            
            ...

            // make new subclass available to classloader
            pool.toClass(clazz, Thread.currentThread().getContextClassLoader(), null);
            clazz.stopPruning(false);

            // use the new subclass instead of the original class from now on
            Class<?> subClazz = Class.forName(clazz.getName());
            wrapper.overrideClass(subClazz);
        }
    }
}

In the code above, the interesting calls are overrideGetter(), overrideMethod() and injectRegistryField() since these are the methods where bytecode manipulation occurs.
Let's look at these methods:

private static void overrideGetter(CtClass clazz, Method getter, Method setter) 
        throws CannotCompileException {
    String targetService = getter.getReturnType().getCanonicalName();
    CtMethod newMethod = CtNewMethod.make(
                "public " + targetService + " " + getter.getName() + "() { \n" +
                "" +
                "    " + targetService + " retObject =  super." + getter.getName() + "(); "+
                "    if (retObject == null) {" +
                "         retObject =  (" + targetService + ") \n" + 
                "                 getRegistry().getService(\"" + targetService + "\"); " +
                "         super." + setter.getName() + "(retObject);" +
                "    }" +
                "    return retObject;" +
                "" +
                "}",
            clazz);
    clazz.addMethod(newMethod);
}

private static void overrideMethod(CtClass clazz, Method method) 
        throws CannotCompileException {
    String targetService = method.getReturnType().getCanonicalName();
    CtMethod newMethod = CtNewMethod.make(
                "public " + targetService + " " + method.getName() + "() { \n" +
                "" +
                "    " + targetService + " retObject =  \n" + 
                "            (" + targetService + ")\n" + 
                "            getRegistry().getService(\"" + targetService + "\"); " +
                "    return retObject;" +
                "" +
                "}",
            clazz);
    clazz.addMethod(newMethod);
}

/**
 * Inject a field to store the registry in the target clazz as well as a getter 
 * to retrieve that registry.
 *
 * @param pool the Javassist ClassPool to be used
 * @param clazz the class to be modified this way
 * @throws NotFoundException
 * @throws CannotCompileException
 */
public static void injectRegistryField(ClassPool pool, CtClass clazz) 
        throws NotFoundException, CannotCompileException {
    CtField registryField = null;
    try {
        registryField = clazz.getField("registry");
    } catch (NotFoundException e) {
        // ignored
    }
    if (registryField == null) {
        CtClass registryClass = pool.get(Registry.class.getName());
        registryField = new CtField(registryClass, "registry", clazz);
        registryField.setModifiers(Modifier.setPrivate(Modifier.STATIC));
        clazz.addField(registryField, "null");

        CtMethod registryGetter = CtNewMethod.getter("getRegistry", registryField);
        registryGetter.setModifiers(Modifier.PUBLIC);
        clazz.addMethod(registryGetter);

        CtMethod registrySetter = CtNewMethod.make(
                    "public static void setRegistry (" + Registry.class.getName() + 
                    "            holder) { " +
                    "    registry = holder; " +
                    "} ",
                clazz);
        clazz.addMethod(registrySetter);
    }
}
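
The string concatenation in overrideGetter() is hard to read at a glance. The sketch below reproduces only the string-building step, with no Javassist call, so the source text handed to CtNewMethod.make can be printed and inspected. The helper name `lazyGetterSource` and the type name `com.example.ServiceC` are made up for this illustration.

```java
final class GeneratedSourceSketch {

    /**
     * Builds the Java source of an overriding getter, following the same
     * template as overrideGetter() but laid out one statement per line.
     */
    static String lazyGetterSource(String returnType, String getterName, String setterName) {
        return "public " + returnType + " " + getterName + "() {\n"
             + "    " + returnType + " retObject = super." + getterName + "();\n"
             + "    if (retObject == null) {\n"
             + "        retObject = (" + returnType + ") getRegistry().getService(\""
             + returnType + "\");\n"
             + "        super." + setterName + "(retObject);\n"
             + "    }\n"
             + "    return retObject;\n"
             + "}";
    }

    public static void main(String[] args) {
        // com.example.ServiceC is a hypothetical fully qualified service name
        System.out.println(lazyGetterSource("com.example.ServiceC", "getServiceC", "setServiceC"));
    }
}
```

Printing the generated source before compiling it is a cheap way to debug a CannotCompileException, since the error otherwise points at a string that exists only at runtime.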

We've seen the most important pieces of code of the SCIF Framework above.
The framework itself is available for download in the next section.

4.4 DemoApp : Comet Tennis

I integrated the SCIF framework in the comet-tennis demo application. It's a small application I initially presented here: comet-tennis.
This application uses a few services and the idea here is to use the SCIF framework to manage these services and inject their dependencies.

The new package with the SCIF framework integrated is available here.

5. Conclusion

Bytecode manipulation is a lot of fun and opens a whole new world of possibilities on the JVM. It's the only way to implement advanced tooling such as IoC Containers, ORM frameworks, boilerplate code generators, etc.
Normally, bytecode manipulation is rather difficult to achieve ... except with Javassist.
Javassist makes bytecode manipulation easy and straightforward. The ability to write actual Java source code dynamically as simple strings and add it on the fly as bytecode to the classes being manipulated is striking. Javassist is in my opinion the simplest way to perform bytecode manipulation in Java.

I covered some use cases for bytecode manipulation above. There are many others, for instance tampering with licence checking systems of non-free software (Hush. I said nothing).

In my career, I have encountered many situations where I later wished I had known Javassist, since it would have been pretty helpful. Let me mention two:

  1. Some 15 years ago, I was working on a pretty big J2EE WebSphere application with a lot of EJBs. Tracking the user flow in the distributed system was a nightmare due to the complexity of the business processes and the business rules, so we ended up adding logging information each and every time a business method was entered and left, such as log.debug ("ENTER - business method") and log.debug ("LEAVE - business method").
    As a troubleshooting technique this may sound stupid, but it ended up being not only pretty convenient but really our one and only way to figure out what was going on in some situations in such an enormous piece of software.
    Adding these two lines of code (plus a few try { ... } finally { ... } statements to make sure the leaving trace was always output) made us add thousands of lines of code to the application ... which could have been replaced by a few lines of Java agent code and some Javassist magic.
  2. Some 10 years ago, I was working for a banking institution on a big Java application making extensive use of Hibernate. The problem there was that we were trying to map a nice and meaningful business model to a legacy data model. With a lot of Hibernate tricks we pretty much succeeded in achieving the mapping, using a lot of custom and pretty tricky code in Hibernate session listeners to handle the relationships that Hibernate was not able to handle natively.
    There as well, we ended up writing thousands of lines of specific glue code in Hibernate listeners, which we could have replaced by a pretty simple Javassist-based framework to complement the missing features of Hibernate.

You might want to have a look at the second article in this series, available here: Bytecode manipulation with Javassist for fun and profit part II: Generating toString and getter/setters using bytecode manipulation.

Part of this article is available as a slideshare presentation here: https://www.slideshare.net/JrmeKehrli/bytecode-manipulation-with-javassist-for-fun-and-profit.



Comments:

Thanks for the article, good stuff. About that picture in the beginning. I would say that bytecode verification goes before classloading. At least, I hope it's true. And by the way, setAccessible() may not work well with Java 9.

Posted by Artem on March 07, 2017 at 07:36 AM CET #


thank you sir,

Posted by a3rpHa on March 15, 2017 at 01:07 AM CET #


Hey dude, nicely written! Though a little bit too long, a little bit too informative :)

Posted by Nathanael Yang on January 14, 2019 at 08:19 PM CET #

