Creating an unmarshaller using bytecode

In this post we will create a simple unmarshaller at deploy time by generating bytecode, and measure its performance.

This post is part 4 of 4. The series consists of the following posts:

Writing an unmarshaller by hand
Constructing an unmarshaller using reflection
Generating an unmarshaller using annotations
Creating an unmarshaller using bytecode

For a definition of some of the terminology used here, see Phases for generating code.

In part 1 we coded a simple unmarshaller by hand. The obvious disadvantage of this is that, well, you have to code by hand. In part 2 we constructed an unmarshaller automatically using reflection. The initial implementation was over 5 times slower, but with some optimization we got that down to ‘only’ 20% slower. In part 3 we generated an unmarshaller using an annotation processor. This was as fast as coding by hand, but required access to the source code and a JDK.

In this post we will look at a way to create an unmarshaller which is as fast as coding by hand and can be created automatically, without the need for source code or a JDK.

Java source code gets compiled to bytecode, which is stored in .class files. In order to generate code which is as fast as compiled code, without a compiler, we will need to write bytecode ourselves. This may seem like a daunting task, so let’s break it down. First let’s figure out what bytecode needs to be generated, and worry about how to generate it afterwards.

The easiest way to figure out what bytecode to generate is to look at what the compiler does. So given this abstract class¹:

public abstract class EmployeeUnmarshaller {

    public abstract Employee read(Parser parser) throws Throwable;
}

Let’s implement the code we want manually:

public class TestUnmarshaller extends EmployeeUnmarshaller {

    @Override
    public Employee read(Parser parser) throws Throwable {
        var employee = new Employee();

        employee.setId(parser.readInteger());
        employee.setActive(parser.readBoolean());
        employee.setFirstName(parser.readString());
        employee.setLastName(parser.readString());
        employee.setStartYear(parser.readInteger());
        employee.setJobTitle(parser.readString());

        return employee;
    }
}

Compile it and look at the output. The JDK comes with a very handy tool called javap. Run with the option ‘-v’ and a class file as argument, it prints the content of the class file in a human-readable format².

Running javap gives the following output³:

Classfile /generation-benchmark/code-generation/target/classes/dev/sanjuroe/generation/deploytime/TestUnmarshaller.class
  Last modified 15 mei 2021; size 788 bytes
  MD5 checksum d168aa78ad437efd15a2c4e013ee35ac
public class dev.sanjuroe.generation.deploytime.TestUnmarshaller extends dev.sanjuroe.generation.deploytime.EmployeeUnmarshaller
  minor version: 0
  major version: 55
  flags: (0x0021) ACC_PUBLIC, ACC_SUPER
  this_class: #13                         // dev/sanjuroe/generation/deploytime/TestUnmarshaller
  super_class: #14                        // dev/sanjuroe/generation/deploytime/EmployeeUnmarshaller
  interfaces: 0, fields: 0, methods: 2, attributes: 0
Constant pool:
  ...
{
  public dev.sanjuroe.generation.deploytime.TestUnmarshaller();
    descriptor: ()V
    flags: (0x0001) ACC_PUBLIC
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         1: invokespecial #1                  // Method dev/sanjuroe/generation/deploytime/EmployeeUnmarshaller."<init>":()V
         4: return

  public dev.sanjuroe.generation.Employee read(dev.sanjuroe.generation.Parser) throws java.lang.Throwable;
    descriptor: (Ldev/sanjuroe/generation/Parser;)Ldev/sanjuroe/generation/Employee;
    flags: (0x0001) ACC_PUBLIC
    Code:
      stack=2, locals=3, args_size=2
         0: new           #2                  // class dev/sanjuroe/generation/Employee
         3: dup
         4: invokespecial #3                  // Method dev/sanjuroe/generation/Employee."<init>":()V
         7: astore_2
         8: aload_2
         9: aload_1
        10: invokeinterface #4,  1            // InterfaceMethod dev/sanjuroe/generation/Parser.readInteger:()I
        15: invokevirtual #5                  // Method dev/sanjuroe/generation/Employee.setId:(I)V
        18: aload_2
        19: aload_1
        20: invokeinterface #6,  1            // InterfaceMethod dev/sanjuroe/generation/Parser.readBoolean:()Z
        25: invokevirtual #7                  // Method dev/sanjuroe/generation/Employee.setActive:(Z)V
        28: aload_2
        29: aload_1
        30: invokeinterface #8,  1            // InterfaceMethod dev/sanjuroe/generation/Parser.readString:()Ljava/lang/String;
        35: invokevirtual #9                  // Method dev/sanjuroe/generation/Employee.setFirstName:(Ljava/lang/String;)V
        38: aload_2
        39: aload_1
        40: invokeinterface #8,  1            // InterfaceMethod dev/sanjuroe/generation/Parser.readString:()Ljava/lang/String;
        45: invokevirtual #10                 // Method dev/sanjuroe/generation/Employee.setLastName:(Ljava/lang/String;)V
        48: aload_2
        49: aload_1
        50: invokeinterface #4,  1            // InterfaceMethod dev/sanjuroe/generation/Parser.readInteger:()I
        55: invokevirtual #11                 // Method dev/sanjuroe/generation/Employee.setStartYear:(I)V
        58: aload_2
        59: aload_1
        60: invokeinterface #8,  1            // InterfaceMethod dev/sanjuroe/generation/Parser.readString:()Ljava/lang/String;
        65: invokevirtual #12                 // Method dev/sanjuroe/generation/Employee.setJobTitle:(Ljava/lang/String;)V
        68: aload_2
        69: areturn
    Exceptions:
      throws java.lang.Throwable
}

Now that’s a lot to take in! But let’s focus on the bottom part, where the read method is shown. The alternating calls to the parser and then to the setters on Employee are clearly visible. But if you haven’t seen bytecode before, the rest probably looks very alien. Don’t worry, with just two pieces of additional information you will be able to start making sense of all this.

First of all, Java bytecode is intended to run on a stack machine. This means that all data, or operands, are pushed onto the stack; instructions pop their input from the stack and push the result back onto the same stack. So, using Reverse Polish Notation, the following expression

1 2 3 MUL ADD

pushes 1, 2, and 3 (in that order) onto the stack. Then the MUL instruction takes the top 2 elements (being 2 and 3), multiplies them, and puts the result back on the stack, leaving

1 6 ADD

The ADD instruction then takes the top 2 elements (now 1 and 6), adds them up, and pushes the result back on the stack, leaving

7

So we just calculated 1 + (2 × 3) = 7. In Java bytecode this would look something like⁴:

iconst_1
iconst_2
iconst_3
imul
iadd

the ‘i’ prefix indicating the operands are integers. Adding an integer return

ireturn

will return the top element of the stack (being 7) as the result of the method.
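To make the stack behaviour concrete, here is a minimal sketch of a toy stack machine in plain Java. It is not how the JVM is implemented; the class name TinyStackMachine and the MUL/ADD tokens are made up for illustration and simply mirror the Reverse Polish Notation example above.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy stack machine (hypothetical, for illustration): evaluates programs such as "1 2 3 MUL ADD". */
final class TinyStackMachine {

    static int run(String program) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String token : program.trim().split("\\s+")) {
            switch (token) {
                case "MUL": {
                    // Like imul: pop two operands, push their product
                    int right = stack.pop();
                    int left = stack.pop();
                    stack.push(left * right);
                    break;
                }
                case "ADD": {
                    // Like iadd: pop two operands, push their sum
                    int right = stack.pop();
                    int left = stack.pop();
                    stack.push(left + right);
                    break;
                }
                default:
                    // Anything that is not an instruction is an integer operand (like iconst)
                    stack.push(Integer.parseInt(token));
            }
        }
        // Like ireturn: the result is whatever is on top of the stack
        return stack.pop();
    }

    public static void main(String[] args) {
        System.out.println(run("1 2 3 MUL ADD")); // prints 7
    }
}
```

Note that the operands are pushed in source order, so the instruction that runs first (MUL) works on the operands pushed last.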

The second thing to know is that this method declaration

public abstract Employee read(Parser parser)

is actually short hand for

public abstract Employee read(EmployeeUnmarshaller this, Parser parser)

so the first argument to every non-static method actually is this⁵. In order to invoke a method in bytecode we need to make sure the stack contains all its arguments, including this. So the following bytecode:

aload_2
iconst_1
invokevirtual #5          // Method dev/sanjuroe/generation/Employee.setId:(I)V

will load a reference to an Employee object from local variable 2 and push it onto the stack, then push the integer 1 onto the stack, and finally call Employee::setId. So this will set the id of the Employee object to 1.
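The hidden this argument can also be observed from plain Java. The sketch below is only an illustration: ReceiverDemo and its nested Employee are hypothetical stand-ins, not the classes from this series. It declares the receiver parameter explicitly and then looks up setId as a method handle, whose type lists the receiver as an ordinary first parameter.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class ReceiverDemo {

    // Hypothetical stand-in for the Employee class from this series
    static class Employee {
        private int id;

        // The explicit receiver parameter is legal Java and changes nothing at runtime
        void setId(Employee this, int id) {
            this.id = id;
        }

        int getId() {
            return id;
        }
    }

    /** Looks up setId by its source-level signature: one int argument, returns void. */
    static MethodHandle setIdHandle() {
        try {
            return MethodHandles.lookup().findVirtual(
                    Employee.class, "setId", MethodType.methodType(void.class, int.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Calls setId through the handle, passing the receiver as an ordinary first argument. */
    static int setIdThroughHandle(int id) {
        Employee employee = new Employee();
        try {
            // Mirrors aload_2, iconst_1, invokevirtual: receiver first, then the argument
            setIdHandle().invoke(employee, id);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
        return employee.getId();
    }

    public static void main(String[] args) {
        // The handle type has two parameters: the Employee receiver and the int
        System.out.println(setIdHandle().type().parameterCount()); // prints 2
        System.out.println(setIdThroughHandle(1));                 // prints 1
    }
}
```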

Now we know all this, let’s look at the repeating part of the bytecode:

aload_2
aload_1
invokeinterface #4,  1    // InterfaceMethod dev/sanjuroe/generation/Parser.readInteger:()I
invokevirtual #5          // Method dev/sanjuroe/generation/Employee.setId:(I)V

This can be interpreted as

Employee@2 Parser@1 Parser::readInteger Employee::setId

So we push a reference to Employee from local variable 2 onto the stack, then push a reference to Parser from local variable 1 onto the stack, and then execute Parser::readInteger. Parser::readInteger takes 1 argument (remember this), so it pops the Parser reference from the top of the stack and calls readInteger on that object. Assuming the method call returns 42, it then pushes 42 onto the stack, leaving

Employee@2 42 Employee::setId

which will call setId on the Employee object with 42 as argument.
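The same four instructions can be replayed by hand with an explicit stack. This is only a sketch to show the order of pushes and pops: OperandStackDemo and its nested Parser and Employee types are hypothetical stand-ins, and unlike this simulation the real JVM keeps the int on the operand stack unboxed.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class OperandStackDemo {

    // Hypothetical stand-ins for the Parser and Employee types from this series
    interface Parser {
        int readInteger();
    }

    static class Employee {
        int id;

        void setId(int id) {
            this.id = id;
        }
    }

    static Employee run(Parser parser) {
        Employee employee = new Employee();
        Deque<Object> stack = new ArrayDeque<>();

        stack.push(employee); // aload_2: push the Employee reference
        stack.push(parser);   // aload_1: push the Parser reference

        // invokeinterface Parser.readInteger: pops the receiver, pushes the int result
        Parser receiver = (Parser) stack.pop();
        stack.push(receiver.readInteger());

        // invokevirtual Employee.setId: pops the argument, then the receiver, pushes nothing
        int argument = (Integer) stack.pop();
        Employee target = (Employee) stack.pop();
        target.setId(argument);

        // After one full round the stack is empty again, ready for the next field
        if (!stack.isEmpty()) {
            throw new IllegalStateException("operand stack should be empty");
        }
        return employee;
    }

    public static void main(String[] args) {
        System.out.println(run(() -> 42).id); // prints 42
    }
}
```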

But before all this we still need to actually create a new Employee object, which is done by the instruction new:

new           #2          // class dev/sanjuroe/generation/Employee

and call the constructor

invokespecial #3          // Method dev/sanjuroe/generation/Employee."<init>":()V

Unfortunately calling the constructor consumes the reference from the stack, but doesn’t put it back, leaving us with an initialized object, but no way to reference it. The solution is to duplicate the reference before calling the constructor:

new           #2          // class dev/sanjuroe/generation/Employee
dup
invokespecial #3          // Method dev/sanjuroe/generation/Employee."<init>":()V

Now that we have an idea of what bytecode needs to be generated, we can move on to how to generate it.

Manually generating a class file is hard. Very hard. Nowadays even the JDK uses an external library to do this, called ASM. It can simply be included as a dependency using Maven:

<dependency>
	<groupId>org.ow2.asm</groupId>
	<artifactId>asm</artifactId>
	<version>9.1</version>
</dependency>

ASM actually has 2 APIs, the basic Core API and the more complex Tree API. For our purposes the Core API is more than sufficient. Besides writing class files, ASM can also read and transform them. In order to facilitate this the developers behind ASM have chosen to implement the ClassWriter using the Visitor pattern to allow it to be passed to a ClassReader. Since we are only interested in writing, we will be calling the visit methods ourselves.

Recalling the following piece of bytecode:

aload_2
aload_1
invokeinterface #4,  1    // InterfaceMethod dev/sanjuroe/generation/Parser.readInteger:()I
invokevirtual #5          // Method dev/sanjuroe/generation/Employee.setId:(I)V

this can be generated using:

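// mv is the ASM MethodVisitor for the read method
mv.visitVarInsn(Opcodes.ALOAD, 2); // aload_2: push the Employee reference
mv.visitVarInsn(Opcodes.ALOAD, 1); // aload_1: push the Parser reference
mv.visitMethodInsn(
	Opcodes.INVOKEINTERFACE,
	"dev/sanjuroe/generation/Parser",
	"readInteger",
	"()I",
	true // the owner is an interface
);
mv.visitMethodInsn(
	Opcodes.INVOKEVIRTUAL,
	"dev/sanjuroe/generation/Employee",
	"setId",
	"(I)V",
	false // the owner is a class
);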

The rest of the bytecode can be mapped quite easily to ASM method calls using their Javadoc. See the full code on GitHub for all details.

What is left is figuring out which methods to call. For that we can borrow the code from the part on reflection:

Class<?> clazz = Employee.class;
var fields = clazz.getDeclaredFields();
for (Field field : fields) {
	var fieldName = field.getName();
	var fieldType = field.getType();
	appendField(mv, fieldName, fieldType);
}
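The field name and type from the loop above have to be turned into the strings that bytecode uses: a setter name, method descriptors, and a parser method name. The author's actual appendField is in the GitHub repository; what follows is only a sketch of how those strings could be derived with the standard library. The class DescriptorSketch and all of its helper names are hypothetical, and it assumes JavaBean-style setters and the three Parser methods seen in the javap output.

```java
import java.lang.invoke.MethodType;

final class DescriptorSketch {

    /** "firstName" -> "setFirstName" (assumes JavaBean-style setters). */
    static String setterName(String fieldName) {
        return "set" + Character.toUpperCase(fieldName.charAt(0)) + fieldName.substring(1);
    }

    /** int.class -> "(I)V", String.class -> "(Ljava/lang/String;)V". */
    static String setterDescriptor(Class<?> fieldType) {
        return MethodType.methodType(void.class, fieldType).toMethodDescriptorString();
    }

    /** int.class -> "()I": a no-argument method returning the field type. */
    static String readerDescriptor(Class<?> fieldType) {
        return MethodType.methodType(fieldType).toMethodDescriptorString();
    }

    /** Maps a field type to the Parser method that reads it; only the types used in this post. */
    static String readerName(Class<?> fieldType) {
        if (fieldType == int.class) {
            return "readInteger";
        }
        if (fieldType == boolean.class) {
            return "readBoolean";
        }
        if (fieldType == String.class) {
            return "readString";
        }
        throw new IllegalArgumentException("Unsupported field type: " + fieldType);
    }

    /** "java.lang.String" -> "java/lang/String", the form bytecode uses for class names. */
    static String internalName(Class<?> type) {
        return type.getName().replace('.', '/');
    }

    public static void main(String[] args) {
        System.out.println(setterName("id") + " " + setterDescriptor(int.class));
        System.out.println(readerName(String.class) + " " + readerDescriptor(String.class));
    }
}
```

These are the kind of strings that end up as the owner, name, and descriptor arguments of visitMethodInsn.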

Then all we need is some class loading magic:

public static Class<?> loadClass(String className, byte[] ba, ClassLoader parent) throws ClassNotFoundException {
	// Anonymous class loader that knows exactly one class: the one we just generated
	var loader = new ClassLoader(parent) {
		@Override
		protected Class<?> findClass(String name) throws ClassNotFoundException {
			if (className.equals(name)) {
				// Turn the generated bytes into a Class object
				return defineClass(name, ba, 0, ba.length);
			}
			throw new ClassNotFoundException(name);
		}
	};

	// Delegates to the parent first, then falls back to findClass above
	return loader.loadClass(className);
}

And with that, we have managed to parse a byte array into an Employee object using custom-generated bytecode.

Running the JMH benchmark for all the different approaches, on my machine, gives the following results:

Benchmark                                          Mode  Cnt     Score    Error   Units
d.s.g.codetime.CodeTimeBenchmark.benchmark        thrpt   25  6158,231 ± 40,101  ops/ms
d.s.g.compiletime.CompileTimeBenchmark.benchmark  thrpt   25  6190,825 ± 34,077  ops/ms
d.s.g.deploytime.DeployTimeBenchmark.benchmark    thrpt   25  6115,819 ± 12,131  ops/ms
d.s.g.runtime.RunTimeBenchmark.basic              thrpt   25  1182,603 ± 13,976  ops/ms
d.s.g.runtime.RunTimeBenchmark.deploy             thrpt   25  4336,421 ± 35,638  ops/ms
d.s.g.runtime.RunTimeBenchmark.handle             thrpt   25  4946,743 ± 31,497  ops/ms

This shows that bytecode generation (DeployTimeBenchmark) is as fast as the hand-coded and annotation-processor-generated code. We set out to automatically create code, without using a compiler, that is as fast as manually written code, and we managed to do exactly that.
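To put numbers on "as fast as", the scores from the table can be compared against the hand-coded baseline. The snippet below is just that arithmetic; the class name ThroughputRatios is made up, and the scores are copied from the benchmark output above (error margins are ignored).

```java
import java.util.Locale;

final class ThroughputRatios {

    /** Throughput of an approach as a fraction of the baseline throughput. */
    static double relative(double score, double baseline) {
        return score / baseline;
    }

    public static void main(String[] args) {
        double handCoded = 6158.231; // CodeTimeBenchmark, ops/ms

        // DeployTimeBenchmark (bytecode generation)
        System.out.println(String.format(Locale.ROOT, "bytecode:          %.3f", relative(6115.819, handCoded)));
        // RunTimeBenchmark.basic (plain reflection)
        System.out.println(String.format(Locale.ROOT, "reflection, basic: %.3f", relative(1182.603, handCoded)));
        // RunTimeBenchmark.handle (optimized reflection)
        System.out.println(String.format(Locale.ROOT, "reflection, best:  %.3f", relative(4946.743, handCoded)));
    }
}
```

Bytecode generation reaches a bit over 99% of the hand-coded throughput, while plain reflection stays below 20% and the optimized reflection variant lands at roughly 80%, in line with the figures quoted at the start of this post.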

In this series we looked at 4 different ways of creating an unmarshaller, in increasing order of complexity. Which one should you use? Well, to paraphrase Einstein, use the simplest method which suits your needs, but not simpler. For throw-away prototypes I recommend just hand-coding. If throughput is not a concern, go ahead and use reflection. For in-house projects, using annotation processors is usually fine. When writing a library which may also be used by external parties, however, bytecode generation is usually the way to go.

¹ I am using an abstract class instead of reusing the generic Unmarshaller interface to sidestep bridge methods, since these add complexity but don’t contribute to the concepts I am trying to convey here.
² Most IDEs allow for defining external tools which can then be easily run. In IntelliJ IDEA, for example, go to Settings -> Tools -> External Tools, then add a new tool, name it javap, enter as program ‘javap’ (without the quotes) and set arguments to ‘-v $FilePath$’ (again without the quotes). Now save, find a class file, right click and choose External Tools -> javap.
³ I left out the constant pool for clarity. Depending on which version of the Java compiler you use and whether debugging information is turned on, your output might look slightly different.
⁴ The Java compiler will probably never generate this bytecode, because it will look at it, conclude that it always evaluates to 7, and just replace it with a hardcoded ‘7’.
⁵ Go ahead and try adding a this argument; you can add it to any non-static method! It is known as the receiver parameter and was added in Java 8 to allow placing annotations on it.