In this blog we will create a simple unmarshaller at deploy-time using bytecode and measure its performance.
This post is part 4 of 4. The series consists of the following posts:
- Writing an unmarshaller by hand
- Constructing an unmarshaller using reflection
- Generating an unmarshaller using annotations
- Creating an unmarshaller using bytecode
For a definition of some of the terminology used here, see Phases for generating code.
In part 1 we coded a simple unmarshaller by hand. The obvious disadvantage of this is that, well, you have to code by hand. In part 2 we constructed an unmarshaller automatically using reflection. The initial implementation was over 5 times slower, but with some optimization we got that down to ‘only’ 20% slower. In part 3 we generated an unmarshaller using an annotation processor. This was as fast as coding by hand, but required access to the source code and a JDK.
In this post we will look at a way to create an unmarshaller that is as fast as coding by hand and can be created automatically, without the need for source code or a JDK.
Java source code gets compiled to bytecode, which is stored in .class files. In order to generate code that is as fast as compiled code without a compiler, we will need to write bytecode ourselves. This may seem like a daunting task, so let’s break it down. First let’s figure out what bytecode needs to be generated, and worry about how to generate it afterwards.
The easiest way to figure out what bytecode to generate is to look at what the compiler does. So given this abstract class [1]:
public abstract class EmployeeUnmarshaller {
    public abstract Employee read(Parser parser) throws Throwable;
}
Let’s implement the code we want manually:
public class TestUnmarshaller extends EmployeeUnmarshaller {
    @Override
    public Employee read(Parser parser) throws Throwable {
        var employee = new Employee();
        employee.setId(parser.readInteger());
        employee.setActive(parser.readBoolean());
        employee.setFirstName(parser.readString());
        employee.setLastName(parser.readString());
        employee.setStartYear(parser.readInteger());
        employee.setJobTitle(parser.readString());
        return employee;
    }
}
Compile it and look at the output. The JDK comes with a very handy tool called javap. Run it with the option ‘-v’ and a class file as argument, and it prints the content of the class file in a human-readable format [2].
Running javap gives the following output [3]:
Classfile /generation-benchmark/code-generation/target/classes/dev/sanjuroe/generation/deploytime/TestUnmarshaller.class
Last modified 15 mei 2021; size 788 bytes
MD5 checksum d168aa78ad437efd15a2c4e013ee35ac
public class dev.sanjuroe.generation.deploytime.TestUnmarshaller extends dev.sanjuroe.generation.deploytime.EmployeeUnmarshaller
minor version: 0
major version: 55
flags: (0x0021) ACC_PUBLIC, ACC_SUPER
this_class: #13 // dev/sanjuroe/generation/deploytime/TestUnmarshaller
super_class: #14 // dev/sanjuroe/generation/deploytime/EmployeeUnmarshaller
interfaces: 0, fields: 0, methods: 2, attributes: 0
Constant pool:
...
{
public dev.sanjuroe.generation.deploytime.TestUnmarshaller();
descriptor: ()V
flags: (0x0001) ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #1 // Method dev/sanjuroe/generation/deploytime/EmployeeUnmarshaller."<init>":()V
4: return
public dev.sanjuroe.generation.Employee read(dev.sanjuroe.generation.Parser) throws java.lang.Throwable;
descriptor: (Ldev/sanjuroe/generation/Parser;)Ldev/sanjuroe/generation/Employee;
flags: (0x0001) ACC_PUBLIC
Code:
stack=2, locals=3, args_size=2
0: new #2 // class dev/sanjuroe/generation/Employee
3: dup
4: invokespecial #3 // Method dev/sanjuroe/generation/Employee."<init>":()V
7: astore_2
8: aload_2
9: aload_1
10: invokeinterface #4, 1 // InterfaceMethod dev/sanjuroe/generation/Parser.readInteger:()I
15: invokevirtual #5 // Method dev/sanjuroe/generation/Employee.setId:(I)V
18: aload_2
19: aload_1
20: invokeinterface #6, 1 // InterfaceMethod dev/sanjuroe/generation/Parser.readBoolean:()Z
25: invokevirtual #7 // Method dev/sanjuroe/generation/Employee.setActive:(Z)V
28: aload_2
29: aload_1
30: invokeinterface #8, 1 // InterfaceMethod dev/sanjuroe/generation/Parser.readString:()Ljava/lang/String;
35: invokevirtual #9 // Method dev/sanjuroe/generation/Employee.setFirstName:(Ljava/lang/String;)V
38: aload_2
39: aload_1
40: invokeinterface #8, 1 // InterfaceMethod dev/sanjuroe/generation/Parser.readString:()Ljava/lang/String;
45: invokevirtual #10 // Method dev/sanjuroe/generation/Employee.setLastName:(Ljava/lang/String;)V
48: aload_2
49: aload_1
50: invokeinterface #4, 1 // InterfaceMethod dev/sanjuroe/generation/Parser.readInteger:()I
55: invokevirtual #11 // Method dev/sanjuroe/generation/Employee.setStartYear:(I)V
58: aload_2
59: aload_1
60: invokeinterface #8, 1 // InterfaceMethod dev/sanjuroe/generation/Parser.readString:()Ljava/lang/String;
65: invokevirtual #12 // Method dev/sanjuroe/generation/Employee.setJobTitle:(Ljava/lang/String;)V
68: aload_2
69: areturn
Exceptions:
throws java.lang.Throwable
}
Now that’s a lot to take in! But let’s focus on the bottom part, where the read method is shown. The alternating calls to the parser and then to the setters on Employee are clearly visible. But if you haven’t seen bytecode before, the rest probably looks very alien. Don’t worry, with just two pieces of additional information you will be able to start making sense of all this.
First of all, Java bytecode is intended to run on a stack machine. This means that all data or operands are pushed onto the stack; instructions pop their input from the stack and push the result back onto the same stack. So, using Reverse Polish Notation, the following expression
1 2 3 MUL ADD
pushes 1, 2, and 3 (in that order) onto the stack. Then the MUL instruction takes the top 2 elements (being 2 and 3), multiplies them, and puts the result back on the stack, leaving
1 6 ADD
the ADD operation then takes the top 2 elements again (being 1 and 6), adds them up, and pushes the result back on the stack, leaving
7
So we just calculated 1 + (2 × 3) = 7. In Java bytecode this would look something like [4]:
iconst_1
iconst_2
iconst_3
imul
iadd
the ‘i’ prefix indicating the operands are integers. Adding an integer return
ireturn
will return the top element of the stack (being 7) as the result of the method.
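To make the push and pop behaviour concrete, here is a minimal sketch of a toy stack machine in plain Java. It is only an illustration of the idea, not how the JVM is implemented; the class name RpnStackMachine and the token names are made up for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy stack machine for integer RPN expressions (illustration only).
class RpnStackMachine {

    // Evaluates a space-separated expression such as "1 2 3 MUL ADD".
    static int evaluate(String expression) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String token : expression.trim().split("\\s+")) {
            switch (token) {
                case "ADD": {
                    // Pop both operands, push the sum (compare iadd)
                    int right = stack.pop();
                    int left = stack.pop();
                    stack.push(left + right);
                    break;
                }
                case "MUL": {
                    // Pop both operands, push the product (compare imul)
                    int right = stack.pop();
                    int left = stack.pop();
                    stack.push(left * right);
                    break;
                }
                default:
                    // Anything else is treated as an integer constant (compare iconst_N)
                    stack.push(Integer.parseInt(token));
            }
        }
        // Like ireturn: the result is whatever is on top of the stack
        return stack.pop();
    }

    public static void main(String[] args) {
        System.out.println(evaluate("1 2 3 MUL ADD")); // prints 7
    }
}
```

Each case mirrors one of the bytecode instructions above: constants are pushed, and the arithmetic instructions replace their two operands with a single result.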
The second thing to know is that this method declaration
public abstract Employee read(Parser parser)
is actually shorthand for
public abstract Employee read(EmployeeUnmarshaller this, Parser parser)
so the first argument to every non-static method actually is this [5]. In order to invoke a method in bytecode we need to make sure the stack contains all its arguments, including this. So the following bytecode:
aload_2
iconst_1
invokevirtual #5 // Method dev/sanjuroe/generation/Employee.setId:(I)V
will load a reference to an Employee object from local variable 2 and push it onto the stack, then push the integer 1 onto the stack, and finally call Employee::setId. So this will set the id of the Employee object to 1.
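The explicit this parameter is legal Java source, so the idea is easy to try out. The sketch below uses a made-up class called Badge rather than the Employee class from this series; it compiles on Java 8 and later.

```java
// Hypothetical class, used only to show the explicit receiver parameter.
class Badge {
    private int id;

    // 'Badge this' is the receiver parameter. It does not add an argument
    // for callers, who still write badge.setId(1).
    void setId(Badge this, int id) {
        this.id = id;
    }

    int getId(Badge this) {
        return this.id;
    }

    public static void main(String[] args) {
        Badge badge = new Badge();
        badge.setId(1);
        System.out.println(badge.getId()); // prints 1
    }
}
```

Writing the receiver out changes nothing about the generated bytecode, it only makes visible what the compiler already does.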
Now we know all this, let’s look at the repeating part of the bytecode:
aload_2
aload_1
invokeinterface #4, 1 // InterfaceMethod dev/sanjuroe/generation/Parser.readInteger:()I
invokevirtual #5 // Method dev/sanjuroe/generation/Employee.setId:(I)V
this can be interpreted as
Employee@2 Parser@1 Parser::readInteger Employee::setId
So we push a reference to Employee from local variable 2 onto the stack, then push a reference to Parser from local variable 1 onto the stack, and then execute Parser::readInteger. Parser::readInteger takes 1 argument (remember this), so it pops the reference from the top of the stack and calls readInteger on that object. Assuming the method call returns 42, it then pushes 42 back onto the stack, leaving
Employee@2 42 Employee::setId
which will call setId on the Employee object with 42 as argument.
But before all this we still need to actually create a new Employee object, which is done by the instruction new:
new #2 // class dev/sanjuroe/generation/Employee
and call the constructor
invokespecial #3 // Method dev/sanjuroe/generation/Employee."<init>":()V
Unfortunately calling the constructor consumes the reference from the stack, but doesn’t put it back, leaving us with an initialized object, but no way to reference it. The solution is to duplicate the reference before calling the constructor:
new #2 // class dev/sanjuroe/generation/Employee
dup
invokespecial #3 // Method dev/sanjuroe/generation/Employee."<init>":()V
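Why the dup is needed can be modelled with a plain Java collection standing in for the operand stack. This is a rough sketch under stated assumptions: DupDemo is a made-up name, a StringBuilder plays the role of the new object, and the real JVM verifier would reject the variant without dup instead of storing null.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of the operand stack and local variables (not real JVM behaviour).
class DupDemo {

    // Models: new, optional dup, invokespecial <init>, astore_2.
    // Returns whatever ends up in local variable 2.
    static Object run(boolean withDup) {
        Deque<Object> stack = new ArrayDeque<>();
        Object[] locals = new Object[3];

        // new: push a reference to a not yet initialised object
        stack.push(new StringBuilder("uninitialised"));
        if (withDup) {
            // dup: push a second copy of the same reference
            stack.push(stack.peek());
        }
        // invokespecial <init>: consumes one reference and initialises the object
        StringBuilder target = (StringBuilder) stack.pop();
        target.setLength(0);
        target.append("initialised");
        // astore_2: store the remaining reference, if any, in local variable 2
        locals[2] = stack.isEmpty() ? null : stack.pop();
        return locals[2];
    }

    public static void main(String[] args) {
        System.out.println(run(true));  // prints initialised
        System.out.println(run(false)); // prints null
    }
}
```

With dup, the copy left on the stack still points at the object the constructor just initialised; without it, nothing is left to store.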
Now that we have an idea of what bytecode needs to be generated, we can move on to how to generate it.
Manually generating a class file is hard. Very hard. Nowadays even the JDK uses an external library to do this, called ASM. It can simply be included as a dependency using Maven:
<dependency>
<groupId>org.ow2.asm</groupId>
<artifactId>asm</artifactId>
<version>9.1</version>
</dependency>
ASM actually has 2 APIs, the basic Core API and the more complex Tree API. For our purposes the Core API is more than sufficient. Besides writing class files, ASM can also read and transform them. In order to facilitate this the developers behind ASM have chosen to implement the ClassWriter using the Visitor pattern to allow it to be passed to a ClassReader. Since we are only interested in writing, we will be calling the visit methods ourselves.
Recalling the following piece of bytecode:
aload_2
aload_1
invokeinterface #4, 1 // InterfaceMethod dev/sanjuroe/generation/Parser.readInteger:()I
invokevirtual #5 // Method dev/sanjuroe/generation/Employee.setId:(I)V
this can be generated using:
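// Push the Employee (local variable 2) and the Parser (local variable 1)
mv.visitVarInsn(Opcodes.ALOAD, 2);
mv.visitVarInsn(Opcodes.ALOAD, 1);
// Call Parser.readInteger(), which leaves an int on the stack
mv.visitMethodInsn(
        Opcodes.INVOKEINTERFACE,
        "dev/sanjuroe/generation/Parser",
        "readInteger",
        "()I",
        true
);
// Call Employee.setId(int) with that int
mv.visitMethodInsn(
        Opcodes.INVOKEVIRTUAL,
        "dev/sanjuroe/generation/Employee",
        "setId",
        "(I)V",
        false
);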
The rest of the bytecode can be mapped quite easily to ASM method calls using their JavaDoc. See the full code on GitHub for all details.
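The strings passed to visitMethodInsn, such as "()I" and "(I)V", are method descriptors, and the class names use slashes instead of dots. Typing them by hand is error-prone, so here is a small sketch that derives them with the JDK alone. DescriptorDemo and its two helper methods are names made up for this example; ASM's own Type class offers comparable helpers.

```java
import java.lang.invoke.MethodType;

// Derives class-file style names and descriptors using only the JDK.
class DescriptorDemo {

    // Internal name as used in class files: dots replaced by slashes.
    static String internalName(Class<?> type) {
        return type.getName().replace('.', '/');
    }

    // Method descriptor for the given return type and parameter types.
    static String descriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(descriptor(int.class));             // ()I
        System.out.println(descriptor(void.class, int.class)); // (I)V
        System.out.println(descriptor(String.class));          // ()Ljava/lang/String;
        System.out.println(internalName(String.class));        // java/lang/String
    }
}
```

The first two lines of output match the descriptors of Parser.readInteger and Employee.setId shown in the javap listing.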
What is left is figuring out which methods to call. For that we can borrow the code from the part on reflection:
Class<?> clazz = Employee.class;
var fields = clazz.getDeclaredFields();
for (Field field : fields) {
    var fieldName = field.getName();
    var fieldType = field.getType();
    appendField(mv, fieldName, fieldType);
}
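What a helper like appendField has to decide for each field is which setter and which parser method to call. The sketch below shows one way that decision could look; it is not the code from the repository. The Person class, the setterName and parserMethod helpers, and the type mapping are assumptions based on the method names visible in the javap output.

```java
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

// Works out, per field, which setter and which parser method would be called.
class FieldMappingDemo {

    // Hypothetical stand-in for the Employee bean from the series.
    static class Person {
        private int id;
        private boolean active;
        private String firstName;
    }

    // "firstName" becomes "setFirstName"
    static String setterName(String fieldName) {
        return "set" + Character.toUpperCase(fieldName.charAt(0)) + fieldName.substring(1);
    }

    // Assumed mapping from field type to the Parser method names seen in the javap output.
    static String parserMethod(Class<?> fieldType) {
        if (fieldType == int.class) return "readInteger";
        if (fieldType == boolean.class) return "readBoolean";
        if (fieldType == String.class) return "readString";
        throw new IllegalArgumentException("Unsupported field type: " + fieldType);
    }

    // Returns setter name -> parser method name for every declared field.
    static Map<String, String> plan(Class<?> clazz) {
        Map<String, String> calls = new LinkedHashMap<>();
        for (Field field : clazz.getDeclaredFields()) {
            if (field.isSynthetic()) {
                continue; // skip compiler- or tool-generated fields
            }
            calls.put(setterName(field.getName()), parserMethod(field.getType()));
        }
        return calls;
    }

    public static void main(String[] args) {
        plan(Person.class).forEach((setter, reader) ->
                System.out.println(setter + " <- Parser." + reader));
    }
}
```

One caveat: the Java specification does not guarantee that getDeclaredFields returns fields in declaration order, even though common JVMs do so in practice. Since the unmarshaller reads values positionally, that ordering matters.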
Then all we need is some class loading magic:
public static Class<?> loadClass(String className, byte[] ba, ClassLoader parent) throws ClassNotFoundException {
    var loader = new ClassLoader(parent) {
        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            // Only the generated class is defined here; everything else comes from the parent
            if (className.equals(name)) {
                return defineClass(name, ba, 0, ba.length);
            }
            throw new ClassNotFoundException(name);
        }
    };
    return loader.loadClass(className);
}
And we managed to parse a byte array into an Employee object using custom generated bytecode.
Running the JMH benchmark on all the different methods on my machine gives the following results:
Benchmark Mode Cnt Score Error Units
d.s.g.codetime.CodeTimeBenchmark.benchmark thrpt 25 6158,231 ± 40,101 ops/ms
d.s.g.compiletime.CompileTimeBenchmark.benchmark thrpt 25 6190,825 ± 34,077 ops/ms
d.s.g.deploytime.DeployTimeBenchmark.benchmark thrpt 25 6115,819 ± 12,131 ops/ms
d.s.g.runtime.RunTimeBenchmark.basic thrpt 25 1182,603 ± 13,976 ops/ms
d.s.g.runtime.RunTimeBenchmark.deploy thrpt 25 4336,421 ± 35,638 ops/ms
d.s.g.runtime.RunTimeBenchmark.handle thrpt 25 4946,743 ± 31,497 ops/ms
This shows that bytecode generation (DeployTimeBenchmark) is as fast as the hand-coded and annotation-processor-generated code. We set out to automatically create code, without using a compiler, that is as fast as manually written code, and we managed to do exactly that.
In this series we looked at 4 different ways of creating an unmarshaller, in increasing order of complexity. Which one should you use? Well, to paraphrase Einstein, use the simplest method which suits your needs, but not simpler. For throw-away prototypes I recommend just hand-coding. If throughput is not a concern, go ahead and use reflection. For in-house projects, using annotation processors is usually fine. When writing a library which may also be used by external parties, however, bytecode generation is usually the way to go.
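For readers who want to check the relative numbers, the arithmetic can be sketched in a few lines of Java. The scores are copied from the table above; the class and method names are made up for this example.

```java
// Compares throughput scores (ops/ms) taken from the JMH table.
class ThroughputComparison {

    // How many times slower 'candidate' is than 'baseline'.
    static double slowdownFactor(double baseline, double candidate) {
        return baseline / candidate;
    }

    // Percentage of throughput lost relative to 'baseline'.
    static double percentSlower(double baseline, double candidate) {
        return (1.0 - candidate / baseline) * 100.0;
    }

    public static void main(String[] args) {
        double handCoded = 6158.231;  // CodeTimeBenchmark.benchmark
        double basic = 1182.603;      // RunTimeBenchmark.basic
        double handle = 4946.743;     // RunTimeBenchmark.handle
        double bytecode = 6115.819;   // DeployTimeBenchmark.benchmark

        System.out.printf("basic reflection: %.1f times slower%n", slowdownFactor(handCoded, basic));
        System.out.printf("fastest reflection variant: %.1f%% slower%n", percentSlower(handCoded, handle));
        System.out.printf("generated bytecode: %.1f%% slower%n", percentSlower(handCoded, bytecode));
    }
}
```

The results line up with the rough figures quoted in the introduction: a bit over 5 times slower for plain reflection, around 20% slower for the fastest reflection variant, and under 1% difference for generated bytecode.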
[1] I am using an abstract class instead of reusing the generic Unmarshaller interface to sidestep bridge methods, since these add complexity but don’t contribute to the concepts I am trying to convey here.
[2] Most IDEs allow for defining external tools which can then be run easily. In IntelliJ IDEA, for example, go to Settings -> Tools -> External Tools, add a new tool, name it javap, enter ‘javap’ as the program (without the quotes) and set the arguments to ‘-v $FilePath$’ (again without the quotes). Now save, find a class file, right-click and choose External Tools -> javap.
[3] I left out the constant pool for clarity. Depending on which version of the Java compiler you use and whether debugging information is turned on, your output might look slightly different.
[4] The Java compiler will probably never generate this bytecode, because it will look at the expression, conclude that it always evaluates to 7, and just replace it with a hardcoded ‘7’.
[5] Go ahead and try adding a this argument; you can add it to any non-static method! It is known as the receiver parameter and was added in Java 8 to allow placing annotations on it.