Just-in-time compilation in Java

on waitingforcode.com

Like several other programming languages, Java is commonly called a "compiled language". It can therefore be confusing to hear that Java is also JIT-compiled.

This article explains the concept of JIT compilation. The first part looks at compilation in general. The second part describes JIT compilation. The last part covers the Java-specific aspects of just-in-time compilation.

Compilation types

Before talking about compilation types, we need to understand what compilation is. It is the process of translating a programming language into a language understandable by the machine (also called machine code). Machine code is composed of the instructions executed by the CPU. It is written with the famous 0s and 1s, as in this snippet found on a Wikibooks page:

0001 00000111
0100 00001001
0000 00011110

Just-in-time compilation

If you remember, Java's javac command doesn't generate machine code but something called bytecode. Java is not the only language doing this. Other examples are ActionScript (whose bytecode is executed by the ActionScript Virtual Machine) and C# (compiled to CIL, which is executed on the Common Language Runtime).

It's in this execution step, mentioned in the parentheses above, that just-in-time compilation comes in. This special type of compilation happens inside the virtual machines that run the given bytecode, such as the ActionScript Virtual Machine or the well-known Java Virtual Machine (JVM). They compile the bytecode to machine code just in time, at runtime.

This type of compilation brings some benefits. The first significant advantage is that the compiled code can be optimized for the machine it actually runs on. A static compiler generates machine code once, on the build machine, and can only optimize it for the hardware it knows at that moment. With JIT compilation, the static compiler only produces intermediate code. The JIT compiler then converts it to machine code optimized for the execution machine.

The second advantage is portability. Code translated to bytecode can be moved to any computer that has the virtual machine installed.

Just-in-time compilation in Java

So, Java bytecode is compiled just in time to machine code. To inspect this compilation, we can enable several JVM parameters:

  • -XX:+PrintCompilation

    This parameter prints a line for every method compiled by the JIT. A sample output:

    71    1             java.lang.String::indexOf (70 bytes)
    73    2             sun.nio.cs.UTF_8$Encoder::encode (361 bytes)
    87    3             java.lang.String::hashCode (55 bytes)

    The output is organized in columns. The first column (for example 71) is a timestamp, in milliseconds since the JVM started. The second one is the unique compilation task ID (1, 2, 3...). After that we can see the compiled method, followed in parentheses by the size of its bytecode. We can see that indexOf takes 70 bytes of bytecode, encode 361 bytes and so on.

  • -XX:+UnlockDiagnosticVMOptions

    This flag unlocks additional JVM diagnostic options. It is needed to use -XX:+PrintInlining, described next.

  • -XX:+PrintInlining

    This option shows the compiler's inlining decisions. Inlining is one of the techniques the compiler uses to speed up compiled code. Imagine that you have the following method:

    public void testMethod() {
        callAnotherMethod();
    }

    With inlining, callAnotherMethod(); is replaced by the body of callAnotherMethod. Thanks to that, at runtime the machine doesn't jump from one method to another and is able to execute the code "inline". The JIT does this to avoid the overhead of a method call, such as putting parameters on the stack. When we run the code with this parameter enabled, we can see a result similar to this:

    75    1             java.lang.String::indexOf (70 bytes)
    77    2             sun.nio.cs.UTF_8$Encoder::encode (361 bytes)
                        @ 66   java.lang.String::indexOfSupplementary (71 bytes)   too big
                        @ 14   java.lang.Math::min (11 bytes)   (intrinsic)
                        @ 139   java.lang.Character::isSurrogate (18 bytes)   never executed
    89    3             java.lang.String::hashCode (55 bytes)
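To try these flags, we need a program with a method called often enough to be compiled. The sketch below is only an illustration: the HotLoop class and its methods are made up for this example, and whether and when the JVM compiles or inlines square depends on the JVM version and its internal thresholds.

```java
public class HotLoop {

    // Small method, called often enough to become a candidate
    // for JIT compilation and inlining.
    static long square(long n) {
        return n * n;
    }

    // Sums the squares of 0..limit-1 by calling square() in a loop.
    static long sumOfSquares(int limit) {
        long sum = 0;
        for (int i = 0; i < limit; i++) {
            sum += square(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        // Many iterations, so the JVM has a reason to compile the methods.
        long result = sumOfSquares(1_000_000);
        System.out.println("sum = " + result);
    }
}
```

After compiling it with javac HotLoop.java, running java -XX:+PrintCompilation HotLoop should, on a typical HotSpot JVM, list the HotLoop methods among the compiled ones. Adding -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining should also print the inlining decision taken for square. Recent JVMs print more columns than the samples above.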

Let's go back to some theory. JIT compilation in Java can be:
- lazy: only the methods that are really used (invoked at runtime) are compiled to machine code.
- adaptive: the code first runs in a quick, unoptimized form (in HotSpot it is interpreted or compiled with few optimizations). Only the very frequently used methods are optimized further.
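One way to observe that compilation really happens at runtime is the standard java.lang.management API, which exposes the time the JVM has spent on JIT compilation. The sketch below is a minimal illustration: the JitTimeProbe class and its work method are made up for this example, and the reported numbers vary from run to run and between JVMs.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

public class JitTimeProbe {

    // Returns the accumulated JIT compilation time in milliseconds,
    // or -1 when the JVM has no compiler or does not report this value.
    static long compilationTimeMillis() {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null || !jit.isCompilationTimeMonitoringSupported()) {
            return -1;
        }
        return jit.getTotalCompilationTime();
    }

    // Some repetitive work, so that this method becomes frequently used.
    static long work(int iterations) {
        long acc = 0;
        for (int i = 0; i < iterations; i++) {
            acc += i % 7;
        }
        return acc;
    }

    public static void main(String[] args) {
        long before = compilationTimeMillis();
        long result = 0;
        for (int round = 0; round < 200; round++) {
            result += work(100_000);
        }
        long after = compilationTimeMillis();
        System.out.println("result = " + result);
        System.out.println("JIT time before: " + before + " ms, after: " + after + " ms");
    }
}
```

If the second value is greater than the first, the JVM spent some time compiling while the program was running, which is the lazy, on-demand behavior described above.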

Translated bytecode is stored in the code cache, a structure holding all the compiled methods. When a given method is called again, it's not translated from scratch but loaded from the code cache. However, the compiler can replace a cached method when it decides that the method can be optimized better. Among the optimization techniques we can distinguish:
- inlining: described in the previous points, it avoids jumping between methods.
- dead code elimination: when some code or variables are present in the bytecode but never used, the compiler can decide to leave them out of the machine code.
- loop optimization: the compiler can reorganize the execution of a loop or merge several loops into fewer ones to optimize the code executed by the CPU.
- replacing interface methods with real methods: when an interface method is implemented by only one class, the compiler can decide to call the implementation directly, to avoid the overhead of resolving the implementing method at runtime.
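The techniques above can be pictured at the source level. The JIT works on bytecode and machine code, not on Java source, so the sketch below is only a hand-written approximation: all names (JitOptimizationsSketch, Shape, Square, totalArea, totalAreaOptimized) are invented for this example, and the "optimized" method shows what the compiler may effectively execute, not what it literally produces.

```java
public class JitOptimizationsSketch {

    interface Shape {
        double area();
    }

    // The only implementation of Shape in this program.
    static final class Square implements Shape {
        private final double side;

        Square(double side) {
            this.side = side;
        }

        @Override
        public double area() {
            return side * side;
        }
    }

    // The code as written by the developer.
    static double totalArea(Shape[] shapes) {
        double total = 0;
        for (Shape s : shapes) {
            double unused = s.area() * 2; // dead code: the value is never read
            total += s.area();            // call through the interface
        }
        return total;
    }

    // Hand-written approximation of the result after the optimizations:
    // the dead computation is gone, the interface call is bound to
    // Square.area() and its body is inlined into the loop.
    static double totalAreaOptimized(Square[] squares) {
        double total = 0;
        for (Square s : squares) {
            total += s.side * s.side;
        }
        return total;
    }

    public static void main(String[] args) {
        Square[] squares = { new Square(2), new Square(3) };
        System.out.println(totalArea(squares));
        System.out.println(totalAreaOptimized(squares));
    }
}
```

Both methods return the same value for the same input. The point of the optimizations is that the second form does less work per iteration.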

In this article we discovered just-in-time compilation, which translates language-specific intermediate code (such as bytecode for Java) into the language understandable by the CPU (machine code). Java does it thanks to a compiler integrated in the JVM. This compiler does more than a simple translation: it also optimizes the compiled code. Thanks to these optimizations, the machine code is adapted as much as possible to the target machine.
