Reading Java methods in bytecode

Last time we discovered the very basic elements of Java's bytecode, such as how methods and fields are represented. This time we will treat a more advanced subject: methods in general.

This article starts with a real-life example in which all the necessary information is given as comments in the bytecode output. After that, we'll write a summary list of all the bytecode instructions we encountered.

Example of bytecode methods

Before we start talking about bytecode, let's take a look at our sample class:

/**
 * Bytecode sample class which helps to illustrate the following concepts in bytecode:
 * - assigning a method parameter to an instance field
 * - looping with foreach
 * - representing arrays
 * - invoking static methods
 * - constant pool
 * - representation of overridden methods
 * - invoking external methods
 *
 */
public class BytecodeMethodSample {

  private String userName;

  public void externalInvoker(String[] names) {
    for (String name : names) {
      this.invokedNamer(name);
    }
  }

  private void invokedNamer(String name) {
    System.out.println("Name is: "+name);
    this.userName = name;
    this.userName = this.userName.substring(1);
  }

  @Override
  public String toString() {
    return "Overriden String value";
  }

}

Now, take a look at the output generated with javap -c -p BytecodeMethodSample.class:

public class com.waitingforcode.com.waitingforcode.bytecode.BytecodeMethodSample {
  private java.lang.String userName;

  ## This is the default (no-argument) constructor of the class. The most significant line is
  ## the 2nd one (1:), where we can observe the call to the parent constructor. Because our
  ## class extends only java.lang.Object, it's the constructor of that class which is invoked.

  public com.waitingforcode.com.waitingforcode.bytecode.BytecodeMethodSample();
    Code:
       0: aload_0       
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: return        

  ## This method takes an array of String as its parameter. The first aload_1 and astore_2
  ## instructions copy the array reference from the parameter (local variable 1) into local
  ## variable 2. The next instruction, arraylength, is specific to arrays: it pushes the length
  ## of the array onto the stack. This value is then stored as an integer in local variable 3
  ## (istore_3).
  ## 
  ## iconst_0 pushes the integer constant 0 onto the stack. It's used as the index counter
  ## of the foreach loop, so it's stored in local variable 4 (istore 4) and loaded again
  ## (iload 4), followed by the array length (iload_3). This brings us to if_icmpge, a
  ## conditional branch which decides whether the next iteration is executed. It compares
  ## the index with the array length. If the index is greater than or equal to the length,
  ## execution jumps to the offset given after if_icmpge (32, the return instruction).
  ## Otherwise, execution continues with the loop body at offset 14.
  ## 
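  ## The next instructions (aload_2, iload 4, aaload, astore 5) represent the first line of
  ## the foreach loop: the currently iterated element is read from the array and stored in
  ## the local variable called name. aload_0 and aload 5 then push this and name onto the
  ## stack for the method call.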
  ## 
  ## At offset 23 we can see an instruction called invokespecial. In this output it's used
  ## because invokedNamer is a private method. The instruction is followed by the index of
  ## the called method in the constant pool (2 in our case). We can see that entry by
  ## generating the output with the -v (verbose) parameter. In this case, the following entry
  ## will appear on the screen:
  ## #2 = Methodref          #13.#42        
  ## //  com/waitingforcode/com/waitingforcode/bytecode/BytecodeMethodSample.invokedNamer:
  ## (Ljava/lang/String;)V
  ## 
  ## The iinc instruction increments a local variable (the first operand, 4 in our case) by
  ## the value specified after the comma (1 in our case).
  ## 
  ## The last instruction executed before return is goto. As its name indicates, it redirects
  ## execution to another part of the bytecode, identified by an instruction offset. In our
  ## case, it jumps back to offset 8 (iload 4) and continues the iteration.

  public void externalInvoker(java.lang.String[]);
    Code:
       0: aload_1       
       1: astore_2      
       2: aload_2       
       3: arraylength   
       4: istore_3      
       5: iconst_0      
       6: istore        4
       8: iload         4
      10: iload_3       
      11: if_icmpge     32
      14: aload_2       
      15: iload         4
      17: aaload        
      18: astore        5
      20: aload_0       
      21: aload         5
      23: invokespecial #2                  // Method invokedNamer:(Ljava/lang/String;)V
      26: iinc          4, 1
      29: goto          8
      32: return        

  ## Here we can see our private invokedNamer method. The first statement concerns println,
  ## a method called on the static field System.out. Instructions 0 to 22 show what happens
  ## when we print a concatenated String. The getstatic instruction gets a static field of a
  ## given class, in our case System.out (a PrintStream). The new instruction at offset 3
  ## creates the StringBuilder used for the concatenation, dup duplicates its reference and
  ## invokespecial calls its constructor. After that, the ldc instruction pushes the
  ## hard-coded String "Name is: " from the constant pool onto the stack.
  ## invokevirtual is another invocation instruction. It calls instance methods and selects
  ## the implementation from the runtime class of the object. invokespecial is intended for
  ## instance-initialization methods and for superclass methods called through super. In this
  ## output, it's also used for the private invokedNamer method. The #${number} after an
  ## invocation instruction refers to an entry in the constant pool. This holds for the
  ## invoke* instructions as well as for getstatic.
  ## From offset 25 on, you can observe the loading of this (aload_0) and of the first method
  ## parameter (aload_1), which putfield assigns to the instance field called userName.
  ## At offsets 30 and 31, this is loaded twice: once for the final putfield and once for
  ## getfield. After that we read the field described by constant pool entry #10
  ## (32: getfield #10), which corresponds to our userName field. iconst_1 pushes the integer
  ## constant 1 onto the stack. After calling the String.substring method, we overwrite the
  ## value of the userName field with, once again, a "putfield #10" instruction.

  private void invokedNamer(java.lang.String);
    Code:
       0: getstatic     #3                  // Field java/lang/System.out:Ljava/io/PrintStream;
       3: new           #4                  // class java/lang/StringBuilder
       6: dup           
       7: invokespecial #5                  // Method java/lang/StringBuilder."<init>":()V
      10: ldc           #6                  // String Name is: 
      12: invokevirtual #7                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      15: aload_1       
      16: invokevirtual #7                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      19: invokevirtual #8                  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
      22: invokevirtual #9                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      25: aload_0       
      26: aload_1       
      27: putfield      #10                 // Field userName:Ljava/lang/String;
      30: aload_0       
      31: aload_0       
      32: getfield      #10                 // Field userName:Ljava/lang/String;
      35: iconst_1      
      36: invokevirtual #11                 // Method java/lang/String.substring:(I)Ljava/lang/String;
      39: putfield      #10                 // Field userName:Ljava/lang/String;
      42: return        

  ## As you can see, overridden methods look the same as normal methods. There is no
  ## trace of the @Override annotation. At the beginning, the ldc instruction is
  ## executed. It's in charge of pushing a single-word constant from the constant pool
  ## onto the stack (the String "Overriden String value" in our case). After that,
  ## the areturn instruction returns the previously pushed String reference.

  public java.lang.String toString();
    Code:
       0: ldc           #12                 // String Overriden String value
       2: areturn       
}
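
To connect the listing back to source code, here is a sketch of how externalInvoker and invokedNamer could be written by hand so that each statement lines up with the instructions above. The class name DesugaredSample, the local variable names (array, length, index) and the getUserName accessor are assumptions made for illustration. javac does not produce source like this, and newer compilers may translate the String concatenation differently.

```java
class DesugaredSample {

    private String userName;

    // Hand-written equivalent of the foreach loop in externalInvoker.
    public void externalInvoker(String[] names) {
        String[] array = names;          //  0: aload_1,  1: astore_2
        int length = array.length;       //  2: aload_2,  3: arraylength,  4: istore_3
        int index = 0;                   //  5: iconst_0, 6: istore 4
        while (index < length) {         //  8: iload 4, 10: iload_3, 11: if_icmpge 32
            String name = array[index];  // 14: aload_2, 15: iload 4, 17: aaload, 18: astore 5
            this.invokedNamer(name);     // 20: aload_0, 21: aload 5, 23: invokespecial #2
            index++;                     // 26: iinc 4, 1
        }                                // 29: goto 8
    }                                    // 32: return

    private void invokedNamer(String name) {
        // Offsets 0-22: the concatenation spelled out with an explicit StringBuilder.
        System.out.println(new StringBuilder().append("Name is: ").append(name).toString());
        this.userName = name;                        // offsets 25-27
        this.userName = this.userName.substring(1);  // offsets 30-39
    }

    // Accessor added only so the result can be inspected (not part of the sample class).
    public String getUserName() {
        return userName;
    }

    public static void main(String[] args) {
        DesugaredSample sample = new DesugaredSample();
        sample.externalInvoker(new String[] {"Alice", "Bob"});
        System.out.println(sample.getUserName());
    }
}
```

Running main prints one "Name is: ..." line per array element and then the last name without its first character, since userName is overwritten on every iteration.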

Bytecode method instructions

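The list below shows some of the bytecode instructions seen in this article:

- aload, aload_0, aload_1, aload_2: load an object reference from a local variable onto the stack
- astore, astore_2: store an object reference from the stack into a local variable
- iload, iload_3: load an integer from a local variable onto the stack
- istore, istore_3: store an integer from the stack into a local variable
- iconst_0, iconst_1: push the integer constant 0 or 1 onto the stack
- arraylength: pushes the length of an array onto the stack
- aaload: loads an object reference from an array at the given index
- if_icmpge: jumps to the given offset if the first integer is greater than or equal to the second
- iinc: increments a local variable by a constant
- goto: jumps to the given offset
- new: creates a new object
- dup: duplicates the value on top of the stack
- ldc: pushes a constant from the constant pool onto the stack
- getstatic: gets the value of a static field
- getfield, putfield: get or set the value of an instance field
- invokespecial, invokevirtual: invoke a method
- return, areturn: return from a method, without a value or with an object reference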

As you can see in the list, some instructions are followed by an underscore (_) and a number. For load and store instructions, this number is the index of the local variable. For example, istore_0 stores an integer value in the local variable at index 0. For iconst, the number is the constant value pushed onto the stack. When an instruction can be written with or without this suffix, and it's written without it, the index is given as an operand after the instruction name (for example, istore 4).
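
The relation between the _${number} suffix and local variable indexes can be sketched with a small class. The class LocalSlots and its methods are hypothetical, and the instructions in the comments are what javac typically emits for such code, not output taken from this article.

```java
class LocalSlots {

    // Instance method: index 0 holds `this`, so the parameter sits at index 1.
    int addOne(int value) {
        int result = value + 1;   // iload_1, iconst_1, iadd, istore_2
        return result;            // iload_2, ireturn
    }

    // Static method: there is no `this`, so the parameters start at index 0.
    static int sum(int a, int b, int c, int d, int e) {
        // a to d can use the short forms iload_0 to iload_3;
        // e sits at index 4 and needs the operand form "iload 4".
        int total = a + b + c + d + e;   // the result is stored with "istore 5"
        return total;                    // "iload 5", ireturn
    }

    public static void main(String[] args) {
        System.out.println(new LocalSlots().addOne(41));   // prints 42
        System.out.println(sum(1, 2, 3, 4, 5));            // prints 15
    }
}
```

Compiling this class and running javap -c -p on it should show the short forms for indexes 0 to 3 and the operand form from index 4 on.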

Unlike the first article, this one was more concrete. We saw different ways of invoking methods. We also discovered how bytecode represents code that uses arrays or static fields.
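
As a recap of the two invocation instructions met in this article, the sketch below shows where each one typically appears. The names InvocationKinds, Base, Derived and demo are hypothetical, and the comments describe what javac usually generates rather than output shown above.

```java
class InvocationKinds {

    static class Base {
        String describe() {
            return "base";
        }
    }

    static class Derived extends Base {
        @Override
        String describe() {
            // Call through super: the target is fixed at compile time, hence invokespecial.
            return "derived+" + super.describe();
        }
    }

    static String demo() {
        // Object creation: new, dup, then invokespecial on the "<init>" method.
        Base value = new Derived();
        // invokevirtual: the implementation is chosen from the runtime class (Derived),
        // not from the declared type of the variable (Base).
        return value.describe();
    }

    public static void main(String[] args) {
        System.out.println(demo());   // prints derived+base
    }
}
```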

