Introduction to Java's bytecode reading on waitingforcode.com

When you're learning Java concurrency, you often hear about barriers. These barriers are expressed explicitly in bytecode, an intermediate form between the code written by a human and the code interpreted by the machine. In this article we'll begin to learn how to read bytecode, to better understand what happens to it afterwards, when the machine uses it.

In the first part of this article we'll discover how to transform a compiled .class file into an output composed of bytecode instructions. In the second part we'll use this technique to learn some basic bytecode instructions.

Transform .class file into bytecode output in Java

The tool commonly used to read the bytecode of a class is called the Java Class File Disassembler and can be run with the javap command. Let's start by seeing which options can be specified for it:

javap --help
Usage: javap <options> <classes>
where possible options include:
  -help  --help  -?        Print this usage message
  -version                 Version information
  -v  -verbose             Print additional information
  -l                       Print line number and local variable tables
  -public                  Show only public classes and members
  -protected               Show protected/public classes and members
  -package                 Show package/protected/public classes
                           and members (default)
  -p  -private             Show all classes and members
  -c                       Disassemble the code
  -s                       Print internal type signatures
  -sysinfo                 Show system info (path, size, date, MD5 hash)
                           of class being processed
  -constants               Show static final constants
  -classpath <path>        Specify where to find user class files
  -bootclasspath <path>    Override location of bootstrap class files

As you can see in the list, the most important option for us is -c. This option "disassembles" the code, i.e. prints the bytecode instructions stored in the .class file in a human-readable form. If we use a plain javap command without -c, we'll receive only the list of members declared in the given .class file:

Compiled from "BytecodeSample.java"
public class com.waitingforcode.BytecodeSample {
  public com.waitingforcode.BytecodeSample();
  public void doNothing();
  public java.lang.String doNothingWithString(java.lang.String);
}
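
The same listing can also be produced from Java code instead of the command line. The sketch below is only an illustration and rests on one assumption that goes beyond this article: it uses java.util.spi.ToolProvider, which exists since Java 9 (the outputs shown here come from Java 7) and requires a full JDK at runtime. The class name JavapRunner and the choice of java.lang.Object as the inspected class are hypothetical.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.spi.ToolProvider;

// Hypothetical helper: runs javap in-process and returns what it printed.
class JavapRunner {

    /** Runs javap with the given arguments and returns its standard output. */
    static String run(String... args) {
        // ToolProvider (Java 9+) exposes JDK tools such as javap to running code.
        ToolProvider javap = ToolProvider.findFirst("javap")
                .orElseThrow(() -> new IllegalStateException("javap not found, a full JDK is required"));
        StringWriter out = new StringWriter();
        StringWriter err = new StringWriter();
        PrintWriter outWriter = new PrintWriter(out);
        PrintWriter errWriter = new PrintWriter(err);
        int exitCode = javap.run(outWriter, errWriter, args);
        outWriter.flush();
        errWriter.flush();
        if (exitCode != 0) {
            throw new IllegalStateException("javap failed: " + err);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Member list only, equivalent to "javap java.lang.Object".
        System.out.println(run("java.lang.Object"));
        // Same class with disassembled code, equivalent to "javap -c java.lang.Object".
        System.out.println(run("-c", "java.lang.Object"));
    }
}
```

Passing the options as varargs keeps the helper close to the command line: every option from the help listing above can be forwarded unchanged.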

With the -c option, javap produces a more verbose output:

Compiled from "BytecodeSample.java"
public class com.waitingforcode.BytecodeSample {
  public com.waitingforcode.BytecodeSample();
    Code:
       0: aload_0       
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: return        

  public void doNothing();
    Code:
       0: return        

  public java.lang.String doNothingWithString(java.lang.String);
    Code:
       0: aload_1       
       1: areturn       
}

Note at this stage that, by default, javap prints only public, protected and package-private members (fields, methods). Private members are hidden unless the -p option is used.
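
The visibility options from the help listing can be summarized with a small model. This is a sketch written from the option descriptions only, not javap's actual implementation; the class JavapVisibility and its Level enum are hypothetical names.

```java
import java.lang.reflect.Modifier;

// Hypothetical model of javap's -public / -protected / -package / -private options.
class JavapVisibility {

    /** The four visibility options, from the most to the least restrictive. */
    enum Level { PUBLIC, PROTECTED, PACKAGE, PRIVATE }

    /** Returns true if a member with the given modifiers would be listed at the given level. */
    static boolean isShown(int modifiers, Level level) {
        switch (level) {
            case PUBLIC:
                return Modifier.isPublic(modifiers);
            case PROTECTED:
                return Modifier.isPublic(modifiers) || Modifier.isProtected(modifiers);
            case PACKAGE:
                // Default level: everything except private members.
                return !Modifier.isPrivate(modifiers);
            default:
                // -p / -private: all members.
                return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(isShown(Modifier.PRIVATE, Level.PACKAGE)); // false
        System.out.println(isShown(Modifier.PRIVATE, Level.PRIVATE)); // true
        System.out.println(isShown(0, Level.PACKAGE));                // true (package-private member)
    }
}
```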

The other options, apart from the classpath ones (-classpath and -bootclasspath), can be used to customize the verbosity of the printed bytecode. Let's make javap as verbose as possible by invoking javap -c -sysinfo -p -version -v -l BytecodeSample.class. The output is really generous:

1.7.0_60
Classfile /home/bartosz/doc/code/webapp/target/test-classes/com/waitingforcode/BytecodeSample.class
  Last modified 31 juil. 2014; size 890 bytes
  MD5 checksum 6a286ead73d388bdfb769c376baa6214
  Compiled from "BytecodeSample.java"
public class com.waitingforcode.BytecodeSample
  SourceFile: "BytecodeSample.java"
  minor version: 0
  major version: 51
  flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
   #1 = Methodref          #5.#31         //  java/lang/Object."<init>":()V
   #2 = Fieldref           #4.#32         //  com/waitingforcode/BytecodeSample.age:I
   #3 = Fieldref           #4.#33         //  com/waitingforcode/BytecodeSample.normalAge:I
   #4 = Class              #34            //  com/waitingforcode/BytecodeSample
   #5 = Class              #35            //  java/lang/Object
   #6 = Utf8               NAME
   #7 = Utf8               Ljava/lang/String;
   #8 = Utf8               ConstantValue
   #9 = String             #36            //  BytecodeSample-CLASS
  #10 = Utf8               LOCAL_NAME
  #11 = String             #37            //  bcs-CLASS
  #12 = Utf8               age
  #13 = Utf8               I
  #14 = Utf8               normalAge
  #15 = Utf8               <init>
  #16 = Utf8               ()V
  #17 = Utf8               Code
  #18 = Utf8               LineNumberTable
  #19 = Utf8               LocalVariableTable
  #20 = Utf8               this
  #21 = Utf8               Lcom/waitingforcode/BytecodeSample;
  #22 = Utf8               doNothing
  #23 = Utf8               doNothingWithString
  #24 = Utf8               (Ljava/lang/String;)Ljava/lang/String;
  #25 = Utf8               text
  #26 = Utf8               samplePrivateMethod
  #27 = Utf8               setAge
  #28 = Utf8               (I)V
  #29 = Utf8               SourceFile
  #30 = Utf8               BytecodeSample.java
  #31 = NameAndType        #15:#16        //  "<init>":()V
  #32 = NameAndType        #12:#13        //  age:I
  #33 = NameAndType        #14:#13        //  normalAge:I
  #34 = Utf8               com/waitingforcode/BytecodeSample
  #35 = Utf8               java/lang/Object
  #36 = Utf8               BytecodeSample-CLASS
  #37 = Utf8               bcs-CLASS
{
  public static final java.lang.String NAME;
    flags: ACC_PUBLIC, ACC_STATIC, ACC_FINAL
    ConstantValue: String BytecodeSample-CLASS


  private static final java.lang.String LOCAL_NAME;
    flags: ACC_PRIVATE, ACC_STATIC, ACC_FINAL
    ConstantValue: String bcs-CLASS


  protected int age;
    flags: ACC_PROTECTED


  public int normalAge;
    flags: ACC_PUBLIC


  public com.waitingforcode.BytecodeSample();
    flags: ACC_PUBLIC
    LineNumberTable:
      line 8: 0
      line 12: 4
      line 13: 9
    LocalVariableTable:
      Start  Length  Slot  Name   Signature
             0      16     0  this   Lcom/waitingforcode/BytecodeSample;
    Code:
      stack=2, locals=1, args_size=1
         0: aload_0       
         1: invokespecial #1                  // Method java/lang/Object."<init>":()V
         4: aload_0       
         5: iconst_0      
         6: putfield      #2                  // Field age:I
         9: aload_0       
        10: bipush        30
        12: putfield      #3                  // Field normalAge:I
        15: return        
      LineNumberTable:
        line 8: 0
        line 12: 4
        line 13: 9
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
               0      16     0  this   Lcom/waitingforcode/BytecodeSample;

  public void doNothing();
    flags: ACC_PUBLIC
    LineNumberTable:
      line 17: 0
    LocalVariableTable:
      Start  Length  Slot  Name   Signature
             0       1     0  this   Lcom/waitingforcode/BytecodeSample;
    Code:
      stack=0, locals=1, args_size=1
         0: return        
      LineNumberTable:
        line 17: 0
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
               0       1     0  this   Lcom/waitingforcode/BytecodeSample;

  public java.lang.String doNothingWithString(java.lang.String);
    flags: ACC_PUBLIC
    LineNumberTable:
      line 20: 0
    LocalVariableTable:
      Start  Length  Slot  Name   Signature
             0       2     0  this   Lcom/waitingforcode/BytecodeSample;
             0       2     1  text   Ljava/lang/String;
    Code:
      stack=1, locals=2, args_size=2
         0: aload_1       
         1: areturn       
      LineNumberTable:
        line 20: 0
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
               0       2     0  this   Lcom/waitingforcode/BytecodeSample;
               0       2     1  text   Ljava/lang/String;

  private void samplePrivateMethod();
    flags: ACC_PRIVATE
    LineNumberTable:
      line 25: 0
    LocalVariableTable:
      Start  Length  Slot  Name   Signature
             0       1     0  this   Lcom/waitingforcode/BytecodeSample;
    Code:
      stack=0, locals=1, args_size=1
         0: return        
      LineNumberTable:
        line 25: 0
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
               0       1     0  this   Lcom/waitingforcode/BytecodeSample;

  protected void setAge(int);
    flags: ACC_PROTECTED
    LineNumberTable:
      line 28: 0
      line 29: 5
    LocalVariableTable:
      Start  Length  Slot  Name   Signature
             0       6     0  this   Lcom/waitingforcode/BytecodeSample;
             0       6     1   age   I
    Code:
      stack=2, locals=2, args_size=2
         0: aload_0       
         1: iload_1       
         2: putfield      #2                  // Field age:I
         5: return        
      LineNumberTable:
        line 28: 0
        line 29: 5
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
               0       6     0  this   Lcom/waitingforcode/BytecodeSample;
               0       6     1   age   I
}
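
The "minor version: 0" and "major version: 51" lines of the verbose output come from the first bytes of the .class file: a 4-byte magic number (0xCAFEBABE) followed by two unsigned 16-bit values. The sketch below reads them from a byte array. The class name ClassFileHeader is hypothetical, and the header in main is built by hand rather than read from the BytecodeSample.class file used in this article.

```java
import java.nio.ByteBuffer;

// Hypothetical helper that reads the first 8 bytes of a class file.
class ClassFileHeader {
    final int minor;
    final int major;

    ClassFileHeader(int minor, int major) {
        this.minor = minor;
        this.major = major;
    }

    /** Parses the magic number and the minor and major versions from raw class file bytes. */
    static ClassFileHeader read(byte[] classBytes) {
        if (classBytes.length < 8) {
            throw new IllegalArgumentException("Too short to be a class file");
        }
        ByteBuffer buffer = ByteBuffer.wrap(classBytes); // big-endian by default, like class files
        if (buffer.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("Missing 0xCAFEBABE magic number");
        }
        int minor = buffer.getShort() & 0xFFFF; // unsigned 16-bit value
        int major = buffer.getShort() & 0xFFFF;
        return new ClassFileHeader(minor, major);
    }

    /** Maps a major version to the Java release number, e.g. 51 to 7 and 52 to 8. */
    static int javaRelease(int major) {
        return major - 44;
    }

    public static void main(String[] args) {
        // Hand-built header matching the verbose output: minor 0, major 51.
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 51};
        ClassFileHeader parsed = read(header);
        System.out.println("major=" + parsed.major + " -> Java " + javaRelease(parsed.major));
        // prints "major=51 -> Java 7"
    }
}
```

This matches the 1.7.0_60 version printed at the top of the verbose output.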

Bytecode basics

We'll start by analyzing a sample output made of methods (with and without parameters, with and without a return value, at all 4 visibility levels), fields (private, public, protected and package-private), constants and enums. We'll also explore two different kinds of constructors: without and with parameters. We'll reduce the output to plain bytecode instructions by calling javap -c -p BytecodeSample.class. In this exercise, we'll try to reconstruct the .java file directly from the bytecode output, which looks like this:

Compiled from "BytecodeSample.java"
public class com.waitingforcode.BytecodeSample {

  ## As you can notice, all fields are grouped together. Even if they appear in the
  ## middle of the body in the .java file, they're printed at the beginning of the disassembled class.
  ## Note also that the values of the fields aren't shown here. We'll come back to this later.
  ## You can also observe that there are no "magic" replacements for
  ## visibility modifiers and that primitive types are kept.
  
  public static final java.lang.String NAME;

  public int normalAge;

  static final java.lang.String NAME_PP;

  int normalAgePP;

  protected static final java.lang.String NAME_PRO;

  protected int normalAgePro;

  private static final java.lang.String NAME_PRI;

  private int normalAgePri;

  ## Below you can find the definitions of both constructors (without and with a parameter in the signature).
  ## You can observe that there is no name for the parameter in the second constructor. There is only the
  ## parameter's type.
  ##
  ## As you can see, some lines carry names similar to the methods of a programming language.
  ## These are the bytecode instructions executed by the JVM. In the constructors we can distinguish
  ## the assignment of field values through the putfield instructions. We know which
  ## field is assigned thanks to the "// Field ${fieldName}" fragment.
  ##
  ## The bipush instruction pushes a byte value, widened to an int, onto the operand stack. As you can see,
  ## all bipush invocations are followed by 30, and 30 is the value assigned to all int fields of the class.
  ## This instruction can be used only for integers from -128 to 127.

  public com.waitingforcode.BytecodeSample();
    Code:
       0: aload_0       
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: aload_0       
       5: bipush        30
       7: putfield      #2                  // Field normalAge:I
      10: aload_0       
      11: bipush        30
      13: putfield      #3                  // Field normalAgePP:I
      16: aload_0       
      17: bipush        30
      19: putfield      #4                  // Field normalAgePro:I
      22: aload_0       
      23: bipush        30
      25: putfield      #5                  // Field normalAgePri:I
      28: return        

  public com.waitingforcode.BytecodeSample(java.lang.String);
    Code:
       0: aload_0       
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: aload_0       
       5: bipush        30
       7: putfield      #2                  // Field normalAge:I
      10: aload_0       
      11: bipush        30
      13: putfield      #3                  // Field normalAgePP:I
      16: aload_0       
      17: bipush        30
      19: putfield      #4                  // Field normalAgePro:I
      22: aload_0       
      23: bipush        30
      25: putfield      #5                  // Field normalAgePri:I
      28: return        

  public void doNothing();
    Code:
       0: return        

  public java.lang.String doNothingReturn();
    Code:
       0: ldc           #6                  // String text
       2: areturn       

  public java.lang.String returnString();
    Code:
       0: ldc           #7                  // String String
       2: areturn       

  ## If you compare the methods returning void with the methods
  ## returning an object, you can see that both end with some "return"
  ## instruction.
  ##
  ## Unlike in source code, where a single keyword covers both cases, bytecode's "return" instruction
  ## returns void and "areturn" returns an object reference (such as the String in the next method).

  public java.lang.String returnStringWithParam(java.lang.String);
    Code:
       0: aload_1       
       1: areturn       

  void doNothingPP();
    Code:
       0: return        

  java.lang.String doNothingReturnPP();
    Code:
       0: ldc           #6                  // String text
       2: areturn       

  java.lang.String returnStringPP();
    Code:
       0: ldc           #7                  // String String
       2: areturn       

  ## If you compare the bodies of methods with and without parameters,
  ## you can observe a difference at the level of the first executed
  ## instruction. For methods with parameters, aload_${NUMBER} loads an
  ## object reference onto the stack from the local variable stored at
  ## slot ${NUMBER}. In instance methods slot 0 holds "this", so parameters
  ## start at slot 1. With two String parameters, loading them would
  ## use the aload_1 and aload_2 instructions.

  java.lang.String returnStringWithParamPP(java.lang.String);
    Code:
       0: aload_1       
       1: areturn       

  protected void doNothingPro();
    Code:
       0: return        

  ## A very interesting instruction is used in the next method. As you can see, it
  ## doesn't have any parameters in its signature. However, it returns a String. If
  ## we want to know which String is returned, we can look at the first
  ## instruction, ldc, which gets one constant value ("text" in our case) and
  ## pushes it onto the stack. The values are taken from the constant pool and are identified
  ## by the "#${NUMBER}" expression. This expression indicates the index of the retrieved
  ## value in the constant pool.

  protected java.lang.String doNothingReturnPro();
    Code:
       0: ldc           #6                  // String text
       2: areturn       

  protected java.lang.String returnStringPro();
    Code:
       0: ldc           #7                  // String String
       2: areturn       

  protected java.lang.String returnStringWithParamPro(java.lang.String);
    Code:
       0: aload_1       
       1: areturn       

  private void doNothingPri();
    Code:
       0: return        

  private java.lang.String doNothingReturnPri();
    Code:
       0: ldc           #6                  // String text
       2: areturn       

  private java.lang.String returnStringPri();
    Code:
       0: ldc           #7                  // String String
       2: areturn       

  private java.lang.String returnStringWithParamPri(java.lang.String);
    Code:
       0: aload_1       
       1: areturn       
}
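
To make the stack-based reading of these instructions more concrete, here is a toy interpreter for the handful of instructions met in the listing above (aload_N, bipush, ldc, putfield, areturn, return). It is a deliberately simplified sketch and not how the JVM is implemented: the class TinyStackMachine is hypothetical, instructions are plain strings, the constant pool is reduced to a map from index to value, and a field reference such as #2 is resolved directly to a field name. The invokespecial call to the superclass constructor is left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy interpreter for a few instructions from the javap output.
class TinyStackMachine {

    /** Executes the instructions and returns the areturn value, or null for a plain return. */
    static Object execute(String[] code, Object[] locals,
                          Map<Integer, Object> constantPool, Map<String, Object> fields) {
        Deque<Object> stack = new ArrayDeque<>(); // operand stack
        for (String instruction : code) {
            String[] parts = instruction.trim().split("\\s+");
            String opcode = parts[0];
            if (opcode.startsWith("aload_")) {
                // Load a reference from the local variable slot N (slot 0 is "this").
                stack.push(locals[Integer.parseInt(opcode.substring("aload_".length()))]);
            } else if (opcode.equals("bipush")) {
                // Push a small integer constant (-128 to 127).
                stack.push(Integer.parseInt(parts[1]));
            } else if (opcode.equals("ldc")) {
                // Push a constant taken from the constant pool, e.g. "ldc #6".
                stack.push(constantPool.get(Integer.parseInt(parts[1].substring(1))));
            } else if (opcode.equals("putfield")) {
                // Pop the value, then the target object, and store the value in the field.
                Object value = stack.pop();
                stack.pop(); // object reference, ignored in this simplified model
                String fieldName = (String) constantPool.get(Integer.parseInt(parts[1].substring(1)));
                fields.put(fieldName, value);
            } else if (opcode.equals("areturn")) {
                return stack.pop();
            } else if (opcode.equals("return")) {
                return null;
            } else {
                throw new IllegalArgumentException("Unsupported instruction: " + instruction);
            }
        }
        throw new IllegalStateException("Code ended without a return instruction");
    }

    public static void main(String[] args) {
        Map<Integer, Object> pool = new HashMap<>();
        pool.put(2, "normalAge"); // simplified stand-in for "Field normalAge:I"
        pool.put(6, "text");      // "String text"
        Map<String, Object> fields = new HashMap<>();
        Object self = new Object();

        execute(new String[] {"aload_0", "bipush 30", "putfield #2", "return"},
                new Object[] {self}, pool, fields);
        System.out.println(fields); // {normalAge=30}

        System.out.println(execute(new String[] {"ldc #6", "areturn"},
                new Object[] {self}, pool, fields)); // text

        System.out.println(execute(new String[] {"aload_1", "areturn"},
                new Object[] {self, "hello"}, pool, fields)); // hello
    }
}
```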

As the instructions in the javap output show, the printed form is quite explicit. Below you can find the Java class used to generate the bytecode output:

/**
 * Sample class containing basic bytecode elements.
 */
public class BytecodeSample {

  private enum Choices {
    YES, NO;
  }

  public BytecodeSample() {
  }

  public BytecodeSample(String param1) {
  }

  // PUBLIC PART
  public static final String NAME = "BytecodeSample-CLASS";

  public int normalAge = 30;

  public void doNothing() {
  }

  public String doNothingReturn() {
    return "text";
  }

  public String returnString() {
    return "String";
  }

  public String returnStringWithParam(String text) {
    return text;
  }

  // PACKAGE-PRIVATE PART
  static final String NAME_PP = "BytecodeSample-CLASS";

  int normalAgePP = 30;

  void doNothingPP() {
  }

  String doNothingReturnPP() {
    return "text";
  }

  String returnStringPP() {
    return "String";
  }

  String returnStringWithParamPP(String text) {
    return text;
  }

  // PROTECTED PART
  protected static final String NAME_PRO = "BytecodeSample-CLASS";

  protected int normalAgePro = 30;

  protected void doNothingPro() {
  }

  protected String doNothingReturnPro() {
    return "text";
  }

  protected String returnStringPro() {
    return "String";
  }

  protected String returnStringWithParamPro(String text) {
    return text;
  }

  // PRIVATE PART
  private static final String NAME_PRI = "BytecodeSample-CLASS";

  private int normalAgePri = 30;

  private void doNothingPri() {
  }

  private String doNothingReturnPri() {
    return "text";
  }

  private String returnStringPri() {
    return "String";
  }

  private String returnStringWithParamPri(String text) {
    return text;
  }

}
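
The field initializers "= 30" in the class above were compiled to bipush 30, while the first verbose listing used iconst_0 to assign 0 to the age field. The instruction depends on the size of the int constant. The sketch below reproduces the usual choice for int constants; the class and method names are hypothetical, and the real compiler works on opcodes and constant pool indexes rather than on strings.

```java
// Hypothetical helper showing which instruction pushes a given int constant.
class IntPushInstruction {

    /** Returns the textual instruction typically used to push the given int constant. */
    static String forConstant(int value) {
        if (value == -1) {
            return "iconst_m1";               // dedicated instruction for -1
        }
        if (value >= 0 && value <= 5) {
            return "iconst_" + value;         // dedicated instructions for 0 to 5
        }
        if (value >= Byte.MIN_VALUE && value <= Byte.MAX_VALUE) {
            return "bipush " + value;         // fits in one byte: -128 to 127
        }
        if (value >= Short.MIN_VALUE && value <= Short.MAX_VALUE) {
            return "sipush " + value;         // fits in two bytes: -32768 to 32767
        }
        return "ldc " + value;                // larger values are read from the constant pool
    }

    public static void main(String[] args) {
        System.out.println(forConstant(0));      // iconst_0
        System.out.println(forConstant(30));     // bipush 30
        System.out.println(forConstant(300));    // sipush 300
        System.out.println(forConstant(100000)); // ldc 100000
    }
}
```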

This article introduced us to the world of Java bytecode reading. At the beginning we discovered how to print the bytecode stored in a .class file with the javap command. The second part introduced some basic bytecode instructions. We discovered how values are pushed onto the stack (bipush, ldc) and how they're assigned to the fields of an object (putfield). We also saw the difference between methods returning void and methods returning an object (return and areturn instructions).
