String memory management in Java

String is one of the most used objects in Java. Because it is so common, the JVM manages it a little differently from other objects.

This article focuses on how Strings are managed in memory. The first part covers some theoretical concepts. The second part uses tests to observe String behavior in different situations.

String literal pool and Java memory

String objects in Java can be constructed in two ways:
- as literals, for example: String myString = "my string";
- with an explicit constructor, for example: String myString = new String("my string");

The literal form is handled in a special way in memory. When we construct a String like this, the JVM places the literal ("my string") into the String literal pool (also known as the String constant pool). When another literal String with the same value appears, the JVM takes the reference to the existing String from the pool and uses it instead of creating a new object. Sharing one instance this way is thread safe because String instances are immutable: a String's content is defined at construction time and can't be modified afterwards.
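To make the immutability point concrete, here is a minimal sketch. The class and method names are illustrative only and don't come from the tests in this article. It shows that a "modifying" method such as replace() leaves the original String untouched and hands back a new object:

```java
public class ImmutableStringSketch {

	// Returns true when calling replace() leaves the original String unchanged
	// and produces a separate object holding the modified content.
	static boolean originalUnchanged() {
		String original = "my string";
		String replaced = original.replace('s', 'S'); // builds a new String
		return original.equals("my string")
				&& replaced.equals("my String")
				&& original != replaced;
	}

	public static void main(String[] args) {
		System.out.println(originalUnchanged()); // prints "true"
	}
}
```

Since no method can change the content of a pooled String, any number of threads and classes can safely share the same reference.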

The String literal pool originally lived in the permanent generation (permgen). Because of permgen's small size, Java's engineers decided to move the pool out of it. Since Java 7 the pool lives in the main heap, together with ordinary objects in the young and old generations. This article focuses on String management in that version. (Note also that the permanent generation was removed in Java 8.)

When a String is placed in the pool, we say that it's interned. It means that only one String object is stored for each distinct value, which saves the space that duplicate String objects would otherwise take. The second way of constructing a String (with the constructor) always creates a new String object, i.e. that object isn't interned. We can still obtain the interned version by calling the String's intern() method.
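The space saving can be sketched as follows. This is an illustration under assumptions, not a benchmark: the class InternSketch and its helper methods are hypothetical names, and the sketch only counts distinct object identities instead of measuring memory.

```java
import java.util.IdentityHashMap;
import java.util.Map;

public class InternSketch {

	// Builds three Strings with the same value, optionally interning each one.
	static String[] build(boolean intern) {
		String[] values = new String[3];
		for (int i = 0; i < values.length; i++) {
			String created = new String("my string"); // always a new object
			values[i] = intern ? created.intern() : created;
		}
		return values;
	}

	// Counts distinct objects by identity (==), not by value (equals()).
	static int distinctObjects(String[] values) {
		Map<String, Boolean> seen = new IdentityHashMap<>();
		for (String value : values) {
			seen.put(value, Boolean.TRUE);
		}
		return seen.size();
	}

	public static void main(String[] args) {
		System.out.println(distinctObjects(build(false))); // prints "3"
		System.out.println(distinctObjects(build(true)));  // prints "1"
	}
}
```

Without intern() the array holds three separate objects with equal content. With intern() all three slots point to the single pooled object.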

This raises a new question: can interned Strings be garbage collected? Yes, because the same GC rule applies to interned Strings as to any other object. When a String isn't reachable from the application anymore, it becomes eligible for GC. The same goes for Strings that never enter the pool. For example, a String built inside a for loop by concatenating a literal with the loop counter is computed at run time, so a new object is created on every pass and becomes eligible for garbage collection at the end of each iteration:

for (int i = 0; i < 10; i++) {
	String output = "Number "+i;
	System.out.println(output);
}
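Note that a String built with + from a non-constant value, like the loop counter above, is computed at run time and isn't taken from the pool. The sketch below checks this; LoopConcatSketch and concat() are illustrative names, not part of the article's tests:

```java
public class LoopConcatSketch {

	// Builds the same value as the loop body; i is not a compile-time constant.
	static String concat(int i) {
		return "Number " + i;
	}

	public static void main(String[] args) {
		String computed = concat(0);
		String literal = "Number 0";
		System.out.println(computed.equals(literal));     // prints "true": same characters
		System.out.println(computed == literal);          // prints "false": a new object, not the pooled one
		System.out.println(computed.intern() == literal); // prints "true": intern() returns the pooled object
	}
}
```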

However, when a literal String is used as a constant field of a class, it stays reachable for as long as that class is loaded, so it becomes eligible for GC only together with the class holding it. This can happen, for example, when the ClassLoader that loaded the class becomes garbage collectable itself.

Tests on Java Strings

After this introduction, let's illustrate the differences mentioned above with a simple JUnit test case:

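import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;

import org.junit.Test;

public class StringTest {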
	@Test
	public void test() {
		String myText = "my text";
		String anotherText = "my text";
		/**
		 * Two literal Strings share the same object.
		 */
		assertTrue("myText object should be equal to anotherText object", myText == anotherText);
		/**
		 * A String created with the new operator is a different object
		 * than the two literal Strings created previously.
		 */
		String objectText = new String("my text");
		assertFalse("objectText object should not be the same object as myText", objectText == myText);

		/**
		 * Even though they're two different objects (1 for the literal, 1 for
		 * the new String), their values are the same.
		 */
		assertTrue("The value of anotherText and myText should be the same", anotherText.equals(myText));
		assertTrue("The value of objectText and myText should be the same", objectText.equals(myText));

		/**
		 * String interned with intern() method becomes the same object as
		 * literal ones (it's placed into String literal pool if absent or taken
		 * directly from this pool if present).
		 */
		String newObjectText = objectText.intern();
		assertTrue("myText object should be equal to newObjectText object", newObjectText == anotherText);
		assertTrue("The value of newObjectText and myText should be the same", newObjectText.equals(myText));

		/**
		 * Even if we do it the other way round (first the intern() call, then
		 * the initialization of the literal String), both references point to
		 * the same object.
		 */
		String toIntern = new String("my another text");
		String internedString = toIntern.intern();
		String internedStringLiteral = "my another text";
		assertFalse("internedString shouldn't be the same object as toIntern", toIntern == internedString);
		assertTrue("internedString object should be equal to internedStringLiteral object", internedString == internedStringLiteral);
		assertTrue("The value of internedString and internedStringLiteral should be the same", internedStringLiteral.equals(internedString));
	}
}

Further confirmation of these features comes from section 3.10.5 of the Java Language Specification, where we can read that:


- Literal strings within the same class (§8) in the same package (§7) represent references to the same String object (§4.3.1).
- Literal strings within different classes in the same package represent references to the same String object.
- Literal strings within different classes in different packages likewise represent references to the same String object.
- Strings computed by constant expressions (§15.28) are computed at compile time and then treated as if they were literals.
- Strings computed by concatenation at run time are newly created and therefore distinct.
- The result of explicitly interning a computed string is the same string as any pre-existing literal string with the same contents.
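The difference between a constant expression and a run-time concatenation can be subtle. In the sketch below, which uses illustrative names that aren't part of the article's test classes, the only difference between the two methods is the final modifier on the local variable. A final variable initialized with a literal counts as a constant, so the compiler folds the concatenation into a literal. Without final, the concatenation happens at run time.

```java
public class ConstantExpressionSketch {

	// prefix is a constant variable, so prefix + "de" is a constant expression
	// that is computed at compile time and treated as the literal "abcde".
	static boolean finalOperandSharesLiteral() {
		final String prefix = "abc";
		String joined = prefix + "de";
		return joined == "abcde";
	}

	// prefix is an ordinary local variable, so prefix + "de" is concatenated
	// at run time and yields a newly created String.
	static boolean nonFinalOperandSharesLiteral() {
		String prefix = "abc";
		String joined = prefix + "de";
		return joined == "abcde";
	}

	public static void main(String[] args) {
		System.out.println(finalOperandSharesLiteral());    // prints "true"
		System.out.println(nonFinalOperandSharesLiteral()); // prints "false"
	}
}
```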

We can also turn these rules into a JUnit test case. Before defining it, we'll write one class in the same package as the tested class and one class in a different package:

// package com.waitingforcode.test.string.other
public class OtherLetter {

	private String letters = "abc";

	public String getLetters() {
		return this.letters;
	}

}

// package com.waitingforcode.test.string (package of tested class)
public class OtherLetterSamePackage {

	private String letters = "abc";

	public String getLetters() {
		return this.letters;
	}

}

Now we can write the test case:

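// package com.waitingforcode.test.string
import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;

import org.junit.Test;

import com.waitingforcode.test.string.other.OtherLetter;

public class StringTestJavaSpec {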

	@Test
	public void theSameClassTheSamePackage() {
		TestedClass tested = new TestedClass();
		assertTrue("Literals from the same class, from the same package, should be the same objects but they aren't", tested.getLetters1() == tested.getLetters2());
	}

	@Test
	public void differentClassesTheSamePackage() {
		TestedClass tested1 = new TestedClass();
		OtherLetterSamePackage tested2 = new OtherLetterSamePackage();
		assertTrue("Literals from different class but from the same package should be the same objects but they aren't", tested1.getLetters1() == tested2.getLetters());
	}

	@Test
	public void differentClassesDifferentPackages() {
		OtherLetterSamePackage tested1 = new OtherLetterSamePackage();
		OtherLetter tested2 = new OtherLetter();
		assertTrue("Literals from different class and different packages should be the same objects but they aren't", tested1.getLetters() == tested2.getLetters());
	}

	@Test
	public void computedConstantCompile() {
		String first = "abcde";
		String second = "abc" + "de";
		assertTrue(
				"Strings computed at compile time should be treated as literals, so they should be the same objects", first == second);
	}

	@Test
	public void computedConcatenationRuntime() {
		TestedClass tested1 = new TestedClass();
		OtherLetterSamePackage tested2 = new OtherLetterSamePackage();
		String first = tested1.getLetters1() + "d";
		String second = tested2.getLetters() + "d";
		// The message and the condition must be separate arguments: joining them
		// with + would compare (message + first) == second instead.
		assertFalse("Strings computed by concatenation at run time shouldn't be the same objects", first == second);
	}

	@Test
	public void interningString() {
		String preExistent = "abcd";
		TestedClass tested1 = new TestedClass();
		String interned = (tested1.getLetters1() + "d").intern();
		assertTrue("A computed and then interned String should be the same object as the pre-existing literal", preExistent == interned);
	}

}

class TestedClass {

	private String letters1 = "abc";
	private String letters2 = "abc";

	public String getLetters1() {
		return this.letters1;
	}

	public String getLetters2() {
		return this.letters2;
	}

}

This article gave an overview of String management in Java. That management changed between Java 6 and Java 7: in Java 6 literal Strings were placed in the permanent generation, while since Java 7 they live elsewhere in the heap. Non-literal Strings, constructed with the new operator, are treated as normal objects and can be garbage collected once they are no longer reachable. Literal Strings can be garbage collected when the class holding them becomes eligible for GC, for example when its ClassLoader becomes garbage collectable.

