Google Guava: String utility classes

String is one of the most commonly used types in Java. However, many everyday operations still require writing our own helper methods. Google Guava's String support makes working with sequences of characters easier.

In this article we'll see which additional features Google Guava provides to manipulate Strings. The first part presents the objects that help to handle Strings better. After that we'll move on to some practical cases.

Manipulating Strings with Google Guava

The three main components of Google Guava's String support are the matcher, the splitter and the joiner. The names are explicit: as you can guess, the matcher works on the characters of a String. It's represented by the com.google.common.base.CharMatcher class. Its public static members create matchers, and its instance methods operate on String characters. We can distinguish:
- detection methods: they tell whether a given String contains the tested characters (for example matchesAnyOf or countIn).
- writing methods: they return a modified copy of the String (for example replaceFrom, removeFrom or retainFrom).
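
The two families can be sketched with a detection call and a writing call side by side. This is only a minimal illustration: the class CharMatcherSketch and the helpers hasDigit and normalizeSpaces are hypothetical names, not part of Guava, and the sketch assumes Guava is on the classpath.

```java
import com.google.common.base.CharMatcher;

class CharMatcherSketch {

  // Detection: hypothetical helper, true if the text contains at least one ASCII digit.
  static boolean hasDigit(String text) {
    return CharMatcher.inRange('0', '9').matchesAnyOf(text);
  }

  // Writing: hypothetical helper, strips leading and trailing spaces and
  // collapses every inner run of spaces into a single space.
  static String normalizeSpaces(String text) {
    return CharMatcher.is(' ').trimAndCollapseFrom(text, ' ');
  }

  public static void main(String[] args) {
    System.out.println(hasDigit("test007"));
    System.out.println(normalizeSpaces("  a   b  c "));
  }
}
```

Since Strings are immutable, the writing methods never change their argument. They always return a new String.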

CharMatcher implements these methods through private classes extending CharMatcher, such as RangesMatcher, NegatedMatcher, And, Or, FastMatcher, NegatedFastMatcher and BitSetMatcher. These classes are then used to build the publicly available matchers. One example is the matcher that checks whether a given character is a digit:

// Must be in ascending order.
private static final String ZEROES = "0\u0660\u06f0\u07c0\u0966\u09e6\u0a66\u0ae6\u0b66\u0be6"
  + "\u0c66\u0ce6\u0d66\u0e50\u0ed0\u0f20\u1040\u1090\u17e0\u1810\u1946\u19d0\u1b50\u1bb0"
  + "\u1c40\u1c50\ua620\ua8d0\ua900\uaa50\uff10";

private static final String NINES;
static {
  StringBuilder builder = new StringBuilder(ZEROES.length());
  for (int i = 0; i < ZEROES.length(); i++) {
    builder.append((char) (ZEROES.charAt(i) + 9));
  }
  NINES = builder.toString();
}
public static final CharMatcher DIGIT = new RangesMatcher("CharMatcher.DIGIT", ZEROES.toCharArray(), NINES.toCharArray());
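
The "Must be in ascending order" comment suggests that the range starts are searched with a binary search. The following is a simplified sketch of such a range lookup written with the JDK only. It is an assumption about the general technique, not a copy of Guava's RangesMatcher, and the names RangeLookupSketch, STARTS and ENDS are hypothetical. Only two of the ranges listed above are used.

```java
import java.util.Arrays;

class RangeLookupSketch {

  // Range starts, sorted in ascending order: ASCII digits and Arabic-Indic digits.
  private static final char[] STARTS = {'0', '\u0660'};
  // Matching range ends: each start + 9, as in the NINES computation.
  private static final char[] ENDS = {'9', '\u0669'};

  static boolean matches(char c) {
    int index = Arrays.binarySearch(STARTS, c);
    if (index >= 0) {
      return true; // c is exactly the first character of a range
    }
    // For a missing key binarySearch returns -(insertionPoint) - 1.
    // The only range that can contain c starts just before the insertion point.
    int candidate = -index - 2;
    return candidate >= 0 && c <= ENDS[candidate];
  }

  public static void main(String[] args) {
    System.out.println(matches('5'));
    System.out.println(matches('a'));
  }
}
```

The lookup only works if STARTS stays sorted, which is why the order of the ZEROES constant matters.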

Splitter is the object used to split a String. It's represented by the Splitter class from the same package as CharMatcher. Compared to the standard String split(String regex) method, Guava's splitter brings many additional features and improvements. For example, it handles trailing separators predictably, can split a String on more than one separator, and can omit empty Strings from the result.
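
The difference in the handling of trailing separators can be sketched as follows. The class SplitComparisonSketch and its two helper methods are hypothetical names used only for this illustration, and the sketch assumes Guava is on the classpath.

```java
import com.google.common.base.Splitter;
import com.google.common.collect.Lists;
import java.util.Arrays;
import java.util.List;

class SplitComparisonSketch {

  // JDK behavior: String.split(regex) silently drops trailing empty Strings.
  static List<String> jdkSplit(String text) {
    return Arrays.asList(text.split(","));
  }

  // Guava behavior: every entry is returned unless we explicitly ask otherwise.
  static List<String> guavaSplit(String text) {
    return Lists.newArrayList(Splitter.on(',').split(text));
  }

  public static void main(String[] args) {
    System.out.println(jdkSplit("A,B,,"));   // 2 entries: A and B
    System.out.println(guavaSplit("A,B,,")); // 4 entries: A, B and two empty Strings
  }
}
```

With Splitter, dropping empty entries becomes an explicit choice made through omitEmptyStrings() rather than a side effect.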

The last component, the joiner, is responsible for concatenating elements supplied as arrays, maps, varargs or Iterable instances. This is done by the Joiner class. The joining methods are flexible: they can, for example, skip null elements or use any separator (such as ";" to build CSV entries).
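
Joining a Map goes through withKeyValueSeparator, which returns a map-aware joiner. Below is a minimal sketch. The class MapJoinerSketch and the method toEntryLine are hypothetical names, and a LinkedHashMap is used only to make the iteration order predictable.

```java
import com.google.common.base.Joiner;
import java.util.LinkedHashMap;
import java.util.Map;

class MapJoinerSketch {

  // Entries are separated by ";" and each key is separated from its value by "=".
  static String toEntryLine(Map<String, Integer> values) {
    return Joiner.on(";").withKeyValueSeparator("=").join(values);
  }

  public static void main(String[] args) {
    Map<String, Integer> scores = new LinkedHashMap<>();
    scores.put("A", 1);
    scores.put("B", 2);
    System.out.println(toEntryLine(scores)); // A=1;B=2
  }
}
```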

Example of String manipulation with Google Guava

We'll illustrate the three main components with three test cases, one per component. All explanatory comments are included in the code:

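import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

import org.junit.Test;

import com.google.common.base.CharMatcher;
import com.google.common.base.Joiner;
import com.google.common.base.Splitter;

public class StringTest {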

  @Test
  public void splitter() {
    /**
     * Splitter lets us split a String and keep only the values we expect, for example the non-empty ones.
     * The following steps show how to achieve it. We start with a classic split.
     */
    String text = "A, ,,,,,,,,,,,,B";
    Iterable<String> simpleSplit = Splitter.on(",").split(text);
    String[] expectedSimple = {"A", " ", "", "", "", "", "", "", "", "", "", "", "", "B"};
    int i = 0;
    for (String entry : simpleSplit) {
      assertTrue("Entry should be '"+expectedSimple[i]+"' but is '"+entry+"'", entry.equals(expectedSimple[i]));
      i++;
    }
    // Now we trim the split entries by calling trimResults() before split(String text).
    // In addition, we need to change the 2nd entry of the expectedSimple array.
    Iterable<String> trimmedSplit = Splitter.on(",").trimResults().split(text);
    expectedSimple[1] = "";
    i = 0;
    for (String entry : trimmedSplit) {
      assertTrue("Entry should be '"+expectedSimple[i]+"' but is '"+entry+"'", entry.equals(expectedSimple[i]));
      i++;
    }
    // Finally, we call omitEmptyStrings() to drop the empty entries so that we don't have to deal with them.
    String[] expectedFinal = {"A", "B"};
    Iterable<String> finalResult = Splitter.on(",").trimResults().omitEmptyStrings().split(text);
    i = 0;
    for (String entry : finalResult) {
      assertTrue("Entry should be '"+expectedFinal[i]+"' but is '"+entry+"'", entry.equals(expectedFinal[i]));
      i++;
    }

    /**
     * There is another interesting feature which divides a String into fixed-length chunks.
     * For example, the code below divides the String into chunks of 2 characters.
     *
     * Note that fixedLength doesn't trim or drop anything by itself: without trimResults() and
     * omitEmptyStrings(), whitespace is kept in the returned entries (as in the entry "M " in our sample).
     */
    Iterable<String> resultFixed = Splitter.fixedLength(2).split("ABCDEFGHIJKLM N");
    String[] expectedFixed = {"AB", "CD", "EF", "GH", "IJ", "KL", "M ", "N"};
    i = 0;
    for (String entry : resultFixed) {
      assertTrue("Entry should be '"+expectedFixed[i]+"' but is '"+entry+"'", entry.equals(expectedFixed[i]));
      i++;
    }
    /**
     * We can also split on any one of several characters. This is achieved with the CharMatcher.anyOf
     * method, which makes every listed character a separator.
     */
    Iterable<String> resultAnyOf = Splitter.on(CharMatcher.anyOf(",;:")).omitEmptyStrings().trimResults().split("A,B,C: first letters in alphabet; ie. ABC");
    String[] expectedAnyOf = {"A", "B", "C", "first letters in alphabet", "ie. ABC"};
    i = 0;
    for (String entry : resultAnyOf) {
      assertTrue("Entry should be '"+expectedAnyOf[i]+"' but is '"+entry+"'", entry.equals(expectedAnyOf[i]));
      i++;
    }
    /**
     * And the last thing: the limit on the number of returned values. Here the result is limited to
     * 3 elements: the first two are split normally and the last one contains the rest of the String.
     */
    Iterable<String> resultLimited = Splitter.fixedLength(2).limit(3).split("ABCDEFGHIJKLM N");
    String[] expectedLimited = {"AB", "CD", "EFGHIJKLM N"};
    i = 0;
    for (String entry : resultLimited) {
      assertTrue("Entry should be '"+expectedLimited[i]+"' but is '"+entry+"'", entry.equals(expectedLimited[i]));
      i++;
    }
  }

  @Test
  public void joiner() {
    /**
     * We'll illustrate Joiner with several supported inputs (arrays, varargs, Iterable and Iterator instances).
     * The first example skips null entries, i.e. they won't be concatenated. All examples are joined
     * with the ";" character.
     *
     * From the second example on, we replace every null entry with the String passed to
     * useForNull(String nullText).
     */
    String[] lettersArray = {"A", null, "B", "C", "D", null, "E", "F"};
    String expected = "A;B;C;D;E;F";
    String lettersArrayResult = Joiner.on(";").skipNulls().join(lettersArray);
    assertTrue("Concatenated result with array Joiner ("+lettersArrayResult+") should be the same as expected ("+expected+")", expected.equals(lettersArrayResult));
    
    String varargsResult = Joiner.on(";").useForNull("!EMPTY!").join("A", null, "B", "C", "D", null, "E", "F");
    expected = "A;!EMPTY!;B;C;D;!EMPTY!;E;F";
    assertTrue("Concatenated result with varargs Joiner ("+varargsResult+") should be the same as expected ("+expected+")", expected.equals(varargsResult));
    
    List<String> lettersIterable = new ArrayList<String>();
    for (String letter : lettersArray) {
      lettersIterable.add(letter);
    }
    Iterable<String> iterable = lettersIterable;
    String iterableResult = Joiner.on(";").useForNull("!EMPTY!").join(iterable);
    assertTrue("Concatenated result with Iterable Joiner ("+iterableResult+") should be the same as expected ("+expected+")", expected.equals(iterableResult));
    
    Iterator<String> iterator = lettersIterable.iterator();
    String iteratorResult = Joiner.on(";").useForNull("!EMPTY!").join(iterator);
    assertTrue("Concatenated result with Iterator Joiner ("+iteratorResult+") should be the same as expected ("+expected+")", expected.equals(iteratorResult));
  }
  
  @Test
  public void matcher() {
    /**
     * We check whether each of the two Strings contains at least one vowel, then we count the vowels.
     */
    CharMatcher anyVowelMatcher = CharMatcher.anyOf("aeiouyAEIOUY");
    assertFalse("'PPPL' doesn't contain any vowel but matcher tells that it does", anyVowelMatcher.matchesAnyOf("PPPL"));
    assertTrue("'AAA' contains vowels but matcher tells that it doesn't", anyVowelMatcher.matchesAnyOf("AAA"));
    int counted = anyVowelMatcher.countIn("AAABBCCCEEEe");
    assertTrue("'AAABBCCCEEEe' should contain 7 characters from anyVowelMatcher but it contains "+counted, counted == 7);
    /**
     * We can also use a predefined matcher, here the one matching digits, to replace characters.
     */
    CharMatcher digitRemoverMatcher = CharMatcher.DIGIT;
    String replacedResult = digitRemoverMatcher.replaceFrom("test007a", "!");
    assertTrue("Expected result is 'test!!!a' but '"+replacedResult+"' was received", replacedResult.equals("test!!!a"));
    /**
     * Example of three matchers combined with the or() method and applied to one String.
     */
    CharMatcher invisibleMatcher = CharMatcher.INVISIBLE;
    String invisibleResult = invisibleMatcher.or(digitRemoverMatcher).or(anyVowelMatcher).removeFrom(" t e s t 0 0 7 ");
    assertTrue("Expected result is 'tst' but '"+invisibleResult+"' was received", invisibleResult.equals("tst"));
    /**
      * Now we want to retain all non vowels.
      */
    CharMatcher noVowelReplacer = CharMatcher.noneOf("aeiouyAEIOUY");
    String noVowelResult = noVowelReplacer.retainFrom("This is a test String");
    String expectedNoVowel = "Ths s  tst Strng";
    assertTrue("Expected result is '"+expectedNoVowel+"' but '"+noVowelResult+"' was received", noVowelResult.equals(expectedNoVowel));
    /**
     * This matcher checks whether a String contains only the characters defined by the matcher, in our case
     * letters as defined by Java. The first test returns false because the matcher relies on
     * java.lang.Character#isLetter(char), which returns false for the whitespace character (" ").
     *
     * The second test takes the same String but without whitespaces. In this case, matchesAllOf
     * should return true for exclusiveLetterMatcher.
     */
    CharMatcher exclusiveLetterMatcher = CharMatcher.JAVA_LETTER;
    boolean isLetter = Character.isLetter(' ');
    boolean onlyLetters = exclusiveLetterMatcher.matchesAllOf("This a test of a French text étrange which means stranger");
    assertTrue("onlyLetters ("+onlyLetters+") should be the same as Character.isLetter(\" \") result ("+isLetter+")", onlyLetters == isLetter);

    onlyLetters = exclusiveLetterMatcher.matchesAllOf("ThisatestofaFrenchtextétrangewhichmeansstranger");
    assertTrue("onlyLetters ("+onlyLetters+") should be true thanks to whitespace removing", onlyLetters);
  }
}

This article showed how Google Guava simplifies String handling. We saw that CharMatcher objects can be used for matching instead of, for example, regular expressions. We also learned that Splitter provides additional features such as trimming results, omitting empty Strings and limiting the number of entries. Finally, Joiner simplifies String concatenation, notably by skipping or replacing null values.

