Prefer collections over arrays

on waitingforcode.com


Arrays and collections look similar: both hold a group of objects of the same type. However, they differ in several ways, ranging from maintainability to performance.

In this article we'll focus on the differences between arrays and collections in Java. In the first part we'll compare the cost of constructing and filling both structures. The second part is devoted to arguments that are harder to measure, such as maintainability and type safety.

Performance comparison of arrays and collections

To compare performance, we'll use an array and an ArrayList holding the first 400000 integers. The two test methods are placed in separate classes and each is launched 10 times. Here are the test methods:

public void arraysTest() {
  long start = System.currentTimeMillis();
  int[] ints = new int[400000];
  for (int i = 0; i < ints.length; i++) {
    ints[i] = i;
  }
  long end = System.currentTimeMillis();
  System.out.println("Executed in "+(end - start)+" ms");
  Helper.printMemoryStats();
}

// other class
public void collectionsTest() {
  long start = System.currentTimeMillis();
  List<Integer> ints = new ArrayList<Integer>(400000);
  for (int i = 0; i < 400000; i++) {
    ints.add(i);
  }
  long end = System.currentTimeMillis();
  System.out.println("Executed in "+(end - start)+" ms");
  Helper.printMemoryStats();
}

The Helper class used to print the memory footprint is:

public abstract class Helper {

    public static void printMemoryStats() {
        long heapSize = Runtime.getRuntime().totalMemory();
        long heapFreeSize = Runtime.getRuntime().freeMemory();

        System.out.println("Used memory "+(heapSize - heapFreeSize));
    }

}

The benchmark results (execution time in ms, per try) are:

Method       1   2   3   4   5   6   7   8   9   10
arrays       3   3   2   3   2   4   6   5   3   2
collections  20  16  12  12  12  12  11  13  11  16

As you can see, the test with arrays takes less time than the test with collections. In try 10 the array version is 8 times faster (2 ms against 16 ms). At the same time, the memory used in the array tests is roughly 40% lower than in the collection tests (8313280 against 14351168 bytes). Part of this gap comes from the fact that an ArrayList<Integer> stores boxed Integer objects while an int[] stores primitives. From these numbers alone we could deduce that, from a performance point of view, arrays are the better choice. However, in programming, performance is rarely the only requirement. In addition, arrays could turn out to be a worse solution than collections depending on the context (this is a simple test case, meant only to give an idea of which option could be faster).

Maintainability of arrays and collections

In terms of code maintainability, collections are the better choice. First, you can choose among several collection implementations and freely swap them, even months after the application was released. For example, when you notice that you need to store only unique objects, you can switch from ArrayList to HashSet without much trouble. Doing the same thing with plain arrays would be more difficult. It requires some awkward changes in the code, as in the following example:

private boolean isAlreadyInTheArray(String nameToAdd, String[] names) {
  for (String name : names) {
    if (name.equals(nameToAdd)) {
      return true;
    }
  }
  return false;
}
// and worse things below, such as computation of length of 
// the array containing unique values
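
For comparison, here is a minimal sketch of the collection side of the same change. The class UniqueNames and the helper fill() are hypothetical names used only for illustration. The point is that the calling code stays the same and only the chosen implementation changes:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;

public class UniqueNames {

    // Hypothetical helper: copies the names into the given collection and returns it.
    public static Collection<String> fill(Collection<String> target, List<String> names) {
        for (String name : names) {
            // A Set ignores a value it already holds, a List keeps every value
            target.add(name);
        }
        return target;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("A", "B", "C", "A", "B");
        // Same code path, two implementations: only the constructor call differs
        System.out.println(fill(new ArrayList<String>(), names).size()); // 5
        System.out.println(fill(new HashSet<String>(), names).size());   // 3
    }
}
```

No duplicate check such as isAlreadyInTheArray() has to be written here, because the Set implementation handles it.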

The second pitfall of arrays is that we need to know the number of elements to store before initializing them. Collections, even though they allow the construction of fixed-size instances, don't require knowing the number of items in advance. Coming back to the previous code snippet, before initializing an array storing only unique names we need to compute its size first, for example as below:

@Test
public void testConstructingUniqueArray() {
  String[] data = {"A", "B", "C", "A", "B", "A", "A"};
  String[] uniqueData = new String[computeUniqueLength(data)];

  assertThat(uniqueData).hasSize(3);
}

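// SEPARATOR is not defined in the original snippet; ";" is an assumed value
private static final String SEPARATOR = ";";

private int computeUniqueLength(String[] source) {
  // Start with a separator so each value is delimited on both sides;
  // otherwise "B;" would be found inside "AB;" and "B" would be skipped
  String chained = SEPARATOR;
  int length = 0;
  for (String letter : source) {
    String toInsert = letter + SEPARATOR;
    if (!chained.contains(SEPARATOR + toInsert)) {
      chained += toInsert;
      length++;
    }
  }
  return length;
}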
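
As a rough contrast, the sketch below builds the same unique values with a collection, without computing any length up front. The class UniqueArray and the method unique() are hypothetical names. LinkedHashSet is used here as one possible choice because it keeps insertion order:

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public class UniqueArray {

    // Hypothetical helper: returns the distinct values of source, in first-seen order.
    public static String[] unique(String[] source) {
        // The set grows as needed, so no size has to be computed beforehand
        Set<String> seen = new LinkedHashSet<String>(Arrays.asList(source));
        // Convert back to an array only if the caller really needs one
        return seen.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] data = {"A", "B", "C", "A", "B", "A", "A"};
        System.out.println(Arrays.toString(unique(data))); // [A, B, C]
    }
}
```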

In addition, with arrays we must do more operations manually. For example, sorting a collection consists of defining a comparator and calling the Collections.sort() method. As with the initialization of an array containing only unique objects, we must write more code for arrays. And hand-written array code can be less portable than language standards such as comparators.
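
The comparator approach can be sketched as follows. SortByLength, BY_LENGTH and sortedByLength() are hypothetical names, and ordering by string length is just an assumed example criterion:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class SortByLength {

    // Assumed example criterion: shorter strings come first
    public static final Comparator<String> BY_LENGTH = new Comparator<String>() {
        @Override
        public int compare(String left, String right) {
            return Integer.compare(left.length(), right.length());
        }
    };

    // Hypothetical helper: returns a sorted copy and leaves the input untouched.
    public static List<String> sortedByLength(List<String> words) {
        List<String> copy = new ArrayList<String>(words);
        Collections.sort(copy, BY_LENGTH);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(sortedByLength(Arrays.asList("ccc", "a", "bb"))); // [a, bb, ccc]
    }
}
```

Since the ordering lives in a standalone Comparator, it can be reused wherever the standard library accepts one, including Arrays.sort() for an array.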

Another pitfall of arrays is that they are error-prone. Not only do we need to know the size of the array, but we must also operate explicitly on indexes (myArray[1] = "My test string") instead of calling the appropriate methods without worrying about them (for example List's add("My test string")). Because explicit indexes must be handled (for example auto-incremented somewhere), there is a risk of bugs. The same risk doesn't exist with collections, where this aspect is managed automatically by the given implementation.
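
A small sketch of that risk, under the assumption of a typical off-by-one mistake. IndexHandling and failsAt() are hypothetical names used only to make the failure observable:

```java
import java.util.ArrayList;
import java.util.List;

public class IndexHandling {

    // Hypothetical helper: returns true when writing at the given index fails.
    public static boolean failsAt(String[] target, int index) {
        try {
            target[index] = "My test string";
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String[] array = new String[2];
        System.out.println(failsAt(array, 1)); // false: last valid index
        System.out.println(failsAt(array, 2)); // true: off-by-one, index equals length

        // With a List there is no index to maintain when appending
        List<String> list = new ArrayList<String>();
        for (int i = 0; i < 3; i++) {
            list.add("My test string");
        }
        System.out.println(list.size()); // 3
    }
}
```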

The last argument against arrays is type safety. The code below, even if very contrived, produces a runtime exception:

@Test
public void testTypeSafety() {
  String[] names = new String[2];
  Object[] objectNames = names;

  boolean wasAse = false;
  try {
      objectNames[0] = 33;
  } catch (ArrayStoreException ase) {
      wasAse = true;
  }
  assertThat(wasAse).isTrue();

  /*
    * Note that the same code expressed with collections and generics won't compile:
    * <pre>
    * Collection<String> names = new ArrayList<String>();
    * Collection<Object> objectNames = names;
    * </pre>
    */
}

In this article we discovered the differences between using arrays and collections. In the first part we observed that arrays occupy less memory and are constructed faster than collections. However, the second part showed that arrays are less maintainable than collections. It's difficult to advocate one or the other for storing a group of objects. The right solution depends on the main requirement.
