Before Java 8, Google Guava was the usual way to process collections conveniently. Java 8 brought a serious alternative to this library: streams.
As you can imagine, this article describes this new way of working with collections. First, we'll cover the basic concepts behind streams. Next, we'll look at the main features of Streams. Finally, we'll show how they can be used to work with collections.
What are Streams?
Streams can be thought of as wrappers around collections whose main goal is to process data in a functional way. They can also be seen as a counterpart of SQL in the Java world. Like the SQL "SELECT...FROM...WHERE" construct, Streams let us find specific elements in collections. Like the "GROUP BY" and "LIMIT" clauses, they also help to aggregate and restrict the data. These operations are executed through a stream pipeline, which consists of 3 families of operations:
- source operation: initializes the Stream from a collection, an I/O operation, an array or a generator function.
- intermediate operations: they are optional in a stream pipeline. When they are defined, their purpose is to transform the data: filtering, mapping, limiting, ordering. Each of them returns a new Stream object.
- terminal operation: it is the last operation made on a stream. It can return the data changed by the intermediate operations or process the remaining items one last time. This family includes matching and aggregation operations (any match, min, max, sum...) and loop processing (for-each).
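The three families above can be sketched in a single pipeline. This is only a minimal illustration: the class name PipelineSketch, the method longNamesUpperCased and the sample names are assumptions made for this example, not part of the test class shown later.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class, used only to illustrate the pipeline stages.
class PipelineSketch {

    // Returns at most `max` names longer than 4 characters, upper-cased.
    static List<String> longNamesUpperCased(List<String> names, int max) {
        return names.stream()                       // source: a collection
                .filter(name -> name.length() > 4)  // intermediate: filtering
                .map(String::toUpperCase)           // intermediate: mapping
                .limit(max)                         // intermediate: limiting
                .collect(Collectors.toList());      // terminal: builds the result
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Keane", "Giggs", "Blanc", "Cole", "Platini");
        System.out.println(longNamesUpperCased(names, 2)); // prints [KEANE, GIGGS]
    }
}
```

Nothing is computed until collect() is called; the intermediate calls only describe the work to do.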
To sum up this execution flow, a stream pipeline consists of defining input data, processing it and generating output. The main features of Streams are:
- no storage: a Stream is not a data structure. It doesn't store objects itself but only passes along references to the objects provided by its source.
- laziness: in some cases, streams won't iterate through all the elements of the source to generate the result. For example, when we want only the first 5 elements, the stream can stop after dealing with the first 5 items.
- consumable: once a terminal operation has consumed a stream, it can't be reused. A new stream must be created from the source.
- parallelism: streams can be processed sequentially or in parallel.
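Laziness and single use can be observed with a small experiment. The sketch below is an illustration only: the class StreamTraitsSketch and both of its methods are hypothetical names. The exact set of visited elements is an observation of how a sequential stream over a list usually behaves, not a documented guarantee.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper class illustrating laziness and single use.
class StreamTraitsSketch {

    // Laziness: records which elements the pipeline actually visited.
    static List<Integer> visitedWhenLimiting(List<Integer> numbers, int max) {
        List<Integer> visited = new ArrayList<>();
        numbers.stream()
                .peek(visited::add)            // runs only for elements the pipeline pulls
                .limit(max)                    // short-circuits once `max` elements passed
                .collect(Collectors.toList());
        return visited;
    }

    // Consumable: a second terminal operation on the same stream is rejected.
    static boolean secondUseFails(List<Integer> numbers) {
        Stream<Integer> stream = numbers.stream();
        stream.forEach(number -> { });         // first terminal operation consumes the stream
        try {
            stream.forEach(number -> { });     // second use of the same stream
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(visitedWhenLimiting(Arrays.asList(1, 2, 3, 4, 5), 2));
        System.out.println(secondUseFails(Arrays.asList(1, 2, 3)));
    }
}
```

With a sequential stream over a five-element list and a limit of 2, only the first two elements should reach peek(), and the second forEach() call should end with an IllegalStateException.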
Streams features
In code, streams are implementations of the generic interface java.util.stream.Stream<T>. One way to create a stream is the static factory method Stream.of(T... values). Another one is to call the stream() or parallelStream() method of the Collection interface.
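The creation methods mentioned above can be compared side by side. This is a minimal sketch; the class StreamCreationSketch and its method names are made up for the illustration, and Arrays.stream() is added as the array counterpart of the same idea.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper class showing different ways to obtain a stream.
class StreamCreationSketch {

    // Static factory method taking the elements directly.
    static List<String> fromFactory() {
        return Stream.of("a", "b", "c").collect(Collectors.toList());
    }

    // Sequential stream backed by a collection.
    static List<String> fromCollection(List<String> letters) {
        return letters.stream().collect(Collectors.toList());
    }

    // Parallel stream backed by the same kind of collection.
    static long countNonEmptyInParallel(List<String> letters) {
        return letters.parallelStream().filter(letter -> !letter.isEmpty()).count();
    }

    // Stream backed by an array.
    static List<String> fromArray(String[] letters) {
        return Arrays.stream(letters).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(fromFactory());                                      // prints [a, b, c]
        System.out.println(countNonEmptyInParallel(Arrays.asList("a", "", "c"))); // prints 2
    }
}
```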
Streams offer several concepts already known from Google Guava:
- predicate matching: we can check whether all (allMatch), any (anyMatch) or none (noneMatch) of the elements in the stream match a given predicate.
- limiting and skipping: we can limit the number of returned elements (limit) as well as skip some elements of the stream (skip).
- collecting: it is one type of terminal operation. It gathers the stream elements and assembles them into a single container. Collectors exist for commonly used collections, such as lists (Collectors.toList()) or maps (Collectors.groupingBy).
- ordering: thanks to the sorted method, we can easily sort the items of a Stream.
Thanks to primitive specializations, streams can also be used with primitive types without boxing. Among others, we can find IntStream for int values, LongStream for long values and DoubleStream for double values.
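The primitive specializations add numeric operations such as sum() and average(). The sketch below is an illustration under assumed names (PrimitiveStreamsSketch, sumOfRange and averageLength are hypothetical).

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.IntStream;

// Hypothetical helper class illustrating primitive streams.
class PrimitiveStreamsSketch {

    // Sums the int values of a closed range without boxing them to Integer.
    static int sumOfRange(int from, int toInclusive) {
        return IntStream.rangeClosed(from, toInclusive).sum();
    }

    // Switches from Stream<String> to IntStream to compute an average length.
    static double averageLength(List<String> words) {
        return words.stream()
                .mapToInt(String::length)   // Stream<String> -> IntStream
                .average()                  // OptionalDouble, empty for an empty stream
                .orElse(0.0);
    }

    public static void main(String[] args) {
        System.out.println(sumOfRange(1, 5));                            // prints 15
        System.out.println(averageLength(Arrays.asList("Keane", "Cole"))); // prints 4.5
    }
}
```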
Streams can be closed manually by calling the close() method declared in BaseStream, the superinterface of Stream. BaseStream also extends the java.lang.AutoCloseable interface, so a stream declared in a try-with-resources statement is closed automatically.
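Automatic closing can be verified with a close handler registered through onClose(). This is a sketch with assumed names (ClosingStreamSketch and closeHandlerRuns are hypothetical). In practice closing matters mostly for streams backed by I/O resources, such as the one returned by Files.lines(), while streams over in-memory collections generally don't need it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.stream.Stream;

// Hypothetical helper class illustrating try-with-resources on a stream.
class ClosingStreamSketch {

    // Returns true when the close handler ran after the try block ended.
    static boolean closeHandlerRuns(List<String> names) {
        AtomicBoolean closed = new AtomicBoolean(false);
        try (Stream<String> stream = names.stream().onClose(() -> closed.set(true))) {
            stream.forEach(name -> { });   // any terminal operation
        }                                  // close() is called here automatically
        return closed.get();
    }

    public static void main(String[] args) {
        System.out.println(closeHandlerRuns(Arrays.asList("Keane", "Giggs"))); // prints true
    }
}
```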
Streams examples
The test cases below show several features of streams, including filtering, predicate matching and aggregation:
// The imports below are assumptions inferred from the calls used in the tests:
// JUnit 4 (@Before, @Test), AssertJ (assertThat, extracting) and Guava (ComparisonChain).
// Player is assumed to be a simple bean with firstName, lastName and team properties.
import static org.assertj.core.api.Assertions.assertThat;

import com.google.common.collect.ComparisonChain;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Consumer;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import org.junit.Before;
import org.junit.Test;

public class StreamsTest {

    private static final String MAN_U = "Manchester United";
    private static final String JUVE = "Juventus";

    private List<Player> players = new ArrayList<>();
    private List<Player> manURemaining = new ArrayList<>();

    @Before
    public void initData() {
        // Manchester United players
        players.add(new Player("Roy", "Keane", MAN_U));
        players.add(new Player("Ryan", "Giggs", MAN_U));
        players.add(new Player("Laurent", "Blanc", MAN_U));
        // Manchester United remaining players
        manURemaining.add(new Player("Peter", "Schmeichel", MAN_U));
        manURemaining.add(new Player("Teddy", "Sheringham", MAN_U));
        manURemaining.add(new Player("Dwight", "Yorke", MAN_U));
        // Juventus FC players
        players.add(new Player("Michel", "Platini", JUVE));
        players.add(new Player("Alessandro", "Del Piero", JUVE));
        players.add(new Player("Angelo", "Peruzzi", JUVE));
    }

    @Test
    public void find_juve_players() {
        List<Player> juvePlayers = players.stream()
            .filter(player -> player.getTeam().equals(JUVE))
            .collect(Collectors.toList());

        assertThat(juvePlayers).extracting("team").containsOnly(JUVE);
    }

    @Test
    public void check_if_only_juve_players() {
        boolean onlyJuve = players.stream()
            .allMatch(new Predicate<Player>() {
                @Override
                public boolean test(Player player) {
                    return JUVE.equals(player.getTeam());
                }
            });

        assertThat(onlyJuve).isFalse();
    }

    @Test
    public void check_if_only_man_u_or_juve_players() {
        boolean juveOrManU = players.stream()
            .anyMatch(new Predicate<Player>() {
                @Override
                public boolean test(Player player) {
                    return JUVE.equals(player.getTeam()) || MAN_U.equals(player.getTeam());
                }
            });

        assertThat(juveOrManU).isTrue();
    }

    @Test
    public void check_if_no_milan_players() {
        boolean noMilanPlayers = players.stream()
            .noneMatch(new Predicate<Player>() {
                @Override
                public boolean test(Player player) {
                    return "AC Milan".equals(player.getTeam()) || "Inter Milan".equals(player.getTeam());
                }
            });

        assertThat(noMilanPlayers).isTrue();
    }

    @Test
    public void convert_to_only_man_u_players() {
        // every Juventus player is replaced by the next remaining Manchester United player
        Iterator<Player> manuRemainingIterator = manURemaining.iterator();
        List<Player> manUPlayers = players.stream()
            .map(player -> player.getTeam().equals(JUVE) ? manuRemainingIterator.next() : player)
            .collect(Collectors.toList());

        assertThat(manUPlayers).extracting("team").containsOnly(MAN_U);
        assertThat(manuRemainingIterator.hasNext()).isFalse();
    }

    @Test
    public void convert_to_map_with_players_grouped_by_team() {
        Map<String, List<Player>> playerByTeam = players.stream()
            .collect(Collectors.groupingBy(player -> player.getTeam()));

        assertThat(playerByTeam).hasSize(2);
        assertThat(playerByTeam).containsKeys(JUVE, MAN_U);
        assertThat(playerByTeam.get(JUVE)).hasSize(3);
        assertThat(playerByTeam.get(MAN_U)).hasSize(3);
    }

    @Test
    public void convert_to_ordered_list() {
        List<Player> orderedPlayers = players.stream()
            .sorted(new PlayerComparator())
            .collect(Collectors.toList());

        assertThat(orderedPlayers.get(0).getLastName()).isEqualTo("Blanc");
        assertThat(orderedPlayers.get(1).getLastName()).isEqualTo("Del Piero");
        assertThat(orderedPlayers.get(2).getLastName()).isEqualTo("Giggs");
        assertThat(orderedPlayers.get(3).getLastName()).isEqualTo("Keane");
        assertThat(orderedPlayers.get(4).getLastName()).isEqualTo("Peruzzi");
        assertThat(orderedPlayers.get(5).getLastName()).isEqualTo("Platini");
    }

    @Test
    public void pagination_with_limit_and_skip_functions() {
        // Beware of order of skip() and limit() functions - see next test
        List<Player> orderedPlayers = players.stream()
            .sorted(new PlayerComparator())
            .skip(3)
            .limit(3)
            .collect(Collectors.toList());

        assertThat(orderedPlayers).hasSize(3);
        assertThat(orderedPlayers.get(0).getLastName()).isEqualTo("Keane");
        assertThat(orderedPlayers.get(1).getLastName()).isEqualTo("Peruzzi");
        assertThat(orderedPlayers.get(2).getLastName()).isEqualTo("Platini");
    }

    @Test
    public void failing_pagination_with_inversed_limit_and_skip_calls() {
        // first, we limit the players list to a 3-element sublist, then we skip
        // these 3 elements - at the end we receive an empty list
        List<Player> orderedPlayers = players.stream()
            .sorted(new PlayerComparator())
            .limit(3)
            .skip(3)
            .collect(Collectors.toList());

        assertThat(orderedPlayers).isEmpty();
    }

    @Test
    public void construct_team_with_remaining_players() {
        List<Player> allPlayers = Stream.concat(players.stream(), manURemaining.stream())
            .collect(Collectors.toList());

        assertThat(allPlayers).hasSize(9)
            .extracting("lastName").contains("Blanc", "Del Piero", "Giggs", "Keane", "Peruzzi",
                "Platini", "Schmeichel", "Sheringham", "Yorke");
    }

    @Test
    public void init_stream_with_builder() {
        List<Player> builtPlayers = Stream.<Player>builder()
            .add(new Player("Ole Gunnar", "Solskjaer", MAN_U))
            .add(new Player("Andy", "Cole", MAN_U))
            .build()
            .collect(Collectors.toList());

        assertThat(builtPlayers).hasSize(2)
            .extracting("lastName").containsOnly("Solskjaer", "Cole");
    }

    @Test
    public void get_distinct_players_by_teams() {
        // distinct() is based on equals() method invocation
        players.add(players.get(0));
        players.add(players.get(1));
        List<Player> distinctPlayers = players.stream()
            .distinct()
            .collect(Collectors.toList());

        assertThat(distinctPlayers).hasSize(6);
    }

    @Test
    public void transfer_all_players_to_man_u() {
        players.stream()
            .forEach(new Consumer<Player>() {
                @Override
                public void accept(Player player) {
                    player.setTeam(MAN_U);
                }
            });

        assertThat(players).hasSize(6)
            .extracting("team").containsOnly(MAN_U);
    }

    @Test
    public void reduce_to_get_last_player() {
        // the accumulator always keeps the most recent element, so the last one wins
        Player lastPlayer = players.stream()
            .reduce(new BinaryOperator<Player>() {
                @Override
                public Player apply(Player previousPlayer, Player nextPlayer) {
                    return nextPlayer;
                }
            }).get();

        assertThat(lastPlayer.getLastName()).isEqualTo("Peruzzi");
    }

    @Test
    public void reduce_to_compose_multi_name_player() {
        Player multiPlayer = players.stream()
            .reduce(new BinaryOperator<Player>() {
                @Override
                public Player apply(Player previousPlayer, Player nextPlayer) {
                    return new Player("", previousPlayer.getLastName() + " " + nextPlayer.getLastName(), "");
                }
            }).get();

        assertThat(multiPlayer.getLastName()).isEqualTo("Keane Giggs Blanc Platini Del Piero Peruzzi");
    }

    private static class PlayerComparator implements Comparator<Player> {
        @Override
        public int compare(Player player1, Player player2) {
            return ComparisonChain.start()
                .compare(player1.getLastName(), player2.getLastName())
                .compare(player1.getFirstName(), player2.getFirstName())
                .compare(player1.getTeam(), player2.getTeam())
                .result();
        }
    }
}
This article introduced an alternative to the usual way of dealing with collection data. Thanks to streams we can not only reduce the amount of written code but also improve testability and reusability. We saw that a stream pipeline consists of defining some input data and applying a terminal operation at the end. In between, we can also apply intermediate operations to, for example, remove unwanted items or change their properties.