Java Stream API

Jakob Jenkov
Last update: 2019-04-28

The Java Stream API provides a functional approach to processing collections of objects. The Java Stream API was added in Java 8 along with several other functional programming features. This Java Stream tutorial will explain how these functional streams work, and how you use them.

The Java Stream API is not related to the Java InputStream and Java OutputStream of Java IO. The InputStream and OutputStream are related to streams of bytes. The Java Stream API is for processing streams of objects - not bytes.

Java Stream API Tutorial - Video Version

I have a video version of this tutorial here:

Java Stream Definition

A Java Stream is a component that is capable of internal iteration of its elements, meaning it can iterate its elements itself. In contrast, when you are using the Java Collections iteration features (e.g a Java Iterator or the Java for-each loop used with a Java Iterable) you have to implement the iteration of the elements yourself.

Stream Processing

You can attach listeners to a Stream. These listeners are called when the Stream iterates the elements internally. The listeners are called once for each element in the stream. That way each listener gets to process each element in the stream. This is referred to as stream processing.

The listeners of a stream form a chain. The first listener in the chain can process the element in the stream, and then return a new element for the next listener in the chain to process. A listener can either return the same element or a new, depending on what the purpose of that listener (processor) is.

Obtain a Stream

There are many ways to obtain a Java Stream. One of the most common ways to obtain a Stream is from a Java Collection. Here is an example of obtaining a Stream from a Java List:

List<String> items = new ArrayList<String>();

items.add("one");
items.add("two");
items.add("three");

Stream<String> stream = items.stream();

This example first creates a Java List, then adds three Java Strings to it. Finally, the example calls the stream() method to obtain a Stream instance.

Terminal and Non-Terminal Operations

The Stream interface has a selection of terminal and non-terminal operations. A non-terminal stream operation is an operation that adds a listener to the stream without doing anything else. A terminal stream operation is an operation that starts the internal iteration of the elements, calls all the listeners, and returns a result.

Here is a Java Stream example which contains both a non-terminal and a terminal operation:

import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public class StreamExamples {

    public static void main(String[] args) {
        List<String> stringList = new ArrayList<String>();

        stringList.add("ONE");
        stringList.add("TWO");
        stringList.add("THREE");

        Stream<String> stream = stringList.stream();

        long count = stream
            .map((value) -> { return value.toLowerCase(); })
            .count();

        System.out.println("count = " + count);

    }
}

The call to the map() method of the Stream interface is a non-terminal operation. It merely sets a lambda expression on the stream which converts each element to lowercase. The map() method will be covered in more detail later on.

The call to the count() method is a terminal operation. This call starts the iteration internally, which will result in each element being converted to lowercase and then counted.

The conversion of the elements to lowercase does not actually affect the count of elements. The conversion part is just there as an example of a non-terminal operation.

Non-Terminal Operations

The non-terminal stream operations of the Java Stream API are operations that transform or filter the elements in the stream. When you add a non-terminal operation to a stream, you get a new stream back as result. The new stream represents the stream of elements resulting from the original stream with the non-terminal operation applied. Here is an example of a non-terminal operation added to a stream - which results in a new stream:

List<String> stringList = new ArrayList<String>();

stringList.add("ONE");
stringList.add("TWO");
stringList.add("THREE");
    
Stream<String> stream = stringList.stream();
    
Stream<String> stringStream =
    stream.map((value) -> { return value.toLowerCase(); });

Notice the call to stream map() . This call actually returns a new Stream instance representing the original stream of strings with the map operation applied.

You can only add a single operation to a given Stream instance. If you need to chain multiple operations after each other, you will need to apply the second operation to the Stream operation resulting from the first operation. Here is how that looks:

Stream<String> stringStream1 =
        stream.map((value) -> { return value.toLowerCase(); });

Stream<½String> stringStream2 =
        stringStream1.map((value) -> { return value.toUpperCase(); });

Notice how the second call to Stream map() is called on the Stream returned by the first map() call.

It is quite common to chain the calls to non-terminal operations on a Java Stream. Here is an example of chaining the non-terminal operation calls on Java streams:

Stream<String> stream1 = stream
  .map((value) -> { return value.toLowerCase(); })
  .map((value) -> { return value.toUpperCase(); })
  .map((value) -> { return value.substring(0,3); });

Many non-terminal Stream operations can take a Java Lambda Expression as parameter. This lambda expression implements a Java functional interface that fits the given non-terminal operation. For instance, the Function or Predicate interface. The parameter of the non-terminal operation method parameter is typically a functional interface - which is why it can also be implemented by a Java lambda expression.

filter()

The Java Stream filter() can be used to filter out elements from a Java Stream. The filter method takes a Predicate which is called for each element in the stream. If the element is to be included in the resulting Stream, the Predicate should return true. If the element should not be included, the Predicate should return false.

Here is an example of calling the Java Stream filter() method:

Stream<String> longStringsStream = stream.filter((value) -> {
    return value.length() >= 3;
});

map()

The Java Stream map() method converts (maps) an element to another object. For instance, if you had a list of strings it could convert each string to lowercase, uppercase, or to a substring of the original string, or something completely else. Here is a Java Stream map() example:

List<String> list = new ArrayList<String>();
Stream<String> stream = list.stream();

Stream<String> streamMapped = stream.map((value) -> value.toUpperCase());

flatMap()

The Java Stream flatMap() methods maps a single element into multiple elements. The idea is that you "flatten" each element from a complex structure consisting of multiple internal elements, to a "flat" stream consisting only of these internal elements.

For instance, imagine you have an object with nested objects (child objects). Then you can map that object into a "flat" stream consisting of itself plus its nested objects - or only the nested objects. You could also map a stream of Lists of elements to the elements themselves. Or map a stream of strings to a stream of words in these strings - or to the individual Character instances in these strings.

Here is an example that flatmaps a List of strings to the words in each string. This example should give you an idea about how flatMap() can be used to map a single element into multiple elements.

List<String> stringList = new ArrayList<String>();

stringList.add("One flew over the cuckoo's nest");
stringList.add("To kill a muckingbird");
stringList.add("Gone with the wind");

Stream<String> stream = stringList.stream();

stream.flatMap((value) -> {
    String[] split = value.split(" ");
    return (Stream<String>) Arrays.asList(split).stream();
})
.forEach((value) -> System.out.println(value))
;

This Java Stream flatMap() example first creates a List with 3 strings containing book titles. Then a Stream for the List is obtained, and flatMap() called.

The flatMap() operation called on the Stream has to return another Stream representing the flat mapped elements. In the example above, each original string is split into words, turned into a List, and the stream obtained and returned from that List.

Note that this example finishes with a call to forEach() which is a terminal operation. This call is only there to trigger the internal iteration, and thus flat map operation. If no terminal operation was called on the Stream chain, nothing would have happened. No flat mapping would actually have taken place.

distinct()

The Java Stream distinct() method is a non-terminal operation that returns a new Stream which will only contain the distinct elements from the original stream. Any duplicates will be eliminated. Here is an example of the Java Stream distinct() method:

List<String> stringList = new ArrayList<String>();

stringList.add("one");
stringList.add("two");
stringList.add("three");
stringList.add("one");

Stream<String> stream = stringList.stream();

List<String> distinctStrings = stream
        .distinct()
        .collect(Collectors.toList());

System.out.println(distinctStrings);

In this example the element one appears 2 times in the original stream. Only the first occurrence of this element will be included in the Stream returned by distinct(). Thus, the resulting List (from calling collect()) will only contain one, two and three. The output printed from this example will be:

[one, two, three]

limit()

The Java Stream limit() method can limit the number of elements in a stream to a number given to the limit() method as parameter. The limit() method returns a new Stream which will at most contain the given number of elements. Here is a Java Stream limit() example:

List<String> stringList = new ArrayList<String>();

stringList.add("one");
stringList.add("two");
stringList.add("three");
stringList.add("one");

Stream<String> stream = stringList.stream();
stream
    .limit(2)
    .forEach( element -> { System.out.println(element); });

This example first creates a Stream, then calls limit() on it, and then calls forEach() with a lambda that prints out the elements in the stream. Only the two first elements will be printed because of the limit(2) call.

peek()

The Java Stream peek() method is a non-terminal operation that takes a Consumer (java.util.function.Consumer) as parameter. The Consumer will get called for each element in the stream. The peek() method returns a new Stream which contains all the elements in the original stream.

The purpose of the peek() method is, as the method says, to peek at the elements in the stream, not to transform them. Keep in mind that the peek method does not start the internal iteration of the elements in the stream. You need to call a terminal operation for that. Here is a Java Stream peek() example:

List<String> stringList = new ArrayList<String>();

stringList.add("abc");
stringList.add("def");

Stream<String> stream = stringList.stream();

Stream<String> streamPeeked = stream.peek((value) -> {
    System.out.println("value");
});

Terminal Operations

The terminal operations of the Java Stream interface typicall return a single value. Once the terminal operation is invoked on a Stream, the iteration of the Stream and any of the chained streams will get started. Once the iteration is done, the result of the terminal operation is returned.

A terminal operation typically does not return a new Stream instance. Thus, once you call a terminal operation on a stream, the chaining of Stream instances from non-terminal operation ends. Here is an example of calling a terminal operation on a Java Stream:

long count = stream
  .map((value) -> { return value.toLowerCase(); })
  .map((value) -> { return value.toUpperCase(); })
  .map((value) -> { return value.substring(0,3); })
  .count();

It is the call to count() at the end of the example that is the terminal operation. Since count() returns a long, the Stream chain of non-terminal operations (the map() calls) is ended.

anyMatch()

The Java Stream anyMatch() method is a terminal operation that takes a single Predicate as parameter, starts the internal iteration of the Stream, and applies the Predicate parameter to each element. If the Predicate returns true for any of the elements, the anyMatch() method returns true. If no elements match the Predicate, anyMatch() will return false. Here is a Java Stream anyMatch() example:

List<String> stringList = new ArrayList<String>();

stringList.add("One flew over the cuckoo's nest");
stringList.add("To kill a muckingbird");
stringList.add("Gone with the wind");

Stream<String> stream = stringList.stream();

boolean anyMatch = stream.anyMatch((value) -> { return value.startsWith("One"); });
System.out.println(anyMatch);

In the example above, the anyMatch() method call will return true, because the first string element in the stream starts with "One".

allMatch()

The Java Stream allMatch() method is a terminal operation that takes a single Predicate as parameter, starts the internal iteration of elements in the Stream, and applies the Predicate parameter to each element. If the Predicate returns true for all elements in the Stream, the allMatch() will return true. If not all elements match the Predicate, the allMatch() method returns false. Here is a Java Stream allMatch() example:

List<String> stringList = new ArrayList<String>();

stringList.add("One flew over the cuckoo's nest");
stringList.add("To kill a muckingbird");
stringList.add("Gone with the wind");

Stream<String> stream = stringList.stream();

boolean allMatch = stream.allMatch((value) -> { return value.startsWith("One"); });
System.out.println(allMatch);

In the example above, the allMatch() method will return false, because only one of the strings in the Stream starts with "One".

noneMatch()

The Java Stream noneMatch() method is a terminal operation that will iterate the elements in the stream and return true or false, depending on whether no elements in the stream matches the Predicate passed to noneMatch() as parameter. The noneMatch() method will return true if no elements are matched by the Predicate, and false if one or more elements are matched. Here is a Java Stream noneMatch() example:

List<String> stringList = new ArrayList<String>();

stringList.add("abc");
stringList.add("def");

Stream<String> stream = stringList.stream();

boolean noneMatch = stream.noneMatch((element) -> {
    return "xyz".equals(element);
});

System.out.println("noneMatch = " + noneMatch);

collect()

The Java Stream collect() method is a terminal operation that starts the internal iteration of elements, and collects the elements in the stream in a collection or object of some kind. Here is a simple Java Stream collect() method example:

List<String> stringList = new ArrayList<String>();

stringList.add("One flew over the cuckoo's nest");
stringList.add("To kill a muckingbird");
stringList.add("Gone with the wind");

Stream<String> stream = stringList.stream();

List<String> stringsAsUppercaseList = stream
.map(value -> value.toUpperCase())
.collect(Collectors.toList());

System.out.println(stringsAsUppercaseList);

The collect() method takes a Collector (java.util.stream.Collector) as parameter. Implementing a Collector requires some study of the Collector interface. Luckily, the Java class java.util.stream.Collectors contains a set of pre-implemented Collector implementations you can use, for the most common operations. In the example above, it was the Collector implementation returned by Collectors.toList() that was used. This Collector simply collects all elements in the stream in a standard Java List

count()

The Java Stream count() method is a terminal operation which starts the internal iteration of the elements in the Stream, and counts the elements. Here is a Java Stream count() example:

List<String> stringList = new ArrayList<String>();

stringList.add("One flew over the cuckoo's nest");
stringList.add("To kill a muckingbird");
stringList.add("Gone with the wind");

Stream<String> stream = stringList.stream();

long count = stream.flatMap((value) -> {
    String[] split = value.split(" ");
    return (Stream<String>) Arrays.asList(split).stream();
})
.count();

System.out.println("count = " + count);

This example first creates a List of strings, then obtain the Stream for that List, adds a flatMap() operation for it, and then finishes with a call to count(). The count() method will start the iteration of the elements in the Stream which will result in the string elements being split up into words in the flatMap() operation, and then counted. The final result that will be printed out is 14.

findAny()

The Java Stream findAny() method can find a single element from the Stream. The element found can be from anywhere in the Stream. There is no guarantee about from where in the stream the element is taken. Here is a Java Stream findAny() example:

List<String> stringList = new ArrayList<String>();

stringList.add("one");
stringList.add("two");
stringList.add("three");
stringList.add("one");

Stream<String> stream = stringList.stream();

Optional<String> anyElement = stream.findAny();

System.out.println(anyElement.get());

Notice how the findAny() method returns an Optional. The Stream could be empty - so no element could be returned. You can check if an element was found via the Optional isPresent() method.

findFirst()

The Java Stream findFirst() method finds the first element in the Stream, if any elements are present in the Stream. The findFirst() method returns an Optional from which you can obtain the element, if present. Here is a Java Stream findFirst() example:

List<String> stringList = new ArrayList<String>();

stringList.add("one");
stringList.add("two");
stringList.add("three");
stringList.add("one");

Stream<String> stream = stringList.stream();

Optional<String> result = stream.findFirst();

System.out.println(result.get());

You can check if the Optional returned contains an element via its isPresent() method.

forEach()

The Java Stream forEach() method is a terminal operation which starts the internal iteration of the elements in the Stream, and applies a Consumer (java.util.function.Consumer) to each element in the Stream. The forEach() method returns void. Here is a Java Stream forEach() example:

List<String> stringList = new ArrayList<String>();

stringList.add("one");
stringList.add("two");
stringList.add("three");
stringList.add("one");

Stream<String> stream = stringList.stream();

stream.forEach( element -> { System.out.println(element); });

min()

The Java Stream min() method is a terminal operation that returns the smallest element in the Stream. Which element is the smallest is determined by the Comparator implementation you pass to the min() method. I have explained how the Comparator interface works in my tutorial about sorting Java collections. Here is a Java Stream min() example:

List<String> stringList = new ArrayList<String>();

stringList.add("abc");
stringList.add("def");

Stream<String> stream = stringList.stream();

Optional<String> min = stream.min((val1, val2) -> {
    return val1.compareTo(val2);
});

String minString = min.get();

System.out.println(minString);

Notice how the min() method returns an Optional which may or may not contain a result. If the Stream is empty, the Optional get() method will throw a NoSuchElementException.

max()

The Java Stream max() method is a terminal operation that returns the largest element in the Stream. Which element is the largest is determined by the Comparator implementation you pass to the max() method. I have explained how the Comparator interface works in my tutorial about sorting Java collections. Here is a Java Stream max() example:

List<String> stringList = new ArrayList<String>();

stringList.add("abc");
stringList.add("def");

Stream<String> stream = stringList.stream();

Optional<String> max = stream.max((val1, val2) -> {
    return val1.compareTo(val2);
});

String maxString = max.get();

System.out.println(maxString);

Notice how the max() method returns an Optional which may or may not contain a result. If the Stream is empty, the Optional get() method will throw a NoSuchElementException.

reduce()

The Java Stream reduce() method is a terminal operation that can reduce all elements in the stream to a single element. Here is a Java Stream reduce() example:

List<String> stringList = new ArrayList<String>();

stringList.add("One flew over the cuckoo's nest");
stringList.add("To kill a muckingbird");
stringList.add("Gone with the wind");

Stream<String> stream = stringList.stream();

Optional<String> reduced = stream.reduce((value, combinedValue) -> {
    return combinedValue + " + " + value;
});

System.out.println(reduced.get());

Notice the Optional returned by the reduce() method. This Optional contains the value (if any) returned by the lambda expression passed to the reduce() method. You obtain the value by calling the Optional get() method.

toArray()

The Java Stream toArray() method is a terminal operation that starts the internal iteration of the elements in the stream, and returns an array of Object containing all the elements. Here is a Java Stream toArray() example:

List<String> stringList = new ArrayList<String>();

stringList.add("One flew over the cuckoo's nest");
stringList.add("To kill a muckingbird");
stringList.add("Gone with the wind");

Stream<String> stream = stringList.stream();

Object[] objects = stream.toArray();

Concatenate Streams

The Java Stream interface contains a static method called concat() which can concatenate two streams into one. The result is a new Stream which contains all of the elements from the first stream, followed by all of the elements from the second stream. Here is an example of using the Java Stream concat() method:

List<String> stringList = new ArrayList<String>();

stringList.add("One flew over the cuckoo's nest");
stringList.add("To kill a muckingbird");
stringList.add("Gone with the wind");

Stream<String> stream1 = stringList.stream();

List<String> stringList2 = new ArrayList<>();
stringList2.add("Lord of the Rings");
stringList2.add("Planet of the Rats");
stringList2.add("Phantom Menace");

Stream<String> stream2 = stringList2.stream();

Stream<String> concatStream = Stream.concat(stream1, stream2);

List<String> stringsAsUppercaseList = concatStream
        .collect(Collectors.toList());

System.out.println(stringsAsUppercaseList);

Create Stream From Array

The Java Stream interface contains a static method called of() which can be used to create a Stream from one or more objects. Here is an example of using the Java Stream of() metho:

Stream<String> streamOf = Stream.of("one", "two", "three");

Java Stream API Critique

Having worked with other data streaming API's like the Apache Kafka Streams API, I have a bit of critique of the Java Stream API that I will share with you. They aren't big, important points of critique, but they are useful to have in the back of your head as you venture into stream processing.

Batch, Not Streaming

Despite its name, the Java Stream API is not truly a stream processing API. The Java Stream API's terminal operations return the final result of iterating through all the elements in the stream, and providing the non-terminal and terminal operations to the elements. The result of the terminal operation is returned after the last element in the stream has been processed.

Returning a final result after having processed the last element of a stream is only possible if you know what element is the last in the stream. The only way to know if a given element is the last element in a stream is, if you are processing a batch which has a last element. In contrast, a true stream does not have a last element. You never know if a given element is the last or not. Therefore it is not possible to perform a terminal operation on a stream. The best you can do is to collect the temporary results after the processing of a given element, but this would be sampling, not a final result.

Chain, Not Graph

The Java Stream API is designed so that a Stream instance can only be acted upon once. In other words, you can only add a single non-terminal operation to a Stream, resulting in a new Stream object. You can add another non-terminal operation to the resulting Stream object, but not to the first. The resulting structure of non-terminal Stream instances form a chain.

In a true stream processing API, the root stream and the event listeners can typically form a graph, not just a chain. Multiple listeners can listen to the root stream, and each listener may process the elements in the stream in its own way, and may forward a transformed element as a result. Each listener (non-terminal operation) can thus typically act as a stream itself which other listeners can listen to the results of. This is how Apache Kafka Streams is designed. Each listener (intermediate stream) could also have multiple listeners. The resulting structure forms a graph of listeners with listeners with listeners etc.

With a stream processing graph rather than a chain, there is not a single, final operation in the graph. By final operation I mean an operation which is guaranteed to be the last in the processing chain. Instead there can be multiple final operations. Each "leaf" in the graph is a final operation.

When your stream processing structure can be a graph with multiple final operations, the stream API cannot easily support terminal operations like the Java Stream API does. To support terminal operations easily, there has to be a single, final operation from which the final result is returned. A graph based stream processing API could instead support a "sample" operation where each node in the stream processing graph is asked for any value it may hold internally (e.g. a sum), if any (purely transforming listener nodes will not have any internal state).

Internal, Not External Iteration

The Java Stream API is deliberately designed to have internal iteration of the elements in a Stream. The iteration is started when a terminal operation is invoked on the Stream. In fact, for terminal operations to be able to return a result, the terminal operation has to initiate the iteration of the elements in the Stream.

Some graph based stream processing APIs are also designed to kind of hide the iteration of the elements from the user of the API (e.g. Apache Kafka Streams and RxJava). However, personally I prefer a design where each stream node (root stream and listeners) could have elements passed to them via a method call, and have that element be passed through the complete graph for processing. Such a design would make it easier to test each listener in the graph, as you can configure the graph and push elements through it afterwards, and finally check the result (the sampled state of the graph). Such a design would also enable the stream processing graph to have elements pushed into it via multiple nodes in the graph, and not just via the root stream.

Tweet
	Jakob Jenkov