Java streams are lazy

We all know how to use lambda expressions in Java. But recently I made a horrible mistake by not understanding that the Java Stream API is lazy.

I have been using lambdas for so long that I can't even imagine writing code without them. And up until today I confidently considered myself an expert in lambdas: I knew how to filter, map, collect and even peek and reduce. If you do too, then take a look at this code:

Stream<Integer> myHashedPasswords = Stream.of("passswd", "pass", "assword")
    .peek(value -> System.out.println("About to hash: " + value))
    .map(String::hashCode)
    .peek(n -> System.out.println("Done hashing, result: " + n));

Question: how many lines will this code print into stdout if executed?

If you think the answer is 6 (2 lines for each element), then think again. The correct answer is 0, because Java streams are executed lazily.

Laziness = performance

The Java 8 Streams API is built on a "process only on demand" strategy, and that is what makes it lazy. Intermediate operations do no work when they are called; they only describe the pipeline. This processing model is optimized so that large amounts of data can be handled with high performance.
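To make the "on demand" part tangible, here is a minimal sketch that runs a pipeline over an infinite source. The class name `OnDemandDemo`, the counter `EXAMINED` and the "first multiple of seven" task are made up for this illustration; only `Stream.iterate`, `peek`, `filter` and `findFirst` come from the JDK.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

class OnDemandDemo {

    // Hypothetical counter: how many source elements were actually pulled through the pipeline.
    static final AtomicInteger EXAMINED = new AtomicInteger();

    static Optional<Integer> firstMultipleOfSeven() {
        EXAMINED.set(0);
        return Stream.iterate(1, n -> n + 1)            // infinite source: 1, 2, 3, ...
                .peek(n -> EXAMINED.incrementAndGet())  // debugging hook only
                .filter(n -> n % 7 == 0)
                .findFirst();                           // terminal operation, pulls elements one by one
    }

    public static void main(String[] args) {
        Optional<Integer> result = firstMultipleOfSeven();
        System.out.println("Result: " + result.orElseThrow(IllegalStateException::new));
        System.out.println("Elements examined: " + EXAMINED.get());
    }
}
```

On a sequential stream like this one, the program finishes even though the source never ends: only the seven elements needed to produce the result are ever generated.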

The laziness is achieved by executing the stream only when a terminal operation is invoked. Let us take the code above and append a terminal operation to it:

List<Integer> myHashedPasswords = Stream.of("passswd", "pass", "assword")
        .peek(value -> System.out.println("About to hash: " + value))
        .map(String::hashCode)
        .peek(n -> System.out.println("Done hashing, result: " + n))
        .collect(Collectors.toList());

Then, 6 lines will be printed into stdout:

About to hash: passswd
Done hashing, result: -792030001
About to hash: pass
Done hashing, result: 3433489
About to hash: assword
Done hashing, result: -704243445

A terminal operation is an operation that produces a non-stream result, such as a primitive value, a collection, or no value at all. Here are the terminal operations of the Java 8 Stream interface: toArray(), collect(), count(), reduce(), forEach(), forEachOrdered(), min(), max(), anyMatch(), allMatch(), noneMatch(), findAny(), findFirst(), plus the less commonly used iterator() and spliterator().
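As a rough illustration of those result kinds, the sketch below runs four terminal operations over the same three strings. The class `TerminalOperationsDemo` and its method names are invented for this example; the stream calls themselves are standard JDK API.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class TerminalOperationsDemo {

    // A stream can be consumed only once, so every call builds a fresh one.
    static Stream<String> passwords() {
        return Stream.of("passswd", "pass", "assword");
    }

    // Primitive result: number of elements longer than four characters.
    static long countLong() {
        return passwords().filter(p -> p.length() > 4).count();
    }

    // Collection result: the length of each element, in encounter order.
    static List<Integer> lengths() {
        return passwords().map(String::length).collect(Collectors.toList());
    }

    // Optional result: the shortest element.
    static Optional<String> shortest() {
        return passwords().min(Comparator.comparingInt(String::length));
    }

    // No result at all: forEach() only performs a side effect.
    static void printAll() {
        passwords().forEach(System.out::println);
    }

    public static void main(String[] args) {
        System.out.println("count:    " + countLong());
        System.out.println("lengths:  " + lengths());
        System.out.println("shortest: " + shortest().orElse("<none>"));
        printAll();
    }
}
```

Each method ends in a terminal operation, so each one actually executes its pipeline.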

Short-Circuiting

Short-circuiting operations work much like Boolean short-circuit evaluation, where the second operand is evaluated only if the first one does not suffice to determine the value of the expression. For example:

if(input != null && input.isEmpty())

In this conditional statement, if input is null then only the first condition is evaluated. More or less the same technique is used in Java streams as well. Note that not all short-circuiting operations are terminal: the intermediate ones, such as limit(), cannot trigger stream execution on their own. Here's a list of the terminal short-circuiting operations: findFirst(), findAny(), anyMatch(), allMatch() and noneMatch().
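To see a terminal short-circuiting operation at work, here is a small sketch that counts how many elements anyMatch() looks at before it stops. The class `ShortCircuitDemo` and its method name are hypothetical, and the count assumes a sequential stream.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

class ShortCircuitDemo {

    // Returns how many elements anyMatch() pulled from the source before it stopped.
    static int elementsVisitedByAnyMatch() {
        AtomicInteger visited = new AtomicInteger();
        boolean found = Stream.of("passswd", "pass", "assword")
                .peek(value -> visited.incrementAndGet())  // debugging hook only
                .anyMatch("pass"::equals);                 // stops at the first match
        if (!found) {
            throw new IllegalStateException("expected \"pass\" to be found");
        }
        return visited.get();
    }

    public static void main(String[] args) {
        System.out.println("Elements visited: " + elementsVisitedByAnyMatch());
    }
}
```

The match is the second element, so the third one ("assword") is never touched.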

Let’s add a short-circuiting operation to our stream and examine its behaviour:

Stream<Integer> myHashedPasswords = Stream.of("passswd", "pass", "assword")
        .peek(value -> System.out.println("About to hash: " + value))
        .map(String::hashCode)
        .peek(n -> System.out.println("Done hashing, result: " + n))
        .limit(2);

Question: how many lines will this code print into stdout if executed?

If you think the answer is 4 (2 lines for each element), then you have not been paying attention. The correct answer is 0, because limit() is not a terminal operation.
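For comparison, here is a sketch of the same pipeline with a terminal operation appended after limit(2). The class `LimitDemo` and the `log` list are made up for this example (the messages are collected into a list instead of being printed, so they are easy to inspect), and the behaviour described assumes a sequential stream.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LimitDemo {

    // Runs the pipeline and returns the messages that peek() recorded along the way.
    static List<String> runWithTerminalOperation() {
        List<String> log = new ArrayList<>();
        List<Integer> hashes = Stream.of("passswd", "pass", "assword")
                .peek(value -> log.add("About to hash: " + value))
                .map(String::hashCode)
                .peek(n -> log.add("Done hashing, result: " + n))
                .limit(2)                        // short-circuiting, but still intermediate
                .collect(Collectors.toList());   // terminal operation: now the pipeline runs
        if (hashes.size() != 2) {
            throw new IllegalStateException("limit(2) should keep exactly two elements");
        }
        return log;
    }

    public static void main(String[] args) {
        runWithTerminalOperation().forEach(System.out::println);
    }
}
```

This time the 4 lines do show up: two for "passswd" and two for "pass". The third element is never hashed, because limit(2) is already satisfied.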

Moral

  1. Do not consider yourself an expert in anything!
  2. It is always worth reading the documentation of well-known components (docs for java.util.stream).
  3. Do not use peek() for anything other than debugging.

Credits

This post’s image was created by @randomspirits. Thank you!