Saturday, July 25, 2015

Java 8 Streams


Introduction

Every application create and process collections. In Java until recently if you want to do some "finding" or "grouping" on collections you must code it yourself. It was not very exciting and it is repetitive job in nature. groovy for example offers great tools for transforming and managing collection. Check this link for some great examples. Java 8 borrows some concepts from groovy, but also go one step forward with multi core processing and stream concepts.

In SQL you don't need to implement how to calculate grouping or something else, you just describe your expectation (what you want to have).  Stream API in Java 8 is guided with same philosophy.

What is stream?

Stream is basically a sequence of elements from a source that supports aggregate operations. Let's break this statement:
  • Sequence of elements: Stream provides an interface to a sequenced set of values. Implementation of this interface don't store values, values are calculated on run-time.
  • Source: This is where are values are stored. Collection, arrays, I/O.
  • Aggregate operations: All common SQL-like (group, count, sum) and function programming languages constructions (filter, map, reduce, find, match, sorted).
Streams also have to fundamental characteristics:
  • Pipelining: This allows operation on stream to be chained into large pipeline. 
  • Internal iteration: Collections are iterated externally (explicit iteration), stream do the iteration behind the scenes.
Streams are not collections! In a nutshell, collections are about data and streams are about computations. The difference between collections and streams has to do with when things are computed. Every element in the collection has to be computed before it can be added to the collection. In contrast, a stream is a conceptually fixed data structure in which elements are computed on demand. For example in following example no work is actually done until collect is invoked:
List numbers = Arrays.asList(1, 4, 1, 4, 2, 8, 5);
List distinct = numbers.stream().map( i -> i*i).
      distinct().collect(Collectors.toList());
System.out.printf("integers: %s, squares : %s %n", numbers, distinct);
There are two types of stream operations:
  • Intermediate: can be connected together because their return type is a Stream.
  • Terminal: this kind of operation produce a result from a pipeline such as a List, an Integer, or even void (any non-Stream type).
Intermediate operations do not perform any processing until a terminal operation is invoked on the stream pipeline; they are “lazy.”

Streams also use short-circuiting where we need to process only part of the stream, not all of it, to return a result. This is similar to evaluating a large Boolean expression chained with the and operator.

This was just high level overview W/O detailed examples of stream API. It is easy to find examples on other sources like here. In my opinion Stream API is great and refreshing new feature in Java 8, especially with it's lazy, short-circuit multi core features.

No comments:

Post a Comment