Please note that the specifications and other information contained herein are not final and are subject to change. The information is being made available to you solely for the purpose of evaluation.
Java™ Platform
Standard Ed. 8

DRAFT ea-b00

Package java.util.stream


Package java.util.stream Description

Classes to support functional-style operations on streams of values, as in the following:
     int sumOfWeights = blocks.stream().filter(b -> b.getColor() == RED)
                                       .mapToInt(b -> b.getWeight())
                                       .sum();
 

Here we use blocks, which might be a Collection, as a source for a stream, and then perform a filter-map-reduce on the stream to obtain the sum of the weights of the red blocks.

The key abstraction used in this approach is Stream, as well as its primitive specializations IntStream, LongStream, and DoubleStream. Streams differ from Collections in several ways:

Stream pipelines

Streams are used to create pipelines of operations. A complete stream pipeline has several components: a source (which may be a Collection, an array, a generator function, or an IO channel); zero or more intermediate operations such as Stream#filter or Stream#map; and a terminal operation such as Stream#forEach or Stream#reduce. Stream operations may take as parameters function values (which are often lambda expressions, but could be method references or objects) which parameterize the behavior of the operation, such as a Predicate passed to the Stream#filter method.
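The components of a pipeline can be sketched as follows. This is a minimal illustration, not part of the specification: the class PipelineDemo and its method are hypothetical names, and the code assumes the released Java SE 8 API, in which the int-producing mapping step is spelled mapToInt.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; its name and method are illustrative only.
final class PipelineDemo {

    // Sums the lengths of all words that have at least minLength characters.
    static int totalLengthOfLongWords(List<String> words, int minLength) {
        return words.stream()                              // source: a Collection
                    .filter(w -> w.length() >= minLength)  // intermediate: Predicate given as a lambda
                    .mapToInt(String::length)              // intermediate: function given as a method reference
                    .sum();                                // terminal: produces an int
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("stream", "of", "values", "in", "pipelines");
        System.out.println(totalLengthOfLongWords(words, 5)); // prints 21
    }
}
```

The lambda and the method reference are the function values that parameterize the behavior of filter and mapToInt respectively.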

Intermediate operations return a new Stream. They are lazy; executing an intermediate operation such as Stream#filter does not actually perform any filtering, instead creating a new Stream that, when traversed, contains the elements of the initial Stream that match the given Predicate. Consuming elements from the stream source does not begin until the terminal operation is executed.
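The deferral can be observed by counting how often the predicate is invoked. The following is a sketch under stated assumptions: LazyFilterDemo and predicateCalls are hypothetical names, and the counter exists only to make the timing visible.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; the counter is instrumentation for illustration only.
final class LazyFilterDemo {

    // Returns { predicate calls before the terminal operation,
    //           predicate calls after it, number of elements kept }.
    static int[] predicateCalls(List<String> source) {
        AtomicInteger calls = new AtomicInteger();

        // Building the pipeline does not run the predicate.
        Stream<String> nonEmpty = source.stream().filter(s -> {
            calls.incrementAndGet();
            return !s.isEmpty();
        });
        int beforeTerminal = calls.get();

        // The terminal operation starts consuming elements from the source.
        List<String> kept = nonEmpty.collect(Collectors.toList());
        int afterTerminal = calls.get();

        return new int[] { beforeTerminal, afterTerminal, kept.size() };
    }

    public static void main(String[] args) {
        int[] r = predicateCalls(Arrays.asList("a", "", "b"));
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints 0 3 2
    }
}
```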

Terminal operations consume the Stream and produce a result or a side-effect. After a terminal operation is performed, the stream can no longer be used.
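A short sketch of the single-use rule follows. SingleUseDemo is a hypothetical name, and the use of IllegalStateException reflects how the JDK implementation in the released Java SE 8 reports reuse; other implementations may behave differently.

```java
import java.util.stream.Stream;

// Hypothetical demo class showing that a stream cannot be traversed twice.
final class SingleUseDemo {

    // Returns true if a second terminal operation on the same stream is rejected.
    static boolean secondTraversalFails() {
        Stream<String> letters = Stream.of("a", "b", "c");
        letters.forEach(s -> { });          // first terminal operation consumes the stream
        try {
            letters.forEach(s -> { });      // second traversal of the same stream
            return false;
        } catch (IllegalStateException expected) {
            return true;                    // the JDK implementation detects the reuse
        }
    }

    public static void main(String[] args) {
        System.out.println(secondTraversalFails());
    }
}
```

To traverse the same data again, obtain a fresh stream from the source.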

Stream operations

Stream operations are divided into two categories: intermediate and terminal. An intermediate operation (such as filter or sorted) produces a new Stream; a terminal operation (such as forEach or findFirst) produces a non-Stream result, such as a primitive value or a Collection.

All intermediate operations are lazy, which means that executing a lazy operation does not trigger processing of the stream contents; all processing is deferred until the terminal operation commences. Processing streams lazily allows for significant efficiencies; in a pipeline such as the filter-map-sum example above, filtering, mapping, and addition can be fused into a single pass, with minimal intermediate state. Laziness also enables us to avoid examining all the data when it is not necessary; for operations such as "find the first string longer than 1000 characters", one need not examine all the input strings, just enough to find one that has the desired characteristics. (This behavior becomes even more important when the input stream is infinite and not merely large.)
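The "find the first string longer than N characters" case can be sketched as below. FindFirstDemo and examinedBeforeMatch are hypothetical names, the peek step is instrumentation only, and the count shown assumes a sequential stream on the JDK implementation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; peek is used only to count examined elements.
final class FindFirstDemo {

    // Returns how many input strings were examined before the first string
    // longer than minLength was found, or -1 if there is no such string.
    static int examinedBeforeMatch(List<String> strings, int minLength) {
        AtomicInteger examined = new AtomicInteger();
        Optional<String> first = strings.stream()
                .peek(s -> examined.incrementAndGet())   // counts each element pulled from the source
                .filter(s -> s.length() > minLength)
                .findFirst();                            // stops once a match is found
        return first.isPresent() ? examined.get() : -1;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "bb", "cccc", "dd", "eeeee");
        // Only "a", "bb", and "cccc" are examined; "dd" and "eeeee" are never touched.
        System.out.println(examinedBeforeMatch(input, 3));
    }
}
```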

Intermediate operations are further divided into stateless and stateful operations. Stateless operations retain no state from previously seen values when processing a new value; examples of stateless intermediate operations include filter and map. Stateful operations may incorporate state from previously seen elements in processing new values; examples of stateful intermediate operations include distinct and sorted. Stateful operations may need to process the entire input before producing a result; for example, one cannot produce any results from sorting a stream until one has seen all elements of the stream. As a result, under parallel computation, some pipelines containing stateful intermediate operations have to be executed in multiple passes. Pipelines containing exclusively stateless intermediate operations can be processed in a single pass, whether sequential or parallel.
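The barrier that sorted introduces can be made visible by tracing a small sequential pipeline. This is an illustrative sketch: SortedBarrierDemo and trace are hypothetical names, and recording events in a shared list is acceptable here only because the stream is sequential.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class tracing when elements enter and leave a sorted() step.
final class SortedBarrierDemo {

    // Records an event when an element reaches sorted() and when it leaves it.
    static List<String> trace(List<Integer> input) {
        List<String> events = new ArrayList<>();
        input.stream()                                   // sequential, so the shared list is safe here
             .peek(x -> events.add("seen " + x))         // stateless step upstream of the sort
             .sorted()                                   // stateful: needs every element first
             .forEach(x -> events.add("out " + x));
        return events;
    }

    public static void main(String[] args) {
        // All "seen" events come before the first "out" event.
        System.out.println(trace(Arrays.asList(3, 1, 2)));
    }
}
```

Replacing sorted with a stateless operation such as map would interleave the two kinds of event, one element at a time.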

Further, some operations are deemed short-circuiting operations. An intermediate operation is short-circuiting if, when presented with infinite input, it may produce a finite stream as a result. A terminal operation is short-circuiting if, when presented with infinite input, it may terminate in finite time. (Having a short-circuiting operation is a necessary, but not sufficient, condition for the processing of an infinite stream to terminate normally in finite time.)
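Both kinds of short-circuiting can be sketched against an infinite source. The example assumes the Stream.iterate, limit, and anyMatch methods of the released Java SE 8 API; ShortCircuitDemo and its methods are hypothetical names.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class using an infinite stream of powers of two.
final class ShortCircuitDemo {

    // limit is a short-circuiting intermediate operation: it turns the
    // infinite stream into a finite one of n elements.
    static List<Integer> firstPowersOfTwo(int n) {
        return Stream.iterate(1, x -> x * 2)
                     .limit(n)
                     .collect(Collectors.toList());
    }

    // anyMatch is a short-circuiting terminal operation: it returns as soon
    // as one element matches. If no element ever matched, this call would not
    // return, which is why short-circuiting is necessary but not sufficient.
    static boolean hasPowerOfTwoAbove(int threshold) {
        return Stream.iterate(1, x -> x * 2)
                     .anyMatch(x -> x > threshold);
    }

    public static void main(String[] args) {
        System.out.println(firstPowersOfTwo(5));       // prints [1, 2, 4, 8, 16]
        System.out.println(hasPowerOfTwoAbove(1000));  // prints true (1024 matches)
    }
}
```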

Parallelism

By recasting aggregate operations as a pipeline of operations on a stream of values, many aggregate operations can be more easily parallelized. A Stream can execute either in serial or in parallel, depending on how it was created. The Stream implementations in the JDK take the approach of creating serial streams unless parallelism is explicitly requested. For example, Collection has methods Collection.stream() and Collection.parallelStream(), which produce serial and parallel streams respectively. The set of operations on serial and parallel streams is identical. There are also stream operations Stream#sequential and Stream#parallel to convert between sequential and parallel execution. To execute the "sum of weights of blocks" query in parallel, we would do:

     int sumOfWeights = blocks.parallelStream().filter(b -> b.getColor() == RED)
                                               .mapToInt(b -> b.getWeight())
                                               .sum();
 

The only difference between the serial and parallel code is the creation of the initial Stream. Whether a Stream is serial or parallel can be determined by the Stream#isParallel method.
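The switches between serial and parallel execution can be sketched as follows. ParallelFlagDemo and its methods are hypothetical names; the flags shown assume the JDK's List implementations, for which parallelStream() returns a parallel stream.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class querying and switching the execution mode of a stream.
final class ParallelFlagDemo {

    // Reports isParallel() for four ways of obtaining a stream from the same list.
    static boolean[] parallelFlags(List<Integer> values) {
        return new boolean[] {
            values.stream().isParallel(),                       // serial by default
            values.parallelStream().isParallel(),               // parallelism explicitly requested
            values.stream().parallel().isParallel(),            // converted to parallel
            values.parallelStream().sequential().isParallel()   // converted back to serial
        };
    }

    // The same pipeline yields the same sum in either mode.
    static int sum(List<Integer> values, boolean parallel) {
        return (parallel ? values.parallelStream() : values.stream())
                .mapToInt(Integer::intValue)
                .sum();
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(Arrays.toString(parallelFlags(values)));
        System.out.println(sum(values, false) == sum(values, true));
    }
}
```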

In order for the results of parallel operations to be deterministic and consistent with their serial equivalent, it is important to ensure that the function values passed into the various stream operations be non-interfering.

Ordering

Streams may or may not have an encounter order. Whether or not there is an encounter order depends on the source, the intermediate operations, and the terminal operation. Certain stream sources (such as List or arrays) are intrinsically ordered, whereas others (such as HashSet) are not. Some intermediate operations may impose an encounter order on an otherwise unordered stream, such as Stream#sorted. Finally, some terminal operations may ignore encounter order, such as Stream#forEach, and others may have optimized implementations for the case where there is no defined encounter order.

If a Stream is ordered, certain operations are constrained to operate on the elements in their encounter order. If the source of a stream is a List containing [1, 2, 3], then the result of executing map(x -> x*2) must be [2, 4, 6]. However, if the source has no defined encounter order, then any permutation of the values [2, 4, 6] would be a valid result.
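The [1, 2, 3] example can be sketched for both an ordered and an unordered source. EncounterOrderDemo and its methods are hypothetical names, and Collectors.toList and Collectors.toSet are assumed from the released Java SE 8 API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical demo class contrasting ordered and unordered sources.
final class EncounterOrderDemo {

    // A List is intrinsically ordered, so the result preserves encounter
    // order even when the pipeline runs in parallel.
    static List<Integer> doubledInOrder(List<Integer> source) {
        return source.parallelStream()
                     .map(x -> x * 2)
                     .collect(Collectors.toList());
    }

    // A HashSet has no defined encounter order, so only the set of results
    // is meaningful; collecting into a Set avoids depending on any order.
    static Set<Integer> doubledAnyOrder(Set<Integer> source) {
        return source.stream()
                     .map(x -> x * 2)
                     .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        System.out.println(doubledInOrder(Arrays.asList(1, 2, 3)));   // prints [2, 4, 6]
        System.out.println(doubledAnyOrder(new HashSet<>(Arrays.asList(1, 2, 3))));
    }
}
```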

Non-interference

Copyright © 1993, 2013, Oracle and/or its affiliates. All rights reserved.
