Parallel Streams in Java – Explanation & Advantages

A parallel stream splits the data source into chunks and processes them concurrently using the Fork/Join framework (the common pool). It’s a drop-in alternative to sequential streams for CPU-bound work on sufficiently large data sets.

How to create a Parallel Stream

// From a collection
list.parallelStream()
    .map(this::expensiveComputation)
    .reduce(0, Integer::sum);

// Toggle within a pipeline (note: the last parallel()/sequential() call
// wins for the WHOLE pipeline; you cannot mix modes stage-by-stage)
list.stream().parallel()    // request parallel execution
    .map(...)
    .sequential()           // this call wins: the entire pipeline runs sequentially
    .collect(Collectors.toList());

// Primitive streams (best for math-heavy workloads)
long sum = java.util.stream.LongStream.range(0, 10_000_000)
    .parallel()
    .map(x -> x * x)
    .sum();

When parallel streams shine

  • CPU-bound, expensive operations: each element keeps the cores busy, so the splitting overhead gets amortized.
  • Large, easily splittable data sources (arrays, ArrayList, IntStream.range): efficient spliterators provide balanced chunks.
  • Order doesn’t matter: dropping encounter order (unordered()) can speed up some operations.
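The "splittable source" point can be checked directly: a spliterator that reports SUBSIZED splits into chunks of known size, which is what lets the Fork/Join framework balance work up front. A minimal sketch (the class and method names are illustrative):

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Spliterator;

public class SplitQuality {
    // SUBSIZED means every chunk produced by trySplit() also knows its exact
    // size, so the framework can hand out balanced halves cheaply.
    static boolean splitsWell(List<?> list) {
        return list.spliterator().hasCharacteristics(Spliterator.SUBSIZED);
    }

    public static void main(String[] args) {
        System.out.println(splitsWell(new ArrayList<>(List.of(1, 2, 3))));  // true
        System.out.println(splitsWell(new LinkedList<>(List.of(1, 2, 3)))); // false
    }
}
```

ArrayList and arrays report SUBSIZED; LinkedList reports only SIZED, so each split must walk nodes, which is why it parallelizes poorly.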

Advantages

  • Simple parallelism: just call parallel() / parallelStream(), no manual thread code.
  • Scales with cores: uses a shared ForkJoin common pool sized to available processors.
  • Works with reductions/collectors: built-ins like sum(), Collectors.groupingByConcurrent can aggregate in parallel.
  • Composable: same stream API; you can switch between sequential and parallel in a pipeline.

Common pitfalls (& how to avoid them)

  • Small tasks / small data: parallel overhead can make it slower. Rule of thumb: go parallel only when per-element work is heavy or N is large.
  • I/O or blocking calls: parallel streams use the common pool; blocking starves threads. Use reactive I/O or a dedicated executor instead.
  • Side effects & non-thread-safe state: keep operations stateless, non-interfering, side-effect-free. Avoid mutating shared collections.
  • Ordering: forEach is unordered in parallel. Use forEachOrdered (slower) if you must preserve encounter order.
  • Associativity required: reduce needs an associative accumulator, and collect needs compatible accumulator/combiner functions (or a concurrent collector) for correct parallel results.
  • Data source matters: ArrayList/arrays split well; LinkedList and iterate() don’t (poor splitting leads to worse performance).
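To make the side-effect pitfall concrete, here is a small sketch contrasting the unsafe pattern (mutating a shared ArrayList from forEach) with the safe one (letting a collector build the result); the method name squaresSafe is made up for illustration:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class SideEffectPitfall {
    // Safe: the collector builds the list; no shared mutable state, and the
    // collector still preserves encounter order for an ordered source.
    static List<Integer> squaresSafe(int n) {
        return IntStream.range(0, n).parallel()
                .map(x -> x * x)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // UNSAFE (don't do this): ArrayList is not thread-safe, so concurrent
        // add() calls from parallel forEach can drop elements or throw.
        //   List<Integer> shared = new ArrayList<>();
        //   IntStream.range(0, n).parallel().forEach(x -> shared.add(x * x));

        System.out.println(squaresSafe(5)); // [0, 1, 4, 9, 16]
    }
}
```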

Correct patterns

1) Parallel frequency count (concurrent map)

import static java.util.stream.Collectors.*;

Map<String, Long> freq =
    words.parallelStream()
         .collect(groupingByConcurrent(w -> w, counting())); // concurrent collector

2) Parallel, order-agnostic processing

list.parallelStream()
    .unordered()                 // hint: order not required
    .map(this::heavy)
    .forEach(x -> sink.accept(x)); // OK if sink is thread-safe

3) Safe reduction (associative)

int sum = nums.parallelStream()
              .reduce(0, Integer::sum); // associative, safe

4) Custom pool (advanced)

By default, parallel streams use the common ForkJoinPool. To isolate a stream from other parallel tasks, you can run it inside your own pool. Note this relies on an implementation detail (tasks forked from inside a ForkJoinPool thread stay in that pool) rather than documented stream behavior, but it is a widely used workaround:

var pool = new java.util.concurrent.ForkJoinPool(8);
List<Result> results = pool.submit(() ->
    items.parallelStream().map(this::heavy).toList()
).join(); // stream inside this task uses this pool
pool.shutdown();
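Relatedly, the common pool's size can be inspected at runtime; a minimal sketch, assuming default JVM settings:

```java
import java.util.concurrent.ForkJoinPool;

public class CommonPoolSize {
    static int commonParallelism() {
        // Defaults to availableProcessors() - 1 (minimum 1). It can be
        // overridden, before the pool is first used, with the JVM flag:
        //   -Djava.util.concurrent.ForkJoinPool.common.parallelism=N
        return ForkJoinPool.commonPool().getParallelism();
    }

    public static void main(String[] args) {
        System.out.println(commonParallelism());
    }
}
```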

Performance tips

  • Prefer primitive streams (IntStream, LongStream) to avoid boxing overhead.
  • Use mapToInt/mapToLong where possible.
  • Remove ordering when not needed (unordered()).
  • Benchmark with realistic data (e.g., JMH); don’t assume parallel is faster.
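The first two tips can be illustrated with a sketch comparing a boxed reduction against mapToInt (method names are illustrative); both give the same answer, but the primitive version avoids allocating an Integer per intermediate result:

```java
import java.util.List;

public class BoxingDemo {
    // Boxed: every partial sum is an Integer object (box/unbox on each step).
    static int sumBoxed(List<Integer> nums) {
        return nums.parallelStream().reduce(0, Integer::sum);
    }

    // Primitive: mapToInt switches to IntStream, so the reduction runs on raw ints.
    static int sumPrimitive(List<Integer> nums) {
        return nums.parallelStream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        List<Integer> nums = List.of(1, 2, 3, 4, 5);
        System.out.println(sumBoxed(nums));     // 15
        System.out.println(sumPrimitive(nums)); // 15
    }
}
```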

Quick decision guide

  • CPU-heavy + big data + splittable source? ✅ Try parallel streams.
  • I/O-bound or tiny tasks? ❌ Stick to sequential or use async I/O.
  • Needs strict order? ⚠️ Use sequential or forEachOrdered (accept perf hit).
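For the strict-order case, forEachOrdered guarantees the terminal action runs in encounter order (with a happens-before edge between elements), so even a non-thread-safe StringBuilder is safe there while the upstream map still runs in parallel. A minimal sketch:

```java
import java.util.stream.IntStream;

public class OrderedOutput {
    static String squaresInOrder(int n) {
        StringBuilder sb = new StringBuilder();
        IntStream.range(0, n).parallel()
                .map(x -> x * x)
                .forEachOrdered(sb::append); // serialized, in encounter order
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(squaresInOrder(5)); // 014916
    }
}
```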