Java 8’s ForkJoinPool & parallel streams are receiving some serious attention now. They seem a sophisticated concept to parallelize work, but some are warning of serious limitations.
Edward Harned of CoopSoft, in particular, has analyzed the Fork/Join design implemented in Java.
He documents numerous problems with the suitability, efficiency & scalability of the “work stealing” algorithm it implements.
Key problems include:
- work-stealing causes high contention
- work-stealing is inefficient at parallelizing jobs
- I/O, long-running or blocking tasks do not schedule efficiently
- “recursive decomposition” overflows with uneven structures
- submission queues become a bottleneck
- framework may spawn hundreds of “continuation” threads
- tasks may run on unexpected threads
- ForkJoinPool is unreliable in EJB containers
There are also severe reservations about the efficiency of scheduler, and the handling of blocking tasks.
Essentially, his position is that Fork/Join is not “Parallel Processing”; instead, it’s a narrow and specialized technique, not applicable for general applications programming but only to a narrow category of “recursive decomposition” tasks.
We should consider our requirements carefully, and remember not to ignore alternatives such as ThreadPoolExecutor.
Are you experimenting with parallel streams? Would these problems affect your applications? Add your comment now.
See:
– Edward Harned: A Java Form-Join Calamity
– Pierre-yves Saumont: What’s Wrong in Java 8, Parallel Streams
– Stack Overflow: Performance problems using new Fork Join framework
Carsten Habicht asked a very good question:
But if we read the JDK’s own provided examples for ForkJoin usage, we may be able to make our own mind up.
Oracle’s ‘RecursiveTask’ example, they themselves describes as a “dumb” way to calculate Fibonacci & a linear algorithm would be faster in practice.
Oracle’s RecursiveAction ‘sumOfSquares’ example introduces both manual scheduling tweaks, task tracking & execution.. no clean algorithm here.
The only reason I can presume for Oracle to require all these tweaks, would be poor/failed performance without. Such problems even with the documentation examples (!) tend to support Harned’s thesis.
– RecursiveTask Javadoc
– RecursiveAction Javadoc