Difference between "parallelStream()" and "asSequence().asStream().parallel()"

Time:03-21

Using Kotlin 1.6.10, parallelStream() and asSequence().asStream().parallel() do not seem to behave the same: the latter doesn't seem to parallelize. Here is an example code snippet demonstrating the issue:

import java.util.stream.Collectors.toList
import kotlin.streams.asStream

fun main() {
    println("Available processors: ${Runtime.getRuntime().availableProcessors()}")

    val numbers = (10 downTo 0).toList()
    
    println("Using parallelStream:")
    val start1=System.currentTimeMillis()
    numbers.parallelStream().map { println("${it} - ${Thread.currentThread().getId()}"); Thread.sleep(100); it }.collect(toList<Int>())
    println("Execution time: ${System.currentTimeMillis()-start1}")
    
    println("Using asSequence().asStream().parallel():")
    val start2=System.currentTimeMillis()
    numbers.asSequence().asStream().parallel().map { println("${it} - ${Thread.currentThread().getId()}"); Thread.sleep(100); it }.collect(toList<Int>())
    println("Execution time: ${System.currentTimeMillis()-start2}")
}

You can run it in the Kotlin Playground: https://pl.kotl.in/z3sxKHKvL

The output is

Available processors: 2
Using parallelStream:
4 - 1
8 - 10
3 - 1
7 - 10
5 - 1
6 - 10
1 - 1
10 - 10
0 - 1
9 - 10
2 - 1
Execution time: 617
Using asSequence().asStream().parallel():
10 - 1
9 - 1
8 - 1
7 - 1
6 - 1
5 - 1
4 - 1
3 - 1
2 - 1
1 - 1
0 - 1
Execution time: 1105

I don't understand the reason for this difference in behavior.

CodePudding user response:

When you call stream() or parallelStream() directly on your List, the stream is built from the List's own spliterator(). That Spliterator knows its size and can split the work into chunks for parallel execution.
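To see that splitting in action, here is a minimal sketch that inspects the Spliterator behind a list-backed stream. The helper name splitOnce is hypothetical and only for illustration, and the exact part sizes depend on the List implementation, so the sketch only relies on the two parts adding up to the original size.

```kotlin
import java.util.Spliterator

// Hypothetical helper (not part of any library): splits the list's
// spliterator once and returns the estimated sizes of both parts.
fun splitOnce(list: List<Int>): Pair<Long, Long> {
    val rest = list.stream().spliterator()
    // trySplit() hands off a prefix of the elements, or returns null if it cannot split.
    val prefix = rest.trySplit() ?: return 0L to rest.estimateSize()
    return prefix.estimateSize() to rest.estimateSize()
}

fun main() {
    val numbers = (10 downTo 0).toList()
    val source = numbers.stream().spliterator()
    // A list-backed spliterator is expected to advertise a known size.
    println("SIZED: ${source.hasCharacteristics(Spliterator.SIZED)}")
    val (prefix, rest) = splitOnce(numbers)
    println("after one split: $prefix + $rest elements")
}
```

Because the source can be cut into parts of known size, the parallel stream can hand those parts to different worker threads.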

If you convert the List to a Sequence, asStream() no longer has direct access to the List. Calling asStream() on the Sequence creates a Stream backed by an ad hoc Spliterator built from the Sequence's iterator. That Spliterator does not know the size of the collection (Sequences have no size property), so it cannot divide the work evenly for parallel execution. With a list this small, everything effectively ends up on a single thread, as the output above shows.

Basically, asSequence() strips away size information and the built-in Spliterator implementation of List.
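The difference can be observed by comparing the two Spliterators directly. The following is a rough sketch, assuming the same setup as in the question; isSized is a hypothetical helper introduced only for this illustration. The list-backed stream should report that it is SIZED with an estimate equal to the list length, while the sequence-backed stream should report that it is not SIZED and give only an "unknown" (very large) estimate.

```kotlin
import java.util.Spliterator
import kotlin.streams.asStream

// Hypothetical helper: does this spliterator advertise a known size?
fun isSized(s: Spliterator<*>): Boolean = s.hasCharacteristics(Spliterator.SIZED)

fun main() {
    val numbers = (10 downTo 0).toList()

    // Spliterator behind the stream created directly from the List.
    val fromList = numbers.stream().spliterator()
    // Spliterator behind the stream created via a Sequence.
    val fromSequence = numbers.asSequence().asStream().spliterator()

    println("list:     sized=${isSized(fromList)}, estimate=${fromList.estimateSize()}")
    println("sequence: sized=${isSized(fromSequence)}, estimate=${fromSequence.estimateSize()}")
}
```

If the elements already live in a List, calling parallelStream() on the List itself keeps the size information that the parallel machinery relies on.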
