Understanding Java stream takeWhile

Time:10-13

I have a piece of code and I'm not sure why it's working:

    public static void main(String[] args) {
        List<String> ppl = List.of("Bill", "Bob", "Jennifer", "Ben");
        List<String> newPpl = new ArrayList<>();
        AtomicBoolean isJenniferReached = new AtomicBoolean(false);

        ppl.stream()
                .takeWhile(person -> !isJenniferReached.get())
                .forEach(person -> {
                    newPpl.add(person + " 1");
                    if(person.equals("Jennifer")) {
                        isJenniferReached.set(true);
                    }
                });
        System.out.println(newPpl);
    }

The result is:

[Bill 1, Bob 1, Jennifer 1]

From my understanding, each member of the list passes through the entire intermediate pipeline, and only after all members have finished is the terminal operation executed on them.

If this is the case, how come this method works? Since the boolean is set to true only in the terminal operation, I would expect that to be too late, so that all members get processed.

But since this isn't the case, can you please help me understand what's going on?

CodePudding user response:

The behavior you've observed is correct.

For the condition !isJenniferReached.get() to evaluate to false, the string "Jennifer" must first be processed by the forEach operation (which also adds it to the list newPpl). Only then, when dealing with the next stream element, does takeWhile() terminate the execution of the pipeline.
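
To make that order visible, here is a small sketch that logs every call to the takeWhile predicate and to forEach. The class name TakeWhileTrace and the log list are made up for illustration, and it assumes a sequential stream on Java 9 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

class TakeWhileTrace {

    // Records the order in which takeWhile and forEach see each element.
    // The side effects are here only to observe the pipeline.
    static List<String> trace() {
        List<String> ppl = List.of("Bill", "Bob", "Jennifer", "Ben");
        List<String> log = new ArrayList<>();
        AtomicBoolean isJenniferReached = new AtomicBoolean(false);

        ppl.stream()
                .takeWhile(person -> {
                    log.add("takeWhile " + person);
                    return !isJenniferReached.get();
                })
                .forEach(person -> {
                    log.add("forEach " + person);
                    if (person.equals("Jennifer")) {
                        isJenniferReached.set(true);
                    }
                });
        return log;
    }

    public static void main(String[] args) {
        // Prints, one per line:
        // takeWhile Bill, forEach Bill, takeWhile Bob, forEach Bob,
        // takeWhile Jennifer, forEach Jennifer, takeWhile Ben
        trace().forEach(System.out::println);
    }
}
```

Each element goes through takeWhile and then straight into forEach before the next element is fetched. "Ben" is the first element for which the predicate returns false, so it never reaches forEach.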

Note

  • The code you've listed unnecessarily operates via side-effects, which is discouraged by the Stream API documentation.
  • Classes from the package java.util.concurrent.atomic are meant to be used in a multithreaded environment, not for hacky tricks where they serve as a substitute for a boolean flag or an int counter variable. They are slower and, more importantly, make the code much more convoluted than it needs to be.

This is how the logic "take strings from the source while Jennifer is not encountered" can be expressed more simply, without side-effects:

    List<String> newPpl = ppl.stream()
        .takeWhile(person -> !person.equals("Jennifer"))
        .map(person -> person + " 1")
        .toList(); // Java 16+; on older versions use collect(Collectors.toList())

Output:

[Bill 1, Bob 1]

CodePudding user response:

"From my understanding, each member of the list passes through the entire intermediate pipeline, and only after all members have finished is the terminal operation executed": this understanding is wrong.

It is the terminal operation that requests an item from the previous step, and this request ultimately leads to fetching an item from the source and processing it in the intermediate steps.
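
A minimal sketch of that pull behavior, assuming a sequential stream: the map function below does not run when the pipeline is built, and it runs only once when findFirst() asks for a single element. The class name LazyPull and the calls counter are hypothetical and exist only to observe the pipeline.

```java
import java.util.List;
import java.util.stream.Stream;

class LazyPull {

    // Returns {calls before the terminal operation, calls after it}.
    static int[] run() {
        int[] calls = {0}; // demonstration-only counter, not a pattern to copy

        // Building the pipeline does not process any element yet.
        Stream<String> pipeline = List.of("Bill", "Bob", "Jennifer", "Ben").stream()
                .map(person -> {
                    calls[0]++;
                    return person + " 1";
                });
        int before = calls[0];

        // The terminal operation pulls just as many elements as it needs.
        String first = pipeline.findFirst().get();
        int after = calls[0];

        System.out.println(first + ", map calls: " + before + " -> " + after);
        return new int[] {before, after};
    }

    public static void main(String[] args) {
        run(); // prints "Bill 1, map calls: 0 -> 1"
    }
}
```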

This is summarized in the Stream JavaDoc:

  • No storage. A stream is not a data structure that stores elements; [...]
  • Laziness-seeking. Many stream operations, such as filtering, mapping, or duplicate removal, can be implemented lazily, exposing opportunities for optimization. For example, "find the first String with three consecutive vowels" need not examine all the input strings. Stream operations are divided into intermediate (Stream-producing) operations and terminal (value- or side-effect-producing) operations. Intermediate operations are always lazy.
  • Possibly unbounded. While collections have a finite size, streams need not. [...]

Note that the last point is only possible if streams are evaluated lazily: you cannot store all the intermediate results of an unbounded stream in memory.
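
As an illustration of the unbounded case, the following sketch runs takeWhile over an infinite source. The class and method names are hypothetical; it assumes Java 9 or later and a sequential stream. It only terminates because elements are pulled one at a time until the predicate fails.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyUnbounded {

    // Collects the squares 1, 4, 9, ... that are smaller than the given limit.
    static List<Integer> squaresBelow(int limit) {
        return Stream.iterate(1, n -> n + 1)      // infinite source: 1, 2, 3, ...
                .map(n -> n * n)                  // applied per element, on demand
                .takeWhile(square -> square < limit)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(squaresBelow(50)); // prints [1, 4, 9, 16, 25, 36, 49]
    }
}
```

If every element had to pass through map before takeWhile saw the first one, this pipeline would never finish.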
