Java stream collect check if result would contain element

Time:05-17

As I couldn't find anything related to this, I am wondering if streams even allow this.

In my answer to another question, I have the following code, which adds an element to a result list only if the result list doesn't already contain it:

List<Entry<List<Integer>, Integer>> list = new ArrayList<>(diffMap.entrySet());
list.sort(Entry.comparingByValue());
List<List<Integer>> resultList = new ArrayList<>();
for (Entry<List<Integer>, Integer> entry2 : list) {
    if (!checkResultContainsElement(resultList, entry2.getKey()))
        resultList.add(entry2.getKey());
}

checkResultContainsElement method:

private static boolean checkResultContainsElement(List<List<Integer>> resultList, List<Integer> key) {
    // Flatten every list accepted so far into one list of values
    List<Integer> vals = resultList.stream().flatMap(List::stream)
            .collect(Collectors.toList());
    // true if any value of the candidate list has been seen before
    return key.stream().anyMatch(vals::contains);
}

Now I am wondering whether this for-loop:

for (Entry<List<Integer>, Integer> entry2 : list) {
    if (!checkResultContainsElement(resultList, entry2.getKey()))
        resultList.add(entry2.getKey());
}

can be realized using streams. I don't think the .filter() method would work, as it would remove data from List<Entry<List<Integer>, Integer>> list before I even know whether an element should be considered. I guess a custom collector could work, but I wouldn't know how to implement one, as the result keeps changing with each newly added element.

I am looking for something like this (can be different if something else is better):

list.stream().sorted(Entry.comparingByValue()).collect(???);

where ??? would filter the data and return it as a list.


A value that appears in one result list must not appear in any other result list. So these lists are valid together:

[1, 2, 3, 4]
[5, 6, 7, 8]
[12, 12, 12, 12]

but of these, only the first is valid:

[1, 2, 3, 4] <-- valid
[5, 3, 7, 8] <-- invalid: 3 already exists
[12, 12, 2, 12] <-- invalid: 2 already exists

CodePudding user response:

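Maybe something like this:

list.stream()
    .sorted(Entry.comparingByValue())
    .collect(ArrayList<List<Integer>>::new,
             // accumulator: add the key only if none of its values is already present
             (acc, e) -> { if (!checkResultContainsElement(acc, e.getKey())) acc.add(e.getKey()); },
             (left, right) -> { throw new UnsupportedOperationException("sequential streams only"); });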

CodePudding user response:

Setting aside for a moment whether the implementation is stream-based or not, the way the existing code checks the uniqueness of the values of incoming lists can be improved.

We can gain a significant performance improvement by maintaining a Set of previously encountered values, instead of re-flattening the whole result list and scanning it with List.contains() for every candidate.

That is, the values of each list added to the resulting list are stored in a set, and every incoming list is checked against that set before it is accepted.
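As a rough sketch of that idea applied to the plain for-loop from the question, assuming the keys are List<Integer> as in the question (the class name SeenValuesDemo, the helper pickDisjoint, and the sample data are made up for illustration):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SeenValuesDemo {

    // Keeps each candidate list only if none of its values appeared
    // in a previously accepted list (hypothetical helper name).
    static List<List<Integer>> pickDisjoint(List<List<Integer>> candidates) {
        Set<Integer> seen = new HashSet<>();
        List<List<Integer>> result = new ArrayList<>();
        for (List<Integer> candidate : candidates) {
            // disjoint(...) is true when the two collections share no element
            if (Collections.disjoint(seen, candidate)) {
                seen.addAll(candidate);
                result.add(candidate);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Integer>> candidates = List.of(
                List.of(1, 2, 3, 4),
                List.of(5, 3, 7, 8),     // shares 3 with the first list
                List.of(12, 12, 2, 12)); // shares 2 with the first list
        System.out.println(pickDisjoint(candidates)); // prints [[1, 2, 3, 4]]
    }
}
```

Each membership test against the HashSet takes constant time on average, so the cost per candidate depends on the candidate's size rather than on how many values have been accepted so far.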

Since the operations of a stream pipeline should be stateless, and a collector shouldn't hold state of its own (changes should happen only inside its mutable container), we can approach this problem by defining a container that holds both the resulting list of lists of Foo and a set of foo-values.

I've implemented this container as a Java 16 record:

public record FooContainer(Set<Integer> fooValues, List<List<Foo>> foosList) {
    public void tryAdd(List<Foo> foos) {
        if (!hasValue(foos)) {
            foos.forEach(foo -> fooValues.add(foo.getValue()));
            foosList.add(foos);
        }
    }
    
    public boolean hasValue(List<Foo> foos) {
        return foos.stream().map(Foo::getValue).anyMatch(fooValues::contains);
    }
}

The record shown above is used as the mutable container of a custom collector created with Collector.of(). The collector's accumulator makes use of the tryAdd() method defined by the container, and the finisher extracts the resulting list from the container.

Note that this operation is not parallelizable: whether a list is accepted depends on every list accepted before it. Hence the combiner of the collector throws an AssertionError.

public static void main(String[] args) {
    Map<List<Foo>, Integer> diffMap =
        Map.of(List.of(new Foo(1), new Foo(2), new Foo(3)), 1,
               List.of(new Foo(1), new Foo(4), new Foo(5)), 2,
               List.of(new Foo(7), new Foo(8), new Foo(9)), 3);
    
    List<List<Foo>> result = diffMap.entrySet().stream()
        .sorted(Map.Entry.comparingByValue())
        .map(Map.Entry::getKey)
        .collect(Collector.of(
            () -> new FooContainer(new HashSet<>(), new ArrayList<>()),
            FooContainer::tryAdd,
            (left, right) -> {throw new AssertionError("The operation isn't parallelizable");},
            FooContainer::foosList
        ));

    System.out.println(result);
}

Output:

[[Foo{1}, Foo{2}, Foo{3}], [Foo{7}, Foo{8}, Foo{9}]]
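The Foo class itself is not shown above, so the following is only a self-contained sketch of the same collector that can be compiled as one file. It assumes a minimal Foo record with a single int component; the names DisjointCollectorDemo, Container, disjointLists, and select are invented for the example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collector;

public class DisjointCollectorDemo {

    // Assumed shape of Foo; the original class is not shown.
    record Foo(int value) {}

    // Mutable container: the accepted lists plus every value seen so far.
    record Container(Set<Integer> seen, List<List<Foo>> accepted) {
        void tryAdd(List<Foo> foos) {
            boolean clash = foos.stream().map(Foo::value).anyMatch(seen::contains);
            if (!clash) {
                foos.forEach(foo -> seen.add(foo.value()));
                accepted.add(foos);
            }
        }
    }

    // Sequential-only collector (hypothetical helper name).
    static Collector<List<Foo>, ?, List<List<Foo>>> disjointLists() {
        return Collector.of(
                () -> new Container(new HashSet<>(), new ArrayList<>()),
                Container::tryAdd,
                (left, right) -> {
                    throw new UnsupportedOperationException("sequential streams only");
                },
                Container::accepted);
    }

    // Visits the keys in ascending order of their map value.
    static List<List<Foo>> select(Map<List<Foo>, Integer> diffMap) {
        return diffMap.entrySet().stream()
                .sorted(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .collect(disjointLists());
    }

    public static void main(String[] args) {
        Map<List<Foo>, Integer> diffMap = Map.of(
                List.of(new Foo(1), new Foo(2), new Foo(3)), 1,
                List.of(new Foo(1), new Foo(4), new Foo(5)), 2,
                List.of(new Foo(7), new Foo(8), new Foo(9)), 3);
        System.out.println(select(diffMap));
    }
}
```

With the record's default toString(), the elements print as Foo[value=1] rather than Foo{1}; the selected lists are the same two as in the output above.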