Home > Blockchain >  Remove items from a Set that are present in a List - Streams
Remove items from a Set that are present in a List - Streams

Time:05-31

I am trying to remove some objects from a Set (HashSet), but only if they are also present into a List (LinkedList). How can I achieve this with Java 8 features (streams).

Set<MyObject> theSet -> want to remove items that are present in the list

List<MyObject> theList

I have overriden the equals and hashcode for MyObject (use only a few fields for comparison of equality).

CodePudding user response:

If your set is mutable, you could just remove all entries present in your list from the set like so:

theSet.removeAll(theList);

Suppose your set is immutable, then you can filter it with a stream resulting in a new set missing the entries present in the list like so:

var newSet = theSet.stream()
    .filter(n -> !theList.contains(n))
    .collect(Collectors.toSet());

For Java 11 you could use a method reference for the filter predicate via composition:

var newSet = theSet.stream()
    .filter(Predicate.not(theList::contains))
    .collect(Collectors.toSet());

Little note on performance

Both approaches (removeAll and via streams) run in O(N * M) where N is the size of theSet and M the size of theList. It boils down to two nested for-loops.

An easy enhancement would be to turn theList into a set and driving the cost of the linear contains check down to an asymptotic runtime of O(1).

var numsToExclude = new HashSet<>(theList);
var newSet = theSet.stream()
    .filter(Predicate.not(numsToExclude::contains))
    .collect(Collectors.toSet());

CodePudding user response:

You can take a look at the removeIf() method from the Collection interface. So you can say something like:

theSet.removeIf(theList::contains)

It will also return a boolean that indicates whether any elements were removed or not.

CodePudding user response:

I doubt a stream is as fast as a simple loop consider the streams overhead. You are always going have to iterate over the entire list so I would do it like so. This presumes the original set is immutable.

List<Integer> list = List.of(1,2,5,8,9,10);
Set<Integer> set = Set.of(3,4,8,2,1);
Set<Integer> result = new HashSet<>(set);
        
for(int val : list) {
     result.remove(val);
}
System.out.println("Before: "   set);
System.out.println("After:  "   result);

prints

Before: [1, 8, 4, 3, 2]
After:  [3, 4]

Since Sets can't hold duplicates, encountering duplicates in the removal list won't affect the outcome. So if you could gather those in a Set rather than a List it might offer some improvement.

Finally, your Object to be removed must override equals and hashCode for the above to work.

  • Related