Find duplicates in two lists List<int[]> with streams
I have a list of int arrays. I need to find duplicates of these arrays using 2
lists of int arrays.
I did try to implement it, but I'm getting an empty array.
keys = [[1,1,0], [0,0,1], [1,2,1], [1,3,1], [1,3,2]];
phaseKey = [[1,3,2], [1,2,1], [0,0,2], [1,2,3], [1,0,3]];
desired result: [[[1,3,2]], [1,2,1]];
My code:
Stream.concat(Stream.of(keys.stream().flatMapToInt(Arrays::stream)),
Stream.of(phaseKey.stream().flatMapToInt(Arrays::stream)))
.collect(Collectors.groupingBy(
Function.identity(),
Collectors.counting()))
.entrySet()
.stream()
.filter(m -> m.getValue() > 1)
.map(Map.Entry::getKey)
.toArray();
CodePudding user response:
Given your two lists:
List<int[]> keys = // ....
List<int[]> phaseKey = //...
You just need to filter to find common arrays in both lists:
List<int[]> duplicates = keys.stream()
.filter(k -> phaseKey.stream().anyMatch(p -> Arrays.equals(p,k)))
.collect(Collectors.toList());
CodePudding user response:
As @Pshemo has already pointed out in the comment, arrays have no proper implementation of the equals()
and hashCode()
methods. Which are crucial if for every object that is going to be utilized as a key in a map.
Arrays in Java are very special. They are not being created by instantiating a class, but they are objects, and they derive all methods from the java.lang.Object
class. Hence, equals()
and hashCode()
that are accessible on with arrays could be used to establish uniqueness, i.e. whether the two arrays are represented by the same object in memory.
In order to determine if the contents of the two array is identical you need to use static methods equals()
and hashCode()
provided by the Arrays
utility class. To be able to make use this functionality you can create a class that wraps over an array implement equals/hashCode
contract base on the array contents.
I've chosen to implement this wrapper as a Java 16 record
to make it leaner, but you can change it to be a regular class.
public record ArrayWrapper(int[] arr) {
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
ArrayWrapper other = (ArrayWrapper) o;
return Arrays.equals(arr, other.arr);
}
@Override
public int hashCode() {
return Arrays.hashCode(arr);
}
}
Now any array wrapped by this record is good to go to be used with hash-based data structures.
Because your example is ambiguous, below I've listed two different implementation.
Selecting arrays that are present in both lists
In order to make the process of checking whether a particular array contained in the phaseKey
list is also present in the keys
list, we create a HashSet
of ArrayWrapper
objects and then perform checks against this set. That will allow addressing this task in a linear time, performing only a single pass though each of the lists.
public static void main(String[] args) {
List<int[]> keys = List.of(new int[]{1, 1, 0}, new int[]{0, 0, 1}, new int[]{1, 2, 1},
new int[]{1, 3, 1}, new int[]{1, 3, 2});
List<int[]> phaseKey = List.of(new int[]{1, 3, 2}, new int[]{1, 2, 1}, new int[]{0, 0, 2},
new int[]{1, 2, 3}, new int[]{1, 0, 3});
Set<ArrayWrapper> wrappedKeys = keys.stream().map(ArrayWrapper::new).collect(Collectors.toSet());
List<int[]> result = phaseKey.stream()
.map(ArrayWrapper::new)
.filter(wrappedKeys::contains)
.distinct() // to ensure that each array will be present in the array only once
.map(ArrayWrapper::arr)
.collect(Collectors.toList()); // toList() with Java 16
result.forEach(arr -> System.out.println(Arrays.toString(arr)));
}
Fetching arrays having a number of occurrences more than one
To find out if a particular array appears more than once in one of the given lists (or in both) we can create an intermediate map Map<ArrayWrapper,Boolean>
by assigning the initial value for any key as false
(not a duplicate), and returning true
from the mergeFunction
that is meant to resolve duplicates.
public static void main(String[] args) {
List<int[]> keys = List.of(new int[]{1, 1, 0}, new int[]{0, 0, 1}, new int[]{1, 2, 1},
new int[]{1, 3, 1}, new int[]{1, 3, 2});
List<int[]> phaseKey = List.of(new int[]{1, 3, 2}, new int[]{1, 2, 1}, new int[]{0, 0, 2},
new int[]{1, 2, 3}, new int[]{1, 0, 3});
List<int[]> result = Stream.of(keys, phaseKey)
.flatMap(List::stream)
.map(ArrayWrapper::new)
.collect(Collectors.toMap( // creates an intermediate map Map<ArrayWrapper, Boolean>
Function.identity(),
next -> false, // first occurrence
(left, right) -> true)) // all subsequent occurrences
.entrySet().stream()
.filter(Map.Entry::getValue)
.map(Map.Entry::getKey)
.map(ArrayWrapper::arr)
.collect(Collectors.toList()); // toList() with Java 16
result.forEach(arr -> System.out.println(Arrays.toString(arr)));
}
Output
[1, 3, 2]
[1, 2, 1]