I want to get something like this below:
//Input:
{
"paths": [
[
"A","B"
],
[
"A","B","C"
],
[
"A","C"
],
[
"A","D"
],
[
"A"
]
]
}
//Output:
{
"paths": [
[
"A","B","C"
],
[
"A","D"
]
]
}
What my code looks like:
List<LinkedList<String>> paths = new LinkedList<>();
//...
//some process that adds elements to paths.
paths.sort(Comparator.comparingInt(LinkedList::size));
List<LinkedList<String>> uniquePaths = new ArrayList<>();
for (int i = 0; i < paths.size(); i ) {
boolean unique=true;
for (int j = i 1 ; j < paths.size(); j ) {
if (paths.get(j).containsAll(paths.get(i))){
unique=false;
break;
}
}
if(unique) {
uniquePaths.add(paths.get(i));
}
}
But this logic seems a bit off.(This is resulting in something like: [[A][A,D][A,B,C]]
) Please help with some better way of solving this.
Similar Question but in R: How to reduce (subset) list of lists?
CodePudding user response:
You're not checking each combination against the whole data set. Your mistake is rooted in the nested loop, where iteration starts from the index j=i 1
.
Since performance of the containsAll()
depends on the collection that is passed as an argument, it makes sense to generate a List of Sets out of the given list of list.
Here's how it can be implemented using Stream API.
List<List<String>> source = List.of(
List.of("A", "B"), List.of("A", "B", "C"),
List.of("A", "C"), List.of("A", "D"), List.of("A")
);
List<Set<String>> sets = source.stream()
.map(HashSet::new)
.collect(Collectors.toList());
List<List<String>> result = sets.stream()
.filter(set -> sets.stream().filter(s -> !s.equals(set)).noneMatch(s -> s.containsAll(set)))
.map(ArrayList::new)
.collect(Collectors.toList());
System.out.println(result);
Output:
[[A, B, C], [A, D]]
Note that we need to exclude the current combination of strings while checking whether it is a subset of any of the other elements or not, see filter(s -> !s.equals(set))
in the code above.