I’m trying to obtain a list containing only the duplicated numbers from a list of integers:
final Set<Integer> setOfNmums = new HashSet<>();
Arrays.asList(5,6,7,7,7,6,2,4,2,4).stream()
.peek(integer -> System.out.println("XX -> " + integer))
.filter(n -> !setOfNmums.add(n))
.peek(System.out::println)
.map(String::valueOf)
.sorted()
.collect(Collectors.toList());
The output is 2,4,6,7,7
Expected: 2,4,6,7
I don’t understand how that’s happening. Is this running in parallel? How am I getting two '7's?
HashSet.add should return false if the element already exists, and the filter uses that result, doesn't it?
Yes, I can use distinct, but I’m curious to know why the filter fails. Is it being done in parallel?
CodePudding user response:
Your filter rejects the first occurrence of each element and accepts all subsequent occurrences. Therefore, when an element occurs n times, you’ll add it n-1 times.
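To make the n-1 behaviour visible, here is a minimal sketch that runs the same kind of filter on the question's input and collects what passes. The class name StatefulFilterDemo and the method repeatsOnly are made up for this illustration; the stream is sequential, so no parallelism is involved.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class StatefulFilterDemo {

    // Hypothetical helper: keeps every element for which Set.add returns false,
    // i.e. every occurrence after the first one.
    static List<Integer> repeatsOnly(List<Integer> input) {
        Set<Integer> seen = new HashSet<>();
        return input.stream()
                .filter(n -> !seen.add(n))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // 7 occurs three times, so it passes the filter twice.
        System.out.println(repeatsOnly(List.of(5, 6, 7, 7, 7, 6, 2, 4, 2, 4)));
        // prints [7, 7, 6, 2, 4]
    }
}
```

Sorting that intermediate result gives the 2,4,6,7,7 from the question.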
Since you want to accept all elements which occur more than once, but only accept them a single time, you could use .filter(n -> !setOfNmums.add(n)).distinct()
or enhance the set to a map, so that an element is accepted only on its second occurrence.
Map<Integer, Integer> occurrences = new HashMap<>();
List<String> result = Stream.of(5,6,7,7,7,6,2,4,2,4)
.filter(n -> occurrences.merge(n, 1, Integer::sum) == 2)
.map(String::valueOf)
.sorted()
.collect(Collectors.toList());
But generally, using stateful filters with streams is discouraged.
A cleaner solution would be
List<String> result = Stream.of(5,6,7,7,7,6,2,4,2,4)
.collect(Collectors.collectingAndThen(
Collectors.toMap(String::valueOf, x -> true, (a,b) -> false, TreeMap::new),
map -> { map.values().removeIf(b -> b); return new ArrayList<>(map.keySet()); }));
Note that this approach doesn’t count the occurrences but only remembers whether an element is unique or has been seen at least a second time. This works by mapping each element to true with the second argument to the toMap collector, x -> true, and resolving multiple occurrences with a merge function of (a,b) -> false. The subsequent map.values().removeIf(b -> b) will remove all unique elements, i.e. those mapped to true.
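If it helps to see the intermediate state, the following sketch splits the collector into two steps and prints the map before and after the removeIf call. The class name UniqueFlagDemo and the method uniqueFlags are assumptions made for this illustration only; the toMap call itself is the one from the answer.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class UniqueFlagDemo {

    // Hypothetical helper: maps each element (as a String) to true if it was
    // seen exactly once, or to false if it was seen more than once.
    static Map<String, Boolean> uniqueFlags(List<Integer> input) {
        return input.stream()
                .collect(Collectors.toMap(String::valueOf, x -> true, (a, b) -> false, TreeMap::new));
    }

    public static void main(String[] args) {
        Map<String, Boolean> flags = uniqueFlags(List.of(5, 6, 7, 7, 7, 6, 2, 4, 2, 4));
        System.out.println(flags);          // {2=false, 4=false, 5=true, 6=false, 7=false}

        // Drop the entries still mapped to true, i.e. the unique elements.
        flags.values().removeIf(b -> b);
        System.out.println(flags.keySet()); // [2, 4, 6, 7]
    }
}
```

Because the keys are strings in a TreeMap, the result is ordered lexicographically, which matches the numeric order only as long as all values are single-digit, as they are here.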
CodePudding user response:
You can use the .distinct() function in your stream.
CodePudding user response:
Since Holger already explained why your solution didn't work, I'll just provide an alternative.
Why not use Collections.frequency(collection, element) together with distinct()?
The solution would be quite simple (I apologize for the formatting; I just copied it from my IDE and there doesn't seem to be an autoformat feature on SO):
List<Integer> numbers = List.of(5, 6, 7, 7, 7, 6, 2, 4, 2, 4);
List<String> onlyDuplicates = numbers.stream()
.filter(n -> Collections.frequency(numbers, n) > 1)
.distinct()
.sorted()
.map(String::valueOf)
.toList();
This simply keeps all elements that occur more than once, then removes the repeated entries with distinct() before sorting, converting each element to a string, and collecting to a list, since that seems to be what you want.
If you need a mutable list, you can use collect(toCollection(ArrayList::new)) instead of toList().
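As a rough sketch of the mutable variant (the class name MutableResultDemo and the method onlyDuplicates are made up for this example): Stream.toList() needs Java 16 or later and returns an unmodifiable list, whereas collecting into an ArrayList lets you keep modifying the result.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

public class MutableResultDemo {

    // Hypothetical helper: same pipeline as above, but collected into an ArrayList.
    static List<String> onlyDuplicates(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> Collections.frequency(numbers, n) > 1) // occurs more than once
                .distinct()                                         // keep one copy of each
                .sorted()                                           // numeric sort, before mapping
                .map(String::valueOf)
                .collect(Collectors.toCollection(ArrayList::new));
    }

    public static void main(String[] args) {
        List<String> result = onlyDuplicates(List.of(5, 6, 7, 7, 7, 6, 2, 4, 2, 4));
        System.out.println(result); // [2, 4, 6, 7]

        // Works because the result is an ArrayList; a list obtained from
        // Stream.toList() would throw UnsupportedOperationException here.
        result.add("9");
        System.out.println(result); // [2, 4, 6, 7, 9]
    }
}
```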