I have a stream that processes some strings and collects them in a map, but I'm getting the following exception:
java.lang.IllegalStateException:
Duplicate key [email protected]
(attempted merging values [[email protected]] and [[email protected]])
at java.base/java.util.stream.Collectors.duplicateKeyException(Collectors.java:133)
I'm using the following code:
Map<String, List<String>> map = emails.stream()
.collect(Collectors.toMap(
Function.identity(),
email -> processEmails(email)
));
CodePudding user response:
You have duplicate emails. The toMap version you're using explicitly doesn't allow duplicate keys. Use the toMap overload that takes a merge function. How to merge those processEmails results depends on your business logic.
Alternatively, use distinct() before collecting, because otherwise you'll probably end up sending some people multiple emails.
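For illustration, here's a minimal sketch of the merge-function approach. Since the question doesn't show what processEmails() does, it's stubbed out here as a hypothetical method returning a List<String>:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class MergeDemo {
    // Hypothetical stand-in for the asker's processEmails();
    // wraps the result in an ArrayList so the merge function can mutate it
    static List<String> processEmails(String email) {
        return new ArrayList<>(List.of("processed:" + email));
    }

    public static void main(String[] args) {
        // "a@example.com" appears twice to trigger the duplicate-key case
        List<String> emails = List.of("a@example.com", "b@example.com", "a@example.com");

        Map<String, List<String>> map = emails.stream()
                .collect(Collectors.toMap(
                        Function.identity(),
                        MergeDemo::processEmails,
                        // concatenate both value lists when a duplicate key appears
                        (list1, list2) -> { list1.addAll(list2); return list1; }
                ));

        System.out.println(map); // no IllegalStateException; duplicate values are merged
    }
}
```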
CodePudding user response:
Try using
Collectors.toMap(Function keyFunction, Function valueFunction, BinaryOperator mergeFunction)
You obviously have to write your own merge logic; a simple mergeFunction could be
(x1, x2) -> x1
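Dropped into the code from the question, that keep-first merge function would look like this (processEmails() is stubbed here as an assumption, since its real body isn't shown):

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class KeepFirstDemo {
    // Hypothetical stand-in for the asker's processEmails()
    static List<String> processEmails(String email) {
        return List.of("processed:" + email);
    }

    public static void main(String[] args) {
        // a duplicate key on purpose
        List<String> emails = List.of("a@example.com", "a@example.com");

        Map<String, List<String>> map = emails.stream()
                .collect(Collectors.toMap(
                        Function.identity(),
                        KeepFirstDemo::processEmails,
                        (x1, x2) -> x1 // on a duplicate key, keep the first value and discard the second
                ));

        System.out.println(map); // prints {a@example.com=[processed:a@example.com]}
    }
}
```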
CodePudding user response:
The flavor of toMap() you're using in your code (which expects only a keyMapper and a valueMapper) disallows duplicates simply because it isn't capable of handling them, and the exception message tells you that explicitly.
Judging by the resulting type Map<String, List<String>> and by the exception message, which shows strings enclosed in square brackets, we can conclude that processEmails(email) produces a List<String> (although it's not obvious from your description and IMO worth specifying).
There are multiple ways to solve this problem, you can either:
- Use the other version of toMap(keyMapper, valueMapper, mergeFunction), which requires a third argument, mergeFunction - a function responsible for resolving duplicates.
Map<String, List<String>> map = emails.stream()
.collect(Collectors.toMap(
Function.identity(),
email -> processEmails(email),
(list1, list2) -> list1 // or { list1.addAll(list2); return list1; } depending on how you need to resolve duplicates
));
- Make use of the collector groupingBy(classifier, downstream) to preserve all the emails retrieved by processEmails() that are associated with the same key by storing them in a List. As a downstream collector we can use a combination of the collectors flatMapping() and toList().
Map<String, List<String>> map = emails.stream()
.collect(Collectors.groupingBy(
Function.identity(),
Collectors.flatMapping(email -> processEmails(email).stream(),
Collectors.toList())
));
Note that the latter option only makes sense if processEmails() somehow generates different results for the same key; otherwise you would end up with a list of repeated values, which doesn't seem useful.
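To illustrate that caveat: with a deterministic stand-in for processEmails() (a hypothetical placeholder, since the real implementation isn't shown), groupingBy() + flatMapping() simply repeats the same value once per occurrence of the duplicate key:

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class RepeatedValuesDemo {
    // Hypothetical deterministic stand-in for the asker's processEmails()
    static List<String> processEmails(String email) {
        return List.of("processed:" + email);
    }

    public static void main(String[] args) {
        // the same email twice
        List<String> emails = List.of("a@example.com", "a@example.com");

        Map<String, List<String>> map = emails.stream()
                .collect(Collectors.groupingBy(
                        Function.identity(),
                        Collectors.flatMapping(e -> processEmails(e).stream(),
                                Collectors.toList())
                ));

        // the value list for "a@example.com" contains the same string twice
        System.out.println(map);
    }
}
```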
But what you definitely shouldn't do in this case is use distinct(). It would unnecessarily increase memory consumption, because distinct() eliminates duplicates by maintaining a LinkedHashSet under the hood. That's wasteful here because you're already collecting into a Map, which can deal with duplicate keys itself.