Group strings into multiple groups when using stream groupingBy

Time:07-27

A simplified example of what I am trying to do:

Suppose I have a list of strings that need to be grouped into 4 groups depending on whether they contain a specific substring. If a string contains Foo it should fall into the group FOO, if it contains Bar into the group BAR, and if it contains Baz into the group BAZ. A string that contains several of these substrings should appear in every matching group, and a string that contains none of them should go into the group DEFAULT.

List<String> strings = List.of("Foo", "FooBar", "FooBarBaz", "XXX");

A naive approach for the above input doesn't work as expected, since each string is put only into the first matching group:

Map<String,List<String>> result1 =
strings.stream()
        .collect(Collectors.groupingBy(
                        str -> str.contains("Foo") ? "FOO" :
                                    str.contains("Bar") ? "BAR" :
                                            str.contains("Baz") ? "BAZ" : "DEFAULT"));

result1 is

{FOO=[Foo, FooBar, FooBarBaz], DEFAULT=[XXX]}

whereas the desired result is

{FOO=[Foo, FooBar, FooBarBaz], BAR=[FooBar, FooBarBaz], BAZ=[FooBarBaz], DEFAULT=[XXX]}

After searching for a while I found another approach, which comes close to my desired result, but not quite:

Map<String,List<String>> result2 =
List.of("Foo", "Bar", "Baz", "Default").stream()
        .flatMap(str -> strings.stream().filter(s -> s.contains(str)).map(s -> new String[]{str.toUpperCase(), s}))
        .collect(Collectors.groupingBy(arr -> arr[0], Collectors.mapping(arr -> arr[1], Collectors.toList())));

System.out.println(result2);

result2 is

{BAR=[FooBar, FooBarBaz], FOO=[Foo, FooBar, FooBarBaz], BAZ=[FooBarBaz]}

While this correctly puts the strings containing the substrings into the right groups, the strings which don't contain any of the substrings, and therefore should fall into the default group, are ignored. The desired result is, as already mentioned above (order doesn't matter):

{BAR=[FooBar, FooBarBaz], FOO=[Foo, FooBar, FooBarBaz], BAZ=[FooBarBaz], DEFAULT=[XXX]}

For now I'm using both result maps and doing an extra:

result2.put("DEFAULT", result1.get("DEFAULT"));

Can the above be done in one step? Is there a better approach than what I have above?

CodePudding user response:

Instead of operating with the strings "Foo", "Bar", etc. and their corresponding uppercase versions, it would be more convenient and much cleaner to define an enum.

Let's call it Keys:

import java.util.EnumSet;
import java.util.List;
import java.util.Set;

public enum Keys {
    FOO("Foo"), BAR("Bar"), BAZ("Baz"), DEFAULT("");

    // All constants except DEFAULT, cached once so that getKeys() does not
    // create a new array via values() on every invocation.
    private static final Set<Keys> values = EnumSet.range(FOO, BAZ);

    private final String keyName;

    Keys(String keyName) {
        this.keyName = keyName;
    }

    // Returns the names of all keys whose substring occurs in str,
    // or only DEFAULT if none of them matches.
    public static List<String> getKeys(String str) {
        List<String> keys = values.stream()
            .filter(key -> str.contains(key.keyName))
            .map(Enum::name)
            .toList(); // Stream.toList() requires Java 16+

        return keys.isEmpty() ? List.of(DEFAULT.name()) : keys;
    }
}

It has a method getKeys(String) which expects a string and returns a list of keys to which the given string should be mapped.
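To make the behaviour of getKeys(String) concrete, here is a minimal sketch that calls it for each string from the question. The wrapper class GetKeysDemo and the field name MATCHABLE are made up for this illustration, and Collectors.toList() is used in place of Stream.toList() only so the sketch also compiles on Java versions before 16.

```java
import java.util.EnumSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class GetKeysDemo {

    enum Keys {
        FOO("Foo"), BAR("Bar"), BAZ("Baz"), DEFAULT("");

        // Hypothetical name: every constant except DEFAULT.
        private static final Set<Keys> MATCHABLE = EnumSet.range(FOO, BAZ);

        private final String keyName;

        Keys(String keyName) {
            this.keyName = keyName;
        }

        // Names of all keys whose substring occurs in str, or [DEFAULT] if none does.
        static List<String> getKeys(String str) {
            List<String> keys = MATCHABLE.stream()
                    .filter(key -> str.contains(key.keyName))
                    .map(Enum::name)
                    .collect(Collectors.toList());
            return keys.isEmpty() ? List.of(DEFAULT.name()) : keys;
        }
    }

    public static void main(String[] args) {
        for (String s : List.of("Foo", "FooBar", "FooBarBaz", "XXX")) {
            System.out.println(s + " -> " + Keys.getKeys(s));
        }
        // Foo -> [FOO]
        // FooBar -> [FOO, BAR]
        // FooBarBaz -> [FOO, BAR, BAZ]
        // XXX -> [DEFAULT]
    }
}
```

Because an EnumSet iterates in declaration order, the keys for one string always come back as FOO, BAR, BAZ.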

Using the logic encapsulated in the Keys enum, we can build a map in which the strings are split into groups named after the Keys constants, by means of the three-argument collect(supplier, accumulator, combiner).

main()

public static void main(String[] args) {
    List<String> strings = List.of("Foo", "FooBar", "FooBarBaz", "XXX");

    Map<String, List<String>> stringsByGroup = strings.stream()
        .collect(
            HashMap::new,
            (Map<String, List<String>> map, String next) -> Keys.getKeys(next)
                .forEach(key -> map.computeIfAbsent(key, k -> new ArrayList<>()).add(next)),
            (left, right) -> right.forEach((k, v) ->
                left.merge(k, v, (oldV, newV) -> { oldV.addAll(newV); return oldV; })
        ));
    
    stringsByGroup.forEach((k, v) -> System.out.println(k + " -> " + v));
}

Output:

BAR -> [FooBar, FooBarBaz]
FOO -> [Foo, FooBar, FooBarBaz]
BAZ -> [FooBarBaz]
DEFAULT -> [XXX]
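If you would rather stay with groupingBy, as in the question, the same idea can probably be sketched with flatMap: first expand every string into one (group, string) pair per matching group, then group the pairs. The class FlatMapGrouping, the helper keysFor and the constant TOKENS are hypothetical names used only for this sketch; Map.entry requires Java 9+.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class FlatMapGrouping {

    // Hypothetical constant: the substrings to look for, spelled as in the question.
    private static final List<String> TOKENS = List.of("Foo", "Bar", "Baz");

    // Upper-cased name of every token contained in str, or [DEFAULT] if none matches.
    static List<String> keysFor(String str) {
        List<String> keys = TOKENS.stream()
                .filter(str::contains)
                .map(String::toUpperCase)
                .collect(Collectors.toList());
        return keys.isEmpty() ? List.of("DEFAULT") : keys;
    }

    static Map<String, List<String>> group(List<String> strings) {
        return strings.stream()
                // Emit one (group, string) pair for each group the string belongs to.
                .flatMap(s -> keysFor(s).stream().map(k -> Map.entry(k, s)))
                .collect(Collectors.groupingBy(
                        Map.Entry::getKey,
                        Collectors.mapping(Map.Entry::getValue, Collectors.toList())));
    }

    public static void main(String[] args) {
        // Prints the four groups; the iteration order of the map is not guaranteed.
        System.out.println(group(List.of("Foo", "FooBar", "FooBarBaz", "XXX")));
    }
}
```

The trade-off compared with the three-argument collect is one short-lived Map.Entry per pair, in exchange for reusing the standard collectors.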


CodePudding user response:

This is an ideal use case for mapMulti (available since Java 16).

Map<String, List<String>> result = strings.stream()
        .<String[]>mapMulti((str, consumer) -> {
            boolean found = false;
            for (String token : List.of("FOO", "BAR",
                    "BAZ")) {
                if (str.toUpperCase().contains(token)) {
                    consumer.accept(
                            new String[] { token, str });
                    found = true;
                }
            }
            if (!found) {
                consumer.accept(
                        new String[] { "DEFAULT", str });
            }
        })
        .collect(Collectors.groupingBy(arr -> arr[0],
                Collectors.mapping(arr -> arr[1],
                        Collectors.toList())));

result.entrySet().forEach(System.out::println);

prints

BAR=[FooBar, FooBarBaz]
FOO=[Foo, FooBar, FooBarBaz]
BAZ=[FooBarBaz]
DEFAULT=[XXX]
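Note that the snippet above compares against str.toUpperCase(), so it also matches lowercase input such as "foobar", whereas the question's contains("Foo") is case-sensitive. Below is a minimal sketch of a case-sensitive variant, which also uses Map.Entry instead of a String[] pair. The class MapMultiGrouping and the constant TOKENS are hypothetical names for this illustration; mapMulti requires Java 16+.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class MapMultiGrouping {

    // Hypothetical constant: the substrings to look for, spelled as in the question.
    private static final List<String> TOKENS = List.of("Foo", "Bar", "Baz");

    static Map<String, List<String>> group(List<String> strings) {
        return strings.stream()
                .<Map.Entry<String, String>>mapMulti((str, sink) -> {
                    boolean found = false;
                    for (String token : TOKENS) {
                        // Case-sensitive match, as in the question.
                        if (str.contains(token)) {
                            sink.accept(Map.entry(token.toUpperCase(), str));
                            found = true;
                        }
                    }
                    // Strings that matched no token go to the DEFAULT group.
                    if (!found) {
                        sink.accept(Map.entry("DEFAULT", str));
                    }
                })
                .collect(Collectors.groupingBy(
                        Map.Entry::getKey,
                        Collectors.mapping(Map.Entry::getValue, Collectors.toList())));
    }

    public static void main(String[] args) {
        // Prints the four groups; the iteration order of the map is not guaranteed.
        System.out.println(group(List.of("Foo", "FooBar", "FooBarBaz", "XXX")));
    }
}
```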