How to group objects by property using the groupingBy() collector on top of flatMap() in Java 8

Time:09-26

How can I create a Map<String, List<Product>> from the list below, where the key of the map is a category of a Product?

One product can belong to multiple categories, as in the example below.

I am trying the code below, but I don't know what the next operation should be:

products.stream()
    .flatMap(product -> product.getCategories().stream())
    . // how should I progress from here?

The result should look like this:

{electonics=[p1,p3,p4], fashion=[p1,p2,p4], kitchen=[p1,p2,p3], abc1=[p2], xyz1=[p3],pqr1=[p4]}

Product p1 = new Product(123, Arrays.asList("electonics,fashion,kitchen".split(",")));
Product p2 = new Product(123, Arrays.asList("abc1,fashion,kitchen".split(",")));
Product p3 = new Product(123, Arrays.asList("electonics,xyz1,kitchen".split(",")));
Product p4 = new Product(123, Arrays.asList("electonics,fashion,pqr1".split(",")));
List<Product> products = Arrays.asList(p1, p2, p3, p4);

class Product {

    int price;
    List<String> categories;

    public Product(int price) {
        this.price = price;
    }

    public Product(int price, List<String> categories) {
        this.price = price;
        this.categories = categories;
    }

    public int getPrice() {
        return price;
    }

    public List<String> getCategories() {
        return categories;
    }
}

CodePudding user response:

If you want to use the groupingBy() collector, you can define a wrapper class (with Java 16 or later, a record is handier for this purpose) that holds a reference to a category and a product, representing every category/product combination that exists in the given list.

public record ProductCategory(String category, Product product) {}

Pre-Java 16 alternative:

public class ProductCategory {
    private String category;
    private Product product;
    
    // constructor and getters
}

Then use the combination of the mapping() and toList() collectors as the downstream collector of groupingBy().

List<Product> products = // initializing the list of products
        
Map<String, List<Product>> productsByCategory = products.stream()
    .flatMap(product -> product.getCategories().stream()
        .map(category -> new ProductCategory(category, product)))
    .collect(Collectors.groupingBy(
        ProductCategory::category,                   // ProductCategory::getCategory if you used a class instead of record
        Collectors.mapping(ProductCategory::product, // ProductCategory::getProduct if you used a class instead of record
            Collectors.toList())
    ));



Instead of creating intermediate objects and nested streams, a more performant option is to describe the accumulation strategy in the three-argument version of collect() (or to define a custom collector).

Here is how it might be implemented:

Map<String, List<Product>> productsByCategory = products.stream()
    .collect(
        HashMap::new,
        (Map<String, List<Product>> map, Product next) -> next.getCategories()
            .forEach(category -> map.computeIfAbsent(category, k -> new ArrayList<>())
                .add(next)),
        (left, right) -> right.forEach((k, v) ->
            left.merge(k, v, (oldProd, newProd) -> { oldProd.addAll(newProd); return oldProd; }))
    );

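The "custom collector" option mentioned above could be sketched roughly as follows. This is only an illustration built on Collector.of(); the class name MultiGrouping and the method groupingByEach are hypothetical and do not come from the answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collector;

// Hypothetical helper; the names are illustrative and not part of the original answer.
final class MultiGrouping {

    // Groups each element under every key that the classifier returns for it.
    static <T, K> Collector<T, ?, Map<K, List<T>>> groupingByEach(
            Function<? super T, ? extends Iterable<? extends K>> classifier) {
        return Collector.of(
            HashMap::new,
            // Accumulator: add the item to the list of each of its keys.
            (Map<K, List<T>> map, T item) -> {
                for (K key : classifier.apply(item)) {
                    map.computeIfAbsent(key, k -> new ArrayList<>()).add(item);
                }
            },
            // Combiner: fold the right-hand partial map into the left-hand one.
            (left, right) -> {
                right.forEach((key, items) ->
                    left.merge(key, items, (a, b) -> { a.addAll(b); return a; }));
                return left;
            });
    }

    public static void main(String[] args) {
        // Each inner list plays the role of an item that carries its own tags.
        List<List<String>> tagged = Arrays.asList(
            Arrays.asList("electronics", "fashion"),
            Arrays.asList("fashion", "kitchen"));
        Map<String, List<List<String>>> byTag = tagged.stream()
            .collect(groupingByEach(tags -> tags));
        System.out.println(byTag.get("fashion").size()); // prints 2
    }
}
```

With the Product class from the question, the call would presumably look like products.stream().collect(MultiGrouping.groupingByEach(Product::getCategories)).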

CodePudding user response:

I tried a few things and came up with the following solution:

Map<String, List<Product>> result =
        products.stream()
                .flatMap(product -> product.getCategories().stream().map(category -> Map.entry(category, product)))
                .collect(Collectors.groupingBy(Map.Entry::getKey, Collectors.mapping(Map.Entry::getValue, Collectors.toList())));


System.out.println(result);

Output:

xyz1=[org.example.Product@15db9742], electonics=[org.example.Product@6d06d69c, org.example.Product@15db9742, org.example.Product@7852e922], abc1=[org.ex ...

Edit: I see that my solution is quite similar to the other answer. However, it uses a Map.Entry instead of a user-defined type to bring the data into the right shape.
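Note that Map.entry() was added in Java 9. On Java 8, which the question title mentions, the same idea can be sketched with AbstractMap.SimpleEntry. The Product below is a minimal stand-in for the question's class; its name field and toString() are assumptions added only so that the printed result is readable instead of showing Product@... hash codes.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class EntryGroupingDemo {

    // Stand-in for the question's Product; name and toString() are assumptions.
    static final class Product {
        final String name;
        final List<String> categories;

        Product(String name, List<String> categories) {
            this.name = name;
            this.categories = categories;
        }

        List<String> getCategories() {
            return categories;
        }

        @Override
        public String toString() {
            return name;
        }
    }

    static Map<String, List<Product>> byCategory(List<Product> products) {
        return products.stream()
            // One (category, product) pair per category of each product.
            .flatMap(product -> product.getCategories().stream()
                .map(category -> new SimpleEntry<>(category, product)))
            // Group the pairs by category and keep only the product of each pair.
            .collect(Collectors.groupingBy(
                SimpleEntry::getKey,
                Collectors.mapping(SimpleEntry::getValue, Collectors.toList())));
    }

    public static void main(String[] args) {
        Product p1 = new Product("p1", Arrays.asList("electronics", "fashion"));
        Product p2 = new Product("p2", Arrays.asList("fashion", "kitchen"));
        System.out.println(byCategory(Arrays.asList(p1, p2)).get("fashion")); // prints [p1, p2]
    }
}
```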

CodePudding user response:

This can also be done with a combination of flatMapping() and toMap() (flatMapping() is available since Java 9):

Map<String, List<Product>> obj = products.stream()
    .collect(
        Collectors.flatMapping(
            product -> product.getCategories().stream()
                .map(category -> Map.entry(category, List.of(product))),
            Collectors.toMap(
                Map.Entry::getKey,
                Map.Entry::getValue,
                (v1, v2) -> Stream.concat(v1.stream(), v2.stream()).toList()
            )
    ));

First, each Product is converted to a stream of Map.Entry<String, List<Product>> objects, one per category. The key is the category and the value is a List<Product> that initially contains only the current product.

Then toMap() "unpacks" the entries into the resulting map. Where the key (the category) is the same, the values (the lists of products) must be merged, which is what the third argument does.


Note: I used a Map.Entry here, but you could also write a custom class, which is semantically preferable (something like CategoryProductsMapping(String category, List<Product> products)).
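Collectors.flatMapping(), Map.entry() and List.of() need Java 9 or later, and Stream.toList() needs Java 16. A Java 8-compatible sketch of the same merge-based idea might look like the following; it swaps in Stream.flatMap(), AbstractMap.SimpleEntry and Collections.singletonList(). The class ToMapMergeDemo, the helper entriesOf and the reduced Product stand-in are hypothetical.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class ToMapMergeDemo {

    // Reduced stand-in for the question's Product (categories only).
    static final class Product {
        final List<String> categories;

        Product(List<String> categories) {
            this.categories = categories;
        }

        List<String> getCategories() {
            return categories;
        }
    }

    // One entry per category; each value is a single-element list holding the product.
    static Stream<SimpleEntry<String, List<Product>>> entriesOf(Product product) {
        return product.getCategories().stream()
            .map(category -> new SimpleEntry<>(category, Collections.singletonList(product)));
    }

    static Map<String, List<Product>> byCategory(List<Product> products) {
        return products.stream()
            .flatMap(ToMapMergeDemo::entriesOf)
            .collect(Collectors.toMap(
                SimpleEntry::getKey,
                SimpleEntry::getValue,
                // Merge function: called when two entries share a category.
                (first, second) -> {
                    List<Product> merged = new ArrayList<>(first);
                    merged.addAll(second);
                    return merged;
                }));
    }

    public static void main(String[] args) {
        Product p1 = new Product(Arrays.asList("electronics", "fashion"));
        Product p2 = new Product(Arrays.asList("fashion", "kitchen"));
        System.out.println(byCategory(Arrays.asList(p1, p2)).get("fashion").size()); // prints 2
    }
}
```

In this sketch, a category that occurs only once keeps its immutable single-element list, and every merge copies the list, so it is likely to do more work than the groupingBy() variants on large inputs.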

CodePudding user response:

I am aware that the asker inquired about groupingBy() and flatMap(), but just in case, I'll mention reduce().

  1. I believe this is fairly simple; it feels similar to the three-argument collect() method that @Alexander Ivanchenko mentioned.
  2. To use a parallel stream, you would have to merge two HashMaps. I don't think that is a good idea here: it adds extra iteration and is of little use in this case.

HashMap<String, List<Product>> reduce = products.stream().reduce(
        new HashMap<>(),
        (result, product) -> {
            product.getCategories().stream().distinct().forEach(
                    category -> result.computeIfAbsent(category, k -> new ArrayList<>())
                            .add(product)
            );
            return result;
        },
        (x, y) -> {
            throw new RuntimeException("does not support parallel!");
        }
);
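For reference, the merge step that point 2 refers to could be sketched as below. The class MapMergeDemo and the method mergeInto are hypothetical names. Note that reduce() with a shared mutable HashMap as identity would still not be safe on a parallel stream even with such a combiner; the three-argument collect() shown in the earlier answer creates a separate container per thread and is the operation meant for this kind of mutable accumulation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class MapMergeDemo {

    // Appends every list in 'right' to the list stored under the same key in 'left'.
    static <K, V> void mergeInto(Map<K, List<V>> left, Map<K, List<V>> right) {
        right.forEach((key, values) ->
            left.computeIfAbsent(key, k -> new ArrayList<>()).addAll(values));
    }

    public static void main(String[] args) {
        // Two partial results, as two worker threads might have produced them.
        Map<String, List<String>> left = new HashMap<>();
        left.put("fashion", new ArrayList<>(Arrays.asList("p1")));

        Map<String, List<String>> right = new HashMap<>();
        right.put("fashion", new ArrayList<>(Arrays.asList("p2")));
        right.put("kitchen", new ArrayList<>(Arrays.asList("p2")));

        mergeInto(left, right);
        System.out.println(left.get("fashion")); // prints [p1, p2]
    }
}
```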