Obtain the elements of a Collection that share an identical subset of properties using the Stream API

I have the following code:

List<MyObject> filteredObjects = myObjects.stream()
    .filter(anObject -> 
        Collections.frequency(myObjects, anObject) > 1
    )
    .collect(Collectors.toList());

This gives me the objects from the collection of MyObject that occur more than once.

I need to modify this code so that objects are matched not by equals(), but by a subset of their properties.

Suppose MyObject has the properties foo, bar and baz:

public static class MyObject {
    private Foo foo;
    private Bar bar;
    private Baz baz;
    
    // constructor, getters, etc.
}

I want to get the objects that have a corresponding object in the collection with the same values of foo and bar.

CodePudding user response:

You can create a custom object that holds the properties that need to match (foo and bar), and build an auxiliary Map that uses such objects as keys.

For conciseness, the code below uses a Java 16 record to define the key; if you're on an earlier JDK version, it can be replaced with a class.

record Key(Object foo, Object bar) {
    public Key(MyObject o) {
        this(o.getFoo(), o.getBar());
    }
}
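For JDK versions before 16, the record could be replaced by a class along the following lines. This is only a sketch: the field types are left as Object, as in the record above, and the convenience constructor taking a MyObject is omitted so that the snippet compiles on its own. The essential part is that equals() and hashCode() are both based on foo and bar, because the Map relies on them to group objects.

```java
import java.util.Objects;

// Sketch of a pre-Java-16 replacement for the record Key.
// A constructor Key(MyObject o) delegating to this(o.getFoo(), o.getBar())
// could be added once MyObject is on the classpath.
final class Key {
    private final Object foo;
    private final Object bar;

    Key(Object foo, Object bar) {
        this.foo = foo;
        this.bar = bar;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Key)) return false;
        Key other = (Key) o;
        // Two keys are equal when both properties are equal (null-safe).
        return Objects.equals(foo, other.foo) && Objects.equals(bar, other.bar);
    }

    @Override
    public int hashCode() {
        // Must be consistent with equals(), otherwise HashMap lookups break.
        return Objects.hash(foo, bar);
    }
}
```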

Once you have a Map that associates all objects sharing the same values of these properties with the same key, the only thing left is to keep the values (lists) whose size is greater than 1.

List<MyObject> filteredObjects = myObjects.stream()
    .collect(Collectors.groupingBy( // creates an intermediate Map<Key, List<MyObject>>
        Key::new
    ))
    .values().stream()               // Stream<List<MyObject>>
    .filter(list -> list.size() > 1)
    .flatMap(List::stream)           // Stream<MyObject>
    .toList();                       // Java 16+; use collect(Collectors.toList()) on earlier versions

If you need to preserve the encounter order of the objects, you can use another overload of the collector groupingBy(), which lets you specify a mapFactory:

Collectors.groupingBy(
    Key::new,
    LinkedHashMap::new,
    Collectors.toList()
)
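Putting the pieces together, a self-contained version might look like the sketch below. The MyObject record with String fields is only a stand-in for the question's class (Foo, Bar and Baz are not shown in the question), and the names DuplicatesByProperties and findMatching are made up for illustration. Note that with LinkedHashMap the groups come out in the order in which their first element was encountered, so matching objects end up next to each other rather than in their exact original positions.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.stream.Collectors;

class DuplicatesByProperties {

    // Stand-in for the question's MyObject; String replaces Foo, Bar and Baz.
    record MyObject(String foo, String bar, String baz) {
        String getFoo() { return foo; }
        String getBar() { return bar; }
    }

    record Key(Object foo, Object bar) {
        Key(MyObject o) {
            this(o.getFoo(), o.getBar());
        }
    }

    static List<MyObject> findMatching(List<MyObject> myObjects) {
        return myObjects.stream()
            // LinkedHashMap keeps the groups in first-encounter order
            .collect(Collectors.groupingBy(Key::new, LinkedHashMap::new, Collectors.toList()))
            .values().stream()
            .filter(group -> group.size() > 1)   // keep only groups with at least two members
            .flatMap(List::stream)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<MyObject> myObjects = List.of(
            new MyObject("f1", "b1", "x"),
            new MyObject("f2", "b2", "y"),
            new MyObject("f1", "b1", "z"),
            new MyObject("f3", "b3", "w"),
            new MyObject("f2", "b2", "v"));

        // Prints x, z, y, v (one per line): grouped by (foo, bar), "w" has no match.
        findMatching(myObjects).forEach(o -> System.out.println(o.baz()));
    }
}
```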

CodePudding user response:

Taking inspiration from Python's Counter:

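import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collectors;

// Declared as a static nested class; drop "static" to make it top-level.
static class Counter<T> {
    // element -> number of times it has been added
    final ConcurrentMap<T, Integer> counts = new ConcurrentHashMap<>();

    // Counts a single occurrence of the element.
    public Counter<T> put(T it) {
        add(it, 1);
        return this;
    }

    // Adds v to the element's count, starting from v if it is not present yet.
    public Counter<T> add(T it, int v) {
        counts.merge(it, v, Integer::sum);
        return this;
    }

    // Returns the elements whose count is strictly greater than n.
    public List<T> frequencyGreaterThan(int n) {
        return counts
            .entrySet()
            .stream()
            .filter(e -> e.getValue() > n)
            .map(Map.Entry::getKey)
            .collect(Collectors.toList());
    }
}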

public static void main(String[] args) {
    Counter<String> counter = new Counter<>();
    counter
        .put("foo")
        .put("bar")
        .put("foo")
        .put("bar")
        .put("baz");

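    // prints the elements seen more than once, e.g. [bar, foo];
    // ConcurrentHashMap does not guarantee any particular iteration order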
    System.out.println(counter.frequencyGreaterThan(1));
}

This uses a ConcurrentMap to track how often each element occurs, and further query methods can easily be added on top.

I have added a method for your use case, frequencyGreaterThan, whose name describes exactly what it does.
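The counting idea can also be applied to the original question by counting keys built from foo and bar rather than the objects themselves, and then filtering the original list. The sketch below is one possible way to do that, not part of the answer above: the MyObject record with String fields is a stand-in for the question's class, the Key record mirrors the one from the first answer, and CountThenFilter and findMatching are hypothetical names. Because the second pass streams the original list, the result keeps the objects in their original order.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CountThenFilter {

    // Stand-in for the question's MyObject; String replaces Foo, Bar and Baz.
    record MyObject(String foo, String bar, String baz) {
        String getFoo() { return foo; }
        String getBar() { return bar; }
    }

    // Same idea as the Key record from the first answer.
    record Key(Object foo, Object bar) {
        Key(MyObject o) {
            this(o.getFoo(), o.getBar());
        }
    }

    static List<MyObject> findMatching(List<MyObject> myObjects) {
        // First pass: count how many objects share each (foo, bar) pair.
        Map<Key, Integer> counts = new HashMap<>();
        for (MyObject o : myObjects) {
            counts.merge(new Key(o), 1, Integer::sum);
        }
        // Second pass: keep the objects whose pair occurs more than once.
        return myObjects.stream()
            .filter(o -> counts.get(new Key(o)) > 1)
            .collect(Collectors.toList());
    }
}
```

Both passes are linear in the size of the list, whereas calling Collections.frequency() inside the filter scans the whole collection once per element.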