Is there a simpler way to remove "duplicate" objects from an array (objects with the same

Time:11-24

If, given an array of objects, such as:

ArrayList<Person> people = new ArrayList<>(Arrays.asList(
    new Person("Victoria", 25, "Firefighter"),
    new Person("Grace", 27, "Footballer"),
    new Person("Samantha", 25, "Stock Broker"),
    new Person("Victoria", 23, "Poker Player"),
    new Person("Jane", 27, "Footballer"),
    new Person("Grace", 25, "Security Guard")));

How can one remove the objects whose attributes are not unique, while keeping exactly one object from each group of duplicates? The criterion could be as simple as duplicate names, which would leave:

Person("Victoria", 25, "Firefighter"),
Person("Grace", 27, "Footballer"),
Person("Samantha", 25, "Stock Broker"),
Person("Jane", 27, "Footballer")

Or it could be more complex, such as the same age combined with jobs that start with the same letter, which would leave:

Person("Victoria", 25, "Firefighter"),
Person("Grace", 27, "Footballer"),
Person("Samantha", 25, "Stock Broker"),
Person("Victoria", 23, "Poker Player")

So far, the best I've come up with is:

    int len = people.size();
    for (int i = 0; i < len - 1; i++) {
        for (int j = i + 1; j < len; j++)
            if (function(people.get(i), people.get(j))) {
                people.remove(j);
                j--;
                len--;
            }
    }

Here, "function" checks whether two entries are considered "duplicates".
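For reference, the nested loop can be wrapped in a generic helper that takes the duplicate check as a lambda. This is only a sketch: the class name `PairwiseDedup`, the method `removeDuplicates`, and the first-letter rule in `main` are made up for illustration and are not part of the original post. It keeps the first occurrence and is still quadratic, like the loop above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;

class PairwiseDedup {

    // Removes every element that the predicate reports as a duplicate of an
    // earlier, kept element. The list is modified in place.
    static <T> void removeDuplicates(List<T> items, BiPredicate<T, T> isDuplicate) {
        for (int i = 0; i < items.size() - 1; i++) {
            T kept = items.get(i);
            // Walk backwards so that removals do not shift unvisited indices.
            for (int j = items.size() - 1; j > i; j--) {
                if (isDuplicate.test(kept, items.get(j))) {
                    items.remove(j);
                }
            }
        }
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>(
                Arrays.asList("apple", "avocado", "banana", "blueberry", "cherry"));
        // Hypothetical rule: two words are duplicates if they share a first letter.
        removeDuplicates(words, (a, b) -> a.charAt(0) == b.charAt(0));
        System.out.println(words); // [apple, banana, cherry]
    }
}
```

Because the check is an arbitrary pairwise predicate, this form also works for rules that cannot be reduced to a single key, which the key-based approaches cannot express.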

I was wondering if there's a library that does just this, or if this could somehow be written as a lambda expression.

CodePudding user response:

When you say "remove duplicates", the first thing that comes to mind is using a Set. However, a Set considers an object a "duplicate" if it already contains an object that is "equal" to it according to the equals method. Implementing Person::equals to compare the first letter of a job is not a good fit.

You want an 'equals method' for this use case alone, so we have to use something else.

The Stream interface has a distinct() method that removes duplicates, but distinct() takes no parameter, so you cannot pass in a Comparator or Predicate to define when one Person is considered "distinct" from another.

Fortunately, this excellent StackOverflow answer provides exactly what you need:

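// Returns a stateful predicate that accepts only the first element seen for each key.
public static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
    Set<Object> seen = ConcurrentHashMap.newKeySet();
    return t -> seen.add(keyExtractor.apply(t)); // add() returns false for a repeated key
}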

Next, create a record that holds the properties which determine uniqueness:

record PersonAgeAndJobFilter(int age, char jobFirstLetter) {

    public static PersonAgeAndJobFilter ofPerson(Person p) {
        return new PersonAgeAndJobFilter(p.getAge(), p.getJob().charAt(0));
    }
}

Then stream over the people, using your filter:

people.stream()
    .filter(distinctByKey(PersonAgeAndJobFilter::ofPerson))
    .collect(Collectors.toSet());
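Putting the pieces together, here is a self-contained sketch that runs the filter on the sample data from the question. The `Person` class is an assumption (the post never shows it), and `DistinctByKeyDemo`, `AgeAndJobInitial`, `samplePeople` and `distinctNames` are names invented for this example. It collects into a List rather than a Set so that the original order stays visible.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class DistinctByKeyDemo {

    // Assumed shape of Person; the original post does not show the class.
    static final class Person {
        private final String name;
        private final int age;
        private final String job;

        Person(String name, int age, String job) {
            this.name = name;
            this.age = age;
            this.job = job;
        }

        String getName() { return name; }
        int getAge() { return age; }
        String getJob() { return job; }
    }

    // Composite key: a record gets equals/hashCode generated from its components.
    record AgeAndJobInitial(int age, char jobInitial) {
        static AgeAndJobInitial of(Person p) {
            return new AgeAndJobInitial(p.getAge(), p.getJob().charAt(0));
        }
    }

    static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
        Set<Object> seen = ConcurrentHashMap.newKeySet();
        return t -> seen.add(keyExtractor.apply(t));
    }

    static List<Person> samplePeople() {
        return List.of(
                new Person("Victoria", 25, "Firefighter"),
                new Person("Grace", 27, "Footballer"),
                new Person("Samantha", 25, "Stock Broker"),
                new Person("Victoria", 23, "Poker Player"),
                new Person("Jane", 27, "Footballer"),
                new Person("Grace", 25, "Security Guard"));
    }

    // Names of the people that survive de-duplication by the given key.
    static List<String> distinctNames(List<Person> people, Function<Person, ?> key) {
        return people.stream()
                .filter(distinctByKey(key)) // fresh predicate, so fresh "seen" set, per call
                .map(Person::getName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> people = samplePeople();
        System.out.println(distinctNames(people, AgeAndJobInitial::of));
        // [Victoria, Grace, Samantha, Victoria]
        System.out.println(distinctNames(people, Person::getName));
        // [Victoria, Grace, Samantha, Jane]
    }
}
```

The predicate returned by distinctByKey carries state, so it should be created anew for every stream. On a parallel stream, which element of a duplicate group survives is no longer guaranteed to be the first one.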

CodePudding user response:

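List<Person> newPeople = people.stream().distinct().collect(Collectors.toList());

I hope this works for you. Note that distinct() compares elements with equals(), so Person must override equals() and hashCode(); otherwise only identical references are treated as duplicates.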
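To illustrate how distinct() depends on equals() and hashCode(), here is a small sketch in which equality is defined by name only. The `Person` class below is hypothetical, as are `DistinctEqualsDemo`, `samplePeople` and `distinctPeople`; the real Person from the question may define equality differently or not at all.

```java
import java.util.List;
import java.util.stream.Collectors;

class DistinctEqualsDemo {

    // Hypothetical Person whose equality is defined by name alone.
    static final class Person {
        final String name;
        final int age;
        final String job;

        Person(String name, int age, String job) {
            this.name = name;
            this.age = age;
            this.job = job;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Person && ((Person) o).name.equals(name);
        }

        @Override
        public int hashCode() {
            return name.hashCode(); // must agree with equals()
        }

        @Override
        public String toString() {
            return name + "(" + age + ")";
        }
    }

    static List<Person> samplePeople() {
        return List.of(
                new Person("Victoria", 25, "Firefighter"),
                new Person("Grace", 27, "Footballer"),
                new Person("Samantha", 25, "Stock Broker"),
                new Person("Victoria", 23, "Poker Player"),
                new Person("Jane", 27, "Footballer"),
                new Person("Grace", 25, "Security Guard"));
    }

    // distinct() keeps the first of each group of equal elements on an ordered stream.
    static List<Person> distinctPeople(List<Person> people) {
        return people.stream().distinct().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(distinctPeople(samplePeople()));
        // [Victoria(25), Grace(27), Samantha(25), Jane(27)]
    }
}
```

This covers the "duplicate names" case only. Since a class has a single equals(), the approach cannot switch to a different criterion, such as age plus the first letter of the job, without changing Person itself.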

CodePudding user response:

You can use a map to pick distinct values by attribute. The following method takes a list of objects and a key extractor that picks the attribute which determines uniqueness:

private static <T> List<T> uniqueBy(List<T> objects, 
         Function<T, Object> keyExtractor) {
    return new ArrayList<>(objects.stream()
            .collect(Collectors.toMap(keyExtractor, 
                                      Function.identity(), 
                                      (a, b) -> a, 
                                      LinkedHashMap::new)).values());
}

It uses a LinkedHashMap to preserve the order of the source list, and the merge function (a, b) -> a keeps the first element seen for each key. The method can be used in this way:

List<Person> uniqueByName = uniqueBy(people, Person::getName); //by name
List<Person> uniqueByAge = uniqueBy(people, Person::getAge); //by age
//etc.
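The same method also seems to handle the composite criterion from the question if the key extractor returns a value object built from several attributes. The sketch below uses an immutable list as that key. The record-based `Person` (with accessors `name()`, `age()`, `job()` instead of getters) and the names `UniqueByDemo`, `samplePeople` and `names` are assumptions made to keep the example short and runnable.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

class UniqueByDemo {

    // Assumed stand-in for Person; note the record accessors name(), age(), job().
    record Person(String name, int age, String job) {}

    static <T> List<T> uniqueBy(List<T> objects, Function<T, Object> keyExtractor) {
        return new ArrayList<>(objects.stream()
                .collect(Collectors.toMap(keyExtractor,
                                          Function.identity(),
                                          (a, b) -> a,          // keep the first element per key
                                          LinkedHashMap::new))  // keep source order
                .values());
    }

    static List<Person> samplePeople() {
        return List.of(
                new Person("Victoria", 25, "Firefighter"),
                new Person("Grace", 27, "Footballer"),
                new Person("Samantha", 25, "Stock Broker"),
                new Person("Victoria", 23, "Poker Player"),
                new Person("Jane", 27, "Footballer"),
                new Person("Grace", 25, "Security Guard"));
    }

    static List<String> names(List<Person> people) {
        return people.stream().map(Person::name).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> people = samplePeople();

        // Composite key: age plus first letter of the job, compared via List.equals().
        List<Person> byAgeAndJobInitial =
                uniqueBy(people, p -> List.of(p.age(), p.job().charAt(0)));

        System.out.println(names(byAgeAndJobInitial)); // [Victoria, Grace, Samantha, Victoria]
        System.out.println(names(uniqueBy(people, Person::name))); // [Victoria, Grace, Samantha, Jane]
    }
}
```

A list works as a key because two lists with equal elements in the same order are equal and have the same hash code; a dedicated record would do the same job with named fields.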