Assume we have a Person class with these fields:
class Person {
    private String name;
    private Integer id; // this one is unique
}
And then we have a List<Person> people
containing:
['Jerry', 993]
['Tom', 3]
['Neal', 443]
['Jerry', 112]
['Shannon', 259]
['Shannon', 533]
How can I build a new List<Person> uniqueNames
that contains each name only once AND, for each name, keeps the Person with the highest ID?
The resulting list would look like:
['Jerry', 993]
['Tom', 3]
['Neal', 443]
['Shannon', 533]
CodePudding user response:
Collectors.groupingBy and Collectors.maxBy should do the trick: group the persons by name into a map, then select the one with the highest ID in each group:
List<Person> persons = Arrays.asList(
    new Person("Jerry", 123),
    new Person("Tom", 234),
    new Person("Jerry", 456),
    new Person("Jake", 789)
);
List<Person> maxById = persons
    .stream()
    .collect(Collectors.groupingBy(
        Person::getName,
        Collectors.maxBy(Comparator.comparingInt(Person::getID))
    ))
    .values() // Collection<Optional<Person>>
    .stream() // Stream<Optional<Person>>
    .map(opt -> opt.orElse(null))
    .collect(Collectors.toList());
System.out.println(maxById);
Output (groupingBy does not guarantee any particular map ordering, so the order may differ):
[789: Jake, 234: Tom, 456: Jerry]
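For anyone who wants to run this against the data from the question, here is a minimal self-contained sketch of the same groupingBy plus maxBy idea. The class name MaxByDemo, the method highestIdPerName, and the record-based Person (Java 16+) are assumptions made for illustration. It also swaps in a LinkedHashMap to keep the first-seen order of names and uses Optional::stream (Java 9+) instead of orElse(null); those are variations, not part of the answer above.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class MaxByDemo {
    // Hypothetical stand-in for the question's Person class.
    record Person(String name, Integer id) {}

    // Keeps, for every name, the person with the highest id.
    static List<Person> highestIdPerName(List<Person> people) {
        return people.stream()
                .collect(Collectors.groupingBy(
                        Person::name,
                        LinkedHashMap::new, // preserve first-seen order of names
                        Collectors.maxBy(Comparator.comparingInt(Person::id))))
                .values().stream()          // Stream<Optional<Person>>
                .flatMap(Optional::stream)  // unwrap without introducing nulls
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> people = List.of(
                new Person("Jerry", 993), new Person("Tom", 3),
                new Person("Neal", 443), new Person("Jerry", 112),
                new Person("Shannon", 259), new Person("Shannon", 533));
        highestIdPerName(people).forEach(System.out::println);
    }
}
```

With the question's input this should yield Jerry 993, Tom 3, Neal 443 and Shannon 533, in that order.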
Update
"Is there a way to get a separate list of the Person objects that were removed because they were duplicates within this stream()?"
It may be better to collect each group into a list and then convert that list into a wrapper class that holds both the person with the highest ID
and the list of removed duplicates:
class PersonList {
    private final Person max;
    private final List<Person> deduped;
    public PersonList(List<Person> group) {
        this.max = Collections.max(group, Comparator.comparingInt(Person::getID));
        this.deduped = new ArrayList<>(group);
        this.deduped.removeIf(p -> p.getID() == max.getID());
    }
    @Override
    public String toString() {
        return "{max: " + max + "; deduped: " + deduped + "}";
    }
}
Then the persons should be collected like this:
List<PersonList> maxByIdDetails = new ArrayList<>(persons
    .stream()
    .collect(Collectors.groupingBy(
        Person::getName,
        LinkedHashMap::new,
        Collectors.collectingAndThen(
            Collectors.toList(), PersonList::new
        )
    ))
    .values()); // Collection<PersonList>
maxByIdDetails.forEach(System.out::println);
Output:
{max: 456: Jerry; deduped: [123: Jerry]}
{max: 234: Tom; deduped: []}
{max: 789: Jake; deduped: []}
Update 2
Getting list of duplicated persons:
List<Person> duplicates = persons
    .stream()
    .collect(Collectors.groupingBy(Person::getName))
    .values() // Collection<List<Person>>
    .stream() // Stream<List<Person>>
    .map(MyClass::removeMax)
    .flatMap(List::stream) // Stream<Person>
    .collect(Collectors.toList()); // List<Person>
System.out.println(duplicates);
Output:
[123: Jerry]
where removeMax may be implemented like this:
private static List<Person> removeMax(List<Person> group) {
    List<Person> dupes = new ArrayList<>();
    Person max = null;
    for (Person p : group) {
        Person duped = null;
        if (null == max) {
            max = p;
        } else if (p.getID() > max.getID()) {
            duped = max;
            max = p;
        } else {
            duped = p;
        }
        if (null != duped) {
            dupes.add(duped);
        }
    }
    return dupes;
}
Or, provided that hashCode and equals are implemented properly in class Person, the difference between the two lists may be calculated using removeAll:
List<Person> duplicates2 = new ArrayList<>(persons);
duplicates2.removeAll(maxById);
System.out.println(duplicates2);
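The removeAll variant depends entirely on how Person defines equality, so a small sketch of what "implemented properly" could look like may help. The Person class below, the RemoveAllDemo wrapper, and the difference helper are hypothetical; equality here is assumed to be based on both name and id.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

class RemoveAllDemo {
    // Hypothetical Person with value-based equality on both fields.
    static final class Person {
        private final String name;
        private final Integer id;

        Person(String name, Integer id) {
            this.name = name;
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Person)) return false;
            Person other = (Person) o;
            return Objects.equals(name, other.name) && Objects.equals(id, other.id);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, id);
        }

        @Override
        public String toString() {
            return id + ": " + name;
        }
    }

    // Returns every element of 'all' that has no equal element in 'kept'.
    static List<Person> difference(List<Person> all, List<Person> kept) {
        List<Person> result = new ArrayList<>(all);
        result.removeAll(kept); // relies on Person.equals
        return result;
    }

    public static void main(String[] args) {
        List<Person> persons = List.of(
                new Person("Jerry", 123), new Person("Tom", 234),
                new Person("Jerry", 456), new Person("Jake", 789));
        // Equal-by-value copies, deliberately not the same instances as above.
        List<Person> maxById = List.of(
                new Person("Jerry", 456), new Person("Tom", 234), new Person("Jake", 789));
        System.out.println(difference(persons, maxById)); // prints [123: Jerry]
    }
}
```

Without the equals override, removeAll would fall back to identity comparison and would only remove elements that are the very same instances.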
CodePudding user response:
You could try:
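import static java.util.stream.Collectors.*;
import java.util.Comparator;
persons.stream()
    .collect(
        groupingBy(
            Person::getName,
            collectingAndThen(
                // maxBy expects a Comparator, not a bare getter reference
                maxBy(Comparator.comparing(Person::getId)),
                Optional::get
            )
        )
    )
    .values();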
- You group by name.
- Then you request the max of the grouped persons (per name).
- Then you return the values. Since groupingBy with maxBy alone would produce a Map<String, Optional<Person>>, the collectingAndThen wrapper calls Optional::get to unwrap each value.
Note that this keeps one Person per unique name; it does not collect the duplicates that were dropped.
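To make the types in those steps concrete, here is a small self-contained sketch. The GroupMaxDemo class, the highestIdByName method, and the record-based Person (whose id() accessor stands in for getId()) are assumptions for illustration only. Optional::get is reasonable here because groupingBy only creates a group once it has seen at least one element, so maxBy never sees an empty group.

```java
import static java.util.stream.Collectors.collectingAndThen;
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.maxBy;

import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;

class GroupMaxDemo {
    // Hypothetical stand-in for the question's Person class (Java 16+ record).
    record Person(String name, Integer id) {}

    // Maps each name to the person with the highest id for that name.
    static Map<String, Person> highestIdByName(List<Person> persons) {
        return persons.stream()
                .collect(groupingBy(
                        Person::name,
                        collectingAndThen(
                                maxBy(Comparator.comparing(Person::id)), // Optional<Person>
                                Optional::get)));                        // unwrapped to Person
    }

    public static void main(String[] args) {
        Map<String, Person> byName = highestIdByName(List.of(
                new Person("Jerry", 993), new Person("Tom", 3),
                new Person("Jerry", 112)));
        System.out.println(byName.get("Jerry")); // prints Person[name=Jerry, id=993]
    }
}
```

Calling values() on the returned map gives the same collection as the snippet above.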
CodePudding user response:
You can use Collectors#toMap like this.
record Person(String name, Integer id) {}
public static void main(String[] args) {
    List<Person> input = List.of(
        new Person("Jerry", 993),
        new Person("Tom", 3),
        new Person("Neal", 443),
        new Person("Jerry", 112),
        new Person("Shannon", 259),
        new Person("Shannon", 533));
    List<Person> output = input.stream()
        .collect(Collectors.toMap(Person::name, Function.identity(),
            (a, b) -> a.id() > b.id() ? a : b, // on a name clash, keep the higher id
            LinkedHashMap::new))               // keeps first-seen order of names
        .values().stream().toList();
    for (Person e : output)
        System.out.println(e);
}
Output:
Person[name=Jerry, id=993]
Person[name=Tom, id=3]
Person[name=Neal, id=443]
Person[name=Shannon, id=533]
You can omit the LinkedHashMap::new argument if you don't care about the order.
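As a rough sketch of that shorter form, the three-argument toMap can be combined with BinaryOperator.maxBy as the merge function in place of the hand-written lambda. The ToMapDemo class and highestIdPerName method are hypothetical names; the iteration order of the result is unspecified in this variant.

```java
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;

class ToMapDemo {
    // Same record shape as in the answer above (Java 16+).
    record Person(String name, Integer id) {}

    // One person per name, keeping the highest id; order is not guaranteed.
    static Collection<Person> highestIdPerName(List<Person> input) {
        return input.stream()
                .collect(Collectors.toMap(
                        Person::name,
                        Function.identity(),
                        // merge function: of two persons with the same name, keep the larger id
                        BinaryOperator.maxBy(Comparator.comparing(Person::id))))
                .values();
    }

    public static void main(String[] args) {
        List<Person> input = List.of(
                new Person("Jerry", 993), new Person("Tom", 3),
                new Person("Neal", 443), new Person("Jerry", 112),
                new Person("Shannon", 259), new Person("Shannon", 533));
        highestIdPerName(input).forEach(System.out::println);
    }
}
```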
CodePudding user response:
"Is there a way to get a separate list of the Person objects that were removed because they were duplicates within this stream()?"
A plain loop can track the highest IDs and the displaced duplicates in a single pass:
private static final Map<String, Person> highestIds = new HashMap<>();
private static final List<Person> duplicates = new ArrayList<>();
public static void main(String[] args) {
    // people is the List<Person> from the question
    for (Person person : people) {
        Person result = highestIds.get(person.name);
        if (isPresent(result) && person.id > result.id) {
            // a higher id arrived: the previous holder becomes a duplicate
            duplicates.add(result);
            highestIds.put(person.name, person);
        } else if (result == null) {
            // first time this name is seen
            highestIds.put(person.name, person);
        } else {
            duplicates.add(person);
        }
    }
    System.out.println("Highest ids:");
    highestIds.values().forEach(System.out::println);
    System.out.println("Duplicates:");
    duplicates.forEach(System.out::println);
}
private static boolean isPresent(Person result) {
    return result != null;
}