Remove objects with duplicate id and sorted values

Time:11-02

I have a case where I want to remove objects from a list when several of them share the same id. For each id, only the object with the latest date should be kept; the older ones should be removed. How can I do this cleanly using Java streams? I was thinking it should be possible to group the objects by id first, then sort each group by date and select only the first object, or something similar, but I'm struggling with how to do this.

Example:

`

package org.example;

import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class Main {

  class Student {
    private String id;
    private LocalDateTime startDatetime;

    public String getId() {
      return id;
    }

    public void setId(String id) {
      this.id = id;
    }

    public LocalDateTime getStartDatetime() {
      return startDatetime;
    }

    public void setStartDatetime(LocalDateTime startDatetime) {
      this.startDatetime = startDatetime;
    }

    public Student(String id, LocalDateTime startDatetime) {
      this.id = id;
      this.startDatetime = startDatetime;
    }
  }

  public static void main(String[] args) {
    new Main();
  }

  public Main() {
    List<Student> students = new ArrayList<>() {
      {
        add(new Student("1", LocalDateTime.now()));
        add(new Student("1", LocalDateTime.of(2000, 02, 01, 01, 01)));
        add(new Student("1", LocalDateTime.of(1990, 02, 01, 01, 01)));
        add(new Student("2", LocalDateTime.of(1990, 02, 01, 01, 01)));
      } };

    //The list should now be filtered as follows:
    //If two or more Student objects have the same id, keep only the one with the latest startDatetime and remove the older ones.
    //Thus, the result should contain only 2 objects: the one with id 1 and LocalDateTime.now(), and the one with id 2.

    Map<String, List<Student>> groupedStudents =
        students.stream().collect(Collectors.groupingBy(Student::getId));
    
  }
}

`

CodePudding user response:

To eliminate the duplicated students (i.e. having the same id) from the list we can use an auxiliary Map.

This Map should associate each id with a single instance of Student (the one with the latest start date). The proper Collector for that purpose is the three-argument version of toMap(), which expects:

  • a keyMapper, which generates a key from the consumed stream element;
  • a valueMapper generating a value;
  • and a mergeFunction responsible for resolving duplicates.

To implement the mergeFunction we can use the static method BinaryOperator.maxBy(), which expects a Comparator as an argument and returns whichever of its two operands is greater according to that Comparator. To define the comparator we can use Comparator.comparing(), available since Java 8.
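To see the merge function on its own, here is a minimal sketch. The class name MergeFunctionSketch, the constant KEEP_LATEST and the cut-down Student class are illustrative assumptions, not part of the original code; only BinaryOperator.maxBy() and Comparator.comparing() come from the answer.

```java
import java.time.LocalDateTime;
import java.util.Comparator;
import java.util.function.BinaryOperator;

public class MergeFunctionSketch {

    // Cut-down stand-in for the Student class from the question (assumption).
    static final class Student {
        private final String id;
        private final LocalDateTime startDatetime;

        Student(String id, LocalDateTime startDatetime) {
            this.id = id;
            this.startDatetime = startDatetime;
        }

        String getId() { return id; }
        LocalDateTime getStartDatetime() { return startDatetime; }
    }

    // Merge function: of two students, return the one with the later start date.
    static final BinaryOperator<Student> KEEP_LATEST =
        BinaryOperator.maxBy(Comparator.comparing(Student::getStartDatetime));

    public static void main(String[] args) {
        Student older = new Student("1", LocalDateTime.of(1990, 2, 1, 1, 1));
        Student newer = new Student("1", LocalDateTime.of(2000, 2, 1, 1, 1));

        // The argument order does not matter: the later date wins either way.
        System.out.println(KEEP_LATEST.apply(older, newer).getStartDatetime()); // 2000-02-01T01:01
        System.out.println(KEEP_LATEST.apply(newer, older).getStartDatetime()); // 2000-02-01T01:01
    }
}
```

When toMap() meets a second student with an id that is already in the map, it calls this function with the stored student and the new one, and stores whichever is returned.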

Finally, to produce a list of students with unique ids, we generate a stream over the values of the intermediate Map, sort it, and collect the elements into a List.

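import java.time.LocalDateTime;
import java.util.Comparator;
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;

List<Student> students = List.of(
    new Student("1", LocalDateTime.now()),
    new Student("1", LocalDateTime.of(2000, 2, 1, 1, 1)),
    new Student("1", LocalDateTime.of(1990, 2, 1, 1, 1)),
    new Student("2", LocalDateTime.of(1990, 2, 1, 1, 1))
);

List<Student> uniqueStudents = students.stream()
    // Keep one Student per id; on a duplicate id, keep the one with the later start date.
    .collect(Collectors.toMap(
        Student::getId,
        Function.identity(),
        BinaryOperator.maxBy(Comparator.comparing(Student::getStartDatetime))
    ))
    .values().stream()
    // Order the remaining students by start date, oldest first.
    .sorted(Comparator.comparing(Student::getStartDatetime))
    .toList(); // Java 16+; on older versions use .collect(Collectors.toList())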

Output:

Student{id='2', startDatetime=1990-02-01T01:01}
Student{id='1', startDatetime=2022-11-01T14:03:17.858753}
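The question proposed grouping by id first and then picking one object per group. That route also works, and a rough sketch of it follows. This is an alternative to the toMap() approach above, not the answer's own method; the class name GroupingSketch, the method latestPerId and the cut-down Student class are illustrative assumptions. Rather than sorting each group, it uses Collectors.maxBy() as the downstream collector to pick the latest student directly.

```java
import java.time.LocalDateTime;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

public class GroupingSketch {

    // Cut-down stand-in for the Student class from the question (assumption).
    static final class Student {
        private final String id;
        private final LocalDateTime startDatetime;

        Student(String id, LocalDateTime startDatetime) {
            this.id = id;
            this.startDatetime = startDatetime;
        }

        String getId() { return id; }
        LocalDateTime getStartDatetime() { return startDatetime; }
    }

    // Groups students by id and reduces each group to its latest student.
    static List<Student> latestPerId(List<Student> students) {
        Map<String, Optional<Student>> latestById = students.stream()
            .collect(Collectors.groupingBy(
                Student::getId,
                Collectors.maxBy(Comparator.comparing(Student::getStartDatetime))));

        return latestById.values().stream()
            .flatMap(Optional::stream) // Java 9+; every group has at least one element
            .sorted(Comparator.comparing(Student::getStartDatetime))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Fixed dates are used instead of LocalDateTime.now() so the result is repeatable.
        List<Student> students = List.of(
            new Student("1", LocalDateTime.of(2020, 2, 1, 1, 1)),
            new Student("1", LocalDateTime.of(2000, 2, 1, 1, 1)),
            new Student("1", LocalDateTime.of(1990, 2, 1, 1, 1)),
            new Student("2", LocalDateTime.of(1990, 2, 1, 1, 1)));

        for (Student s : latestPerId(students)) {
            System.out.println(s.getId() + " " + s.getStartDatetime());
        }
        // prints:
        // 2 1990-02-01T01:01
        // 1 2020-02-01T01:01
    }
}
```

Compared with toMap(), this version has to unwrap an Optional per group, which is why the toMap() variant with a merge function tends to read more directly for this task.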