How to aggregate elements based on a value they share (Java)

Time:11-08

I have a list of Entities. Entries that belong to the same group share the same group number:

Start Stop GroupNum
2018-11-13 2019-01-13 1
2019-01-14 2019-03-06 1
2019-03-07 2019-11-18 1
2020-08-23 2020-08-23 2
2021-11-19 2022-12-23 2

These Entities are saved in an ArrayList:

List<Entities> baseList = new ArrayList<>();

public static class Entities {
    private final Date start_dt;
    private final Date stop_dt;
    private int groupNum;
    // rest of the class omitted
}

I want to aggregate the objects by GroupNum, taking the Start date from the first element of each group and the Stop date from the last element of the same group.

The final result should look something like this:

Start Stop GroupNum
2018-11-13 2019-11-18 1
2020-08-23 2022-12-23 2

I would have shared my own attempt, but I couldn't come up with any ideas.

Thanks in advance.

CodePudding user response:

Try this.

public static void main(String[] args) throws ParseException {
    List<Entities> baseList = List.of(
        new Entities("2018-11-13", "2019-01-13", 1),
        new Entities("2019-01-14", "2019-03-06", 1),
        new Entities("2019-03-07", "2019-11-18", 1),
        new Entities("2020-08-23", "2020-08-23", 2),
        new Entities("2021-11-19", "2022-12-23", 2));
    List<Entities> result = baseList.stream()
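        // TreeMap keeps the groups ordered by group number (a plain HashMap guarantees no order)
        .collect(Collectors.groupingBy(Entities::getGroupNum, TreeMap::new, Collectors.toList()))
        .entrySet().stream()
        // first/last element: assumes each group's entries are already in chronological order
        .map(e -> new Entities(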
            e.getValue().get(0).getStart_dt(),
            e.getValue().get(e.getValue().size() - 1).getStop_dt(),
            e.getKey()))
        .toList();
    result.forEach(System.out::println);
}

output:

Entities [start_dt=2018-11-13, stop_dt=2019-11-18, groupNum=1]
Entities [start_dt=2020-08-23, stop_dt=2022-12-23, groupNum=2]
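
The snippet above takes the first and last element of each group, so it relies on the list already being in chronological order within every group. If that cannot be guaranteed, one option is to merge entries per group with Collectors.toMap, keeping the earliest start and the latest stop. The following is only a sketch under assumptions: the constructor, the getters, merge() and the date() helper are hypothetical additions, since the question does not show the full Entities class.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class OrderIndependentAggregation {

    // Assumed completion of the question's Entities class;
    // the constructor, getters and merge() are hypothetical.
    static final class Entities {
        private final Date start_dt;
        private final Date stop_dt;
        private final int groupNum;

        Entities(Date start_dt, Date stop_dt, int groupNum) {
            this.start_dt = start_dt;
            this.stop_dt = stop_dt;
            this.groupNum = groupNum;
        }

        Date getStart_dt() { return start_dt; }
        Date getStop_dt() { return stop_dt; }
        int getGroupNum() { return groupNum; }

        // Combines two entries of the same group: earliest start, latest stop.
        Entities merge(Entities other) {
            Date start = start_dt.before(other.start_dt) ? start_dt : other.start_dt;
            Date stop = stop_dt.after(other.stop_dt) ? stop_dt : other.stop_dt;
            return new Entities(start, stop, groupNum);
        }
    }

    // Returns one merged entry per group number, sorted by group number.
    static Map<Integer, Entities> aggregate(List<Entities> baseList) {
        return baseList.stream().collect(Collectors.toMap(
            Entities::getGroupNum, e -> e, Entities::merge, TreeMap::new));
    }

    // Hypothetical helper: parses a yyyy-MM-dd string into a java.util.Date.
    static Date date(String text) {
        try {
            return new SimpleDateFormat("yyyy-MM-dd").parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        List<Entities> baseList = List.of(
            new Entities(date("2019-03-07"), date("2019-11-18"), 1), // deliberately out of order
            new Entities(date("2018-11-13"), date("2019-01-13"), 1),
            new Entities(date("2019-01-14"), date("2019-03-06"), 1),
            new Entities(date("2021-11-19"), date("2022-12-23"), 2),
            new Entities(date("2020-08-23"), date("2020-08-23"), 2));

        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd");
        aggregate(baseList).values().forEach(e -> System.out.println(
            fmt.format(e.getStart_dt()) + " " + fmt.format(e.getStop_dt()) + " " + e.getGroupNum()));
    }
}
```

Because merge() compares the dates instead of relying on positions, the shuffled input still yields the two rows from the question's expected result.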

CodePudding user response:

I assume you want to aggregate Entities as follows:

  • groupNum is the grouping criterion
  • the earliest start_dt and the latest stop_dt of each group should be used.

As I already suggested in my comment, try to use a Map<Integer, Entities> where the key is the group number:

Map<Integer, Entities> aggregates = new HashMap<>();

for( Entities entry : baseList ) {
  //get the aggregate or create it if none exists for this group
  Entities aggregate = aggregates.computeIfAbsent(entry.getGroupNum(),
     k -> new Entities(entry.getStart_Dt(), entry.getStop_Dt(), entry.getGroupNum()));

  //compare and update the dates as needed
  if( aggregate.getStart_Dt().compareTo(entry.getStart_Dt()) > 0) {
    aggregate.setStart_Dt(entry.getStart_Dt());
  }

  if( aggregate.getStop_Dt().compareTo(entry.getStop_Dt()) < 0) {
    aggregate.setStop_Dt(entry.getStop_Dt());
  }
}

A few notes:

  • This approach doesn't require the list to be sorted and doesn't maintain any order. If the order should be maintained, use either a TreeMap to sort by group number or a LinkedHashMap to keep the order of the list (as far as applicable, since you're merging list elements after all).

  • If you used java.time, e.g. LocalDate instead of java.util.Date, you could use the easier-to-read methods isBefore() and isAfter().

  • If you don't want to check the dates on a newly created aggregate, use the following snippet as the loop body:

    Entities aggregate = aggregates.get(entry.getGroupNum());
    if( aggregate == null ) {
      aggregates.put(entry.getGroupNum(), new Entities(entry.getStart_Dt(), entry.getStop_Dt(), entry.getGroupNum()));
    } else {
      if( aggregate.getStart_Dt().compareTo(entry.getStart_Dt()) > 0) {
        aggregate.setStart_Dt(entry.getStart_Dt());
      }
    
      if( aggregate.getStop_Dt().compareTo(entry.getStop_Dt()) < 0) {
        aggregate.setStop_Dt(entry.getStop_Dt());
      }
    }
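
The notes above can be combined into one runnable sketch: a LinkedHashMap to keep the groups in the order they first appear, and LocalDate with isBefore() and isAfter() for the comparisons. Note that updating an aggregate in place requires non-final date fields, which the Entities class in the question does not have. The Range class below is therefore a hypothetical mutable stand-in, not the asker's class, and all of its names are illustrative.

```java
import java.time.LocalDate;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LocalDateAggregation {

    // Hypothetical mutable stand-in for Entities, using LocalDate instead of Date.
    static final class Range {
        LocalDate start;
        LocalDate stop;
        final int groupNum;

        Range(String start, String stop, int groupNum) {
            this(LocalDate.parse(start), LocalDate.parse(stop), groupNum);
        }

        Range(LocalDate start, LocalDate stop, int groupNum) {
            this.start = start;
            this.stop = stop;
            this.groupNum = groupNum;
        }

        @Override
        public String toString() {
            return start + " " + stop + " " + groupNum;
        }
    }

    static Map<Integer, Range> aggregate(List<Range> baseList) {
        // LinkedHashMap keeps the groups in the order they first appear in the list.
        Map<Integer, Range> aggregates = new LinkedHashMap<>();
        for (Range entry : baseList) {
            // Get the aggregate for this group, or create it as a copy of the entry.
            Range aggregate = aggregates.computeIfAbsent(entry.groupNum,
                k -> new Range(entry.start, entry.stop, entry.groupNum));
            // Widen the aggregate if this entry starts earlier or stops later.
            if (entry.start.isBefore(aggregate.start)) {
                aggregate.start = entry.start;
            }
            if (entry.stop.isAfter(aggregate.stop)) {
                aggregate.stop = entry.stop;
            }
        }
        return aggregates;
    }

    public static void main(String[] args) {
        List<Range> baseList = List.of(
            new Range("2018-11-13", "2019-01-13", 1),
            new Range("2019-01-14", "2019-03-06", 1),
            new Range("2019-03-07", "2019-11-18", 1),
            new Range("2020-08-23", "2020-08-23", 2),
            new Range("2021-11-19", "2022-12-23", 2));
        aggregate(baseList).values().forEach(System.out::println);
    }
}
```

With the sample data from the question this prints 2018-11-13 2019-11-18 1 followed by 2020-08-23 2022-12-23 2. Swapping LinkedHashMap for TreeMap would sort the result by group number instead.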
    