Grouping by query results based on a date using stream API java

Time:09-21

I have an entity object with several columns which looks like this:

@Entity
@Table(name = "stats", schema = "main")
@Getter @Setter
@AllArgsConstructor @NoArgsConstructor
public class StatsEntity implements Serializable {
    private static final long serialVersionUID = 743865321018464769L;

    @EmbeddedId
    private StatsEntityIds id;

    @Column(length = 500, name = "total_count")
    private Integer totalCount;

    @Column(length = 500, name = "success")
    private Integer success;

    @Column(length = 500, name = "errors")
    private Integer errors;
}

As I use a composite key, I have an embeddable class:

@Embeddable
@Getter  @Setter
public class StatsEntityIds implements Serializable {
    private static final long serialVersionUID = 743865321018464768L;

    @Column(length = 255, name = "date")
    private String date;

    @Column(length = 100, name = "workflow")
    private String workflow;

    @Column(length = 100, name = "type")
    private String type;
}

I also have a repository class with several queries. The following one selects the rows that match a given workflow value:

@Repository
public interface StatsRepository extends JpaRepository<StatsEntity, StatsEntityIds> {
    List<StatsEntity> findByIdWorkflowOrderByIdDate(String workflow);
}

The date strings have the format yyyy-MM-dd, for example 2022-04-05. The query works fine in my service layer and I do get the right output:

@Override
public List<StatsDto> findByIdWorkflowOrderByIdDate(DsiPilotStatisticsRequest request) {
    ModelMapper modelMapper = mappingConfiguration();
    List<StatsEntity> statEntities = this.statsRepository.findByIdWorkflowOrderByIdDate(request.getWorkflow());
    return modelMapper.map(statEntities, new TypeToken<List<StatsDto>>(){}.getType());
}

Here is where I'm a bit stuck. I want to perform a group-by using streams. For example, the previous query returns 1000 results spanning 300 days. Let's say I want to group them and sum the numeric columns.

To group by full date, I implemented the following stream, which returns my DTO objects with the sums for each day:

return this.statsRepository.findByIdWorkflowOrderByIdDate(request.getWorkflow())
        .parallelStream()
        .map(statEntity -> new StatsDto(
                statEntity.getId().getDate(),
                statEntity.getId().getWorkflow(),
                statEntity.getTotalCount(),
                statEntity.getSuccess(),
                statEntity.getErrors()))
        .collect(Collectors.toMap(
                StatsDto::getDate,
                Function.identity(),
                (val1, val2) -> new StatsDto(
                        val1.getDate(),
                        val1.getWorkflow(),
                        Integer.sum(val1.getTotalCount(), (val2.getTotalCount())),
                        Integer.sum(val1.getSuccess(), (val2.getSuccess())),
                        Integer.sum(val1.getErrors(), (val2.getErrors()))
                )
        ))
        .values()
        .stream()
        .sorted(Comparator.comparing(StatsDto::getDate))
        .collect(Collectors.toList());

Now I would like to build a similar query that would sum my values based on week and month.

CodePudding user response:

As I see, your code is part of a Spring application.

To begin with, it's not justifiable to use a String to represent a date. Replace it with LocalDate, and the message converter will take care of the parsing.

If for some reason you can't make this change, you will need to parse each string manually using LocalDate.parse(CharSequence). Specifying a DateTimeFormatter is unnecessary because the format of your strings complies with the ISO-8601 standard.
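As a minimal sketch of the manual parsing described above (the class and method names are hypothetical, chosen only for illustration), the single-argument LocalDate.parse() accepts the question's date strings directly:

```java
import java.time.LocalDate;

// Hypothetical helper class, not part of the question's code base.
final class IsoDateParsing {

    // LocalDate.parse(CharSequence) uses DateTimeFormatter.ISO_LOCAL_DATE,
    // so strings such as "2022-04-05" need no explicit formatter.
    static LocalDate parse(String isoDate) {
        return LocalDate.parse(isoDate);
    }

    public static void main(String[] args) {
        LocalDate date = parse("2022-04-05");
        System.out.println(date.getMonth());     // APRIL
        System.out.println(date.getDayOfWeek()); // TUESDAY
    }
}
```

A malformed string makes parse() throw a DateTimeParseException, so the caller may want to handle that case if the column can contain other values.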

Grouping data by Month

Once you have a LocalDate instance, you can extract the Month from it with LocalDate.getMonth().

If the date property were of type LocalDate, you could pass the following method reference to groupingBy() as the classifier function:

LocalDate::getMonth

If you need to differentiate between the same months of different years (which might be necessary if the range exceeds the mentioned 300 days), you can use YearMonth as the key while grouping the data (credits to @Ole V.V. for this idea).

The classifier function might look like this:

date -> YearMonth.of(date.getYear(), date.getMonth())
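Put together, grouping by YearMonth and summing one column could look like the sketch below. The Row class is a hypothetical, trimmed-down stand-in for StatsDto (a date string and a single counter), and YearMonth.from(LocalDate) is used as a shorter equivalent of the classifier shown above:

```java
import java.time.LocalDate;
import java.time.YearMonth;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class MonthlyTotals {

    // Hypothetical stand-in for StatsDto: an ISO date string and one counter.
    static final class Row {
        final String date;
        final int totalCount;

        Row(String date, int totalCount) {
            this.date = date;
            this.totalCount = totalCount;
        }
    }

    // Groups rows by year and month and sums totalCount per group.
    // The TreeMap keeps the months in chronological order.
    static Map<YearMonth, Integer> totalByMonth(List<Row> rows) {
        return rows.stream().collect(Collectors.groupingBy(
                row -> YearMonth.from(LocalDate.parse(row.date)),
                TreeMap::new,
                Collectors.summingInt(row -> row.totalCount)));
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("2022-04-05", 10),
                new Row("2022-04-20", 5),
                new Row("2022-05-01", 7),
                new Row("2023-04-02", 1));
        System.out.println(totalByMonth(rows)); // {2022-04=15, 2022-05=7, 2023-04=1}
    }
}
```

With LocalDate::getMonth as the classifier, the April rows of 2022 and 2023 would land in the same group, which is the reason for preferring YearMonth over longer ranges.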

Grouping data by the Week of Year

Grouping the data by week number requires a bit more work with the Time API.

One way to do that is to use the method LocalDate.get(), which expects a TemporalField as an argument.

To provide this argument, you can make use of the WeekFields class. To obtain an instance, use either WeekFields.ISO if the representation of the week should conform to the ISO-8601 definition, or WeekFields.of(Locale) if you need a representation of the week specific to a particular locale.

To obtain the required TemporalField instance, invoke weekOfWeekBasedYear() on the WeekFields instance.

The classifier function of groupingBy() might then be written like this:

date -> date.get(WeekFields.ISO.weekOfWeekBasedYear())

For more information, refer to this question: Get the weeknumber from a given date in Java FX
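As with months, a bare week number merges the same week of different years. One possible way around that, sketched here under the assumption that ISO weeks are wanted, is to combine the week number with WeekFields.ISO.weekBasedYear() into a single key. The class name and the "2022-W14" key format are illustrative choices only:

```java
import java.time.LocalDate;
import java.time.temporal.WeekFields;
import java.util.Locale;

// Hypothetical helper class; the key format is an illustrative choice.
final class IsoWeekKey {

    // Builds a key from the ISO week-based year and the ISO week number.
    // The week-based year is used instead of getYear() because the first
    // days of January can belong to the last week of the previous year.
    static String weekKey(LocalDate date) {
        int year = date.get(WeekFields.ISO.weekBasedYear());
        int week = date.get(WeekFields.ISO.weekOfWeekBasedYear());
        return String.format(Locale.ROOT, "%d-W%02d", year, week);
    }

    public static void main(String[] args) {
        System.out.println(weekKey(LocalDate.parse("2022-04-05"))); // 2022-W14
        System.out.println(weekKey(LocalDate.parse("2022-01-01"))); // 2021-W52
    }
}
```

A classifier such as dto -> IsoWeekKey.weekKey(LocalDate.parse(dto.getDate())) could then be passed to groupingBy() or toMap().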

CodePudding user response:

For month, replace StatsDto::getDate with

dto -> LocalDate.parse(dto.getDate(), DateTimeFormatter.ISO_DATE)
                        .with(TemporalAdjusters.firstDayOfMonth())

or, for week, with

dto -> LocalDate.parse(dto.getDate(), DateTimeFormatter.ISO_DATE)
                        .with(TemporalAdjusters.nextOrSame(DayOfWeek.SUNDAY))
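To make the effect of these two adjusters concrete, the sketch below applies them to single dates. The class and method names are hypothetical. Every date of a month maps to the first day of that month, and every date of a Monday-to-Sunday week maps to the Sunday that ends it:

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.TemporalAdjusters;

// Hypothetical helper class showing which key each adjuster produces.
final class BucketKeys {

    // All dates of one month share the first day of that month as key.
    static LocalDate monthBucket(String isoDate) {
        return LocalDate.parse(isoDate, DateTimeFormatter.ISO_DATE)
                .with(TemporalAdjusters.firstDayOfMonth());
    }

    // All dates from Monday to Sunday share that Sunday as key.
    static LocalDate weekBucket(String isoDate) {
        return LocalDate.parse(isoDate, DateTimeFormatter.ISO_DATE)
                .with(TemporalAdjusters.nextOrSame(DayOfWeek.SUNDAY));
    }

    public static void main(String[] args) {
        System.out.println(monthBucket("2022-04-05")); // 2022-04-01
        System.out.println(weekBucket("2022-04-05"));  // 2022-04-10
        System.out.println(weekBucket("2022-04-10"));  // 2022-04-10
    }
}
```

Because the keys are LocalDate values, they already distinguish different years, and sorting them orders the groups chronologically.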

CodePudding user response:

Replace the first parameter of toMap() to group the data by your own logic, and then replace the raw date value in the merged DTO in the same way.

 .collect(Collectors.toMap(
                dto -> this.getWeek(dto.getDate()),
                Function.identity(),
                (val1, val2) -> new StatsDto(
                        this.getWeek(val1.getDate()),
                        val1.getWorkflow(),
                        Integer.sum(val1.getTotalCount(), (val2.getTotalCount())),
                        Integer.sum(val1.getSuccess(), (val2.getSuccess())),
                        Integer.sum(val1.getErrors(), (val2.getErrors()))
                )
        ))

// Returns the Monday of the week containing the given date; use it as the grouping key
private String getWeek(String date){
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd");
        LocalDate localDate = LocalDate.from(formatter.parse(date));
        return formatter.format(localDate.with(ChronoField.DAY_OF_WEEK, 1));
}

The result is then grouped by week, and the date of each merged element is the Monday of its week.
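One caveat with toMap(): the merge function only runs when two elements share a key, so a week that contains a single row would keep its original date. A possible variation, sketched here with a hypothetical trimmed-down Dto class in place of StatsDto, rewrites the date to the week's Monday in a map() step before collecting:

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class WeeklyStats {

    // Hypothetical stand-in for StatsDto: an ISO date string and one counter.
    static final class Dto {
        final String date;
        final int totalCount;

        Dto(String date, int totalCount) {
            this.date = date;
            this.totalCount = totalCount;
        }
    }

    // Returns the Monday of the week containing the given ISO date.
    static String mondayOf(String isoDate) {
        return LocalDate.parse(isoDate)
                .with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY))
                .toString();
    }

    // Normalizes every date to its Monday first, then merges rows per week.
    // The TreeMap keeps the weeks in chronological order, so no extra sort is needed.
    static List<Dto> sumByWeek(List<Dto> rows) {
        Map<String, Dto> byWeek = rows.stream()
                .map(dto -> new Dto(mondayOf(dto.date), dto.totalCount))
                .collect(Collectors.toMap(
                        dto -> dto.date,
                        dto -> dto,
                        (a, b) -> new Dto(a.date, a.totalCount + b.totalCount),
                        TreeMap::new));
        return new ArrayList<>(byWeek.values());
    }

    public static void main(String[] args) {
        List<Dto> weekly = sumByWeek(Arrays.asList(
                new Dto("2022-04-05", 10),
                new Dto("2022-04-07", 5),
                new Dto("2022-04-12", 7)));
        for (Dto dto : weekly) {
            System.out.println(dto.date + " " + dto.totalCount);
        }
        // 2022-04-04 15
        // 2022-04-11 7
    }
}
```

The same shape should carry over to the full StatsDto by summing success and errors in the merge function as well.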
