Home > Mobile >  What is the best way to group timestamps into intervals of given length in Java?
What is the best way to group timestamps into intervals of given length in Java?

Time:11-05

I have a table for metrics in Oracle DB where one of the columns is the timestamp. I need to read the metrics from the DB and group them into given intervals of any length( 2 Month or 3 Hour or 1 Days or 2 years etc) between a starting timestamp and ending timestamp. The timestamp will be of format

2020-05-24T18:51:10.018-07:00

I know I can read all the entries from the table and sort them and group them into intervals by converting them all into seconds, but is there a better way to do it ?

CodePudding user response:

You may use match_recognize for this.

with t(ts) as (
  select
    current_timestamp
      interval '3' minute
    * dbms_random.value(0, level)
  from dual
  connect by level < 20
)
select *
from t
match_recognize(
  order by ts asc
  measures
    match_number() as grp
  all rows per match
  pattern(a w5min*)
  define
    /*Rows within 5 minutes after the first row in bucket*/
    w5min as ts - first(ts) < interval '5' minute
)
TS                  | GRP
:------------------ | --:
2021-11-03 06:20:40 |   1
2021-11-03 06:20:56 |   1
2021-11-03 06:23:27 |   1
2021-11-03 06:23:49 |   1
2021-11-03 06:25:23 |   1
2021-11-03 06:25:36 |   1
2021-11-03 06:32:14 |   2
2021-11-03 06:34:38 |   2
2021-11-03 06:36:29 |   2
2021-11-03 06:36:59 |   2
2021-11-03 06:39:29 |   3
2021-11-03 06:40:17 |   3
2021-11-03 06:41:07 |   3
2021-11-03 06:47:14 |   4
2021-11-03 06:48:31 |   4
2021-11-03 06:52:29 |   5
2021-11-03 06:59:22 |   6
2021-11-03 07:02:05 |   6
2021-11-03 07:04:54 |   7

db<>fiddle here

CodePudding user response:

Whether it’s the best way, who can tell? If you want a Java solution, I suggest this one.

    OffsetDateTime[] timestamps = {
            OffsetDateTime.parse("2020-05-24T18:51:10.018-07:00"),
            OffsetDateTime.parse("2020-03-07T23:45:02.399-08:00"),
            OffsetDateTime.parse("2020-05-24T20:01:11.442-07:00"),
            OffsetDateTime.parse("2020-03-08T01:03:05.079-08:00"),
            OffsetDateTime.parse("2020-05-24T19:32:34.461-07:00"),
    };
    TemporalAmount intervalLength = Duration.ofHours(2);
    
    List<OffsetDateTime> list = new ArrayList<>(Arrays.asList(timestamps));
    list.sort(Comparator.naturalOrder());
    
    List<List<OffsetDateTime>> groups = new ArrayList<>();
    if (! list.isEmpty()) {
        List<OffsetDateTime> currentGroup = new ArrayList<>();
        Iterator<OffsetDateTime> itr = list.iterator();
        OffsetDateTime first = itr.next();
        currentGroup.add(first);
        OffsetDateTime groupEnd = first.plus(intervalLength);
        while (itr.hasNext()) {
            OffsetDateTime current = itr.next();
            if (current.isBefore(groupEnd)) {
                currentGroup.add(current);
            } else { // new group
                groups.add(currentGroup);
                currentGroup = new ArrayList<>();
                groupEnd = current.plus(intervalLength);
                currentGroup.add(current);
            }
        }
        // remember to add last group
        groups.add(currentGroup);
    }
    
    groups.forEach(System.out::println);

Output:

[2020-03-07T23:45:02.399-08:00, 2020-03-08T01:03:05.079-08:00]
[2020-05-24T18:51:10.018-07:00, 2020-05-24T19:32:34.461-07:00, 2020-05-24T20:01:11.442-07:00]

The advantage of declaring intervalLength a TemporalAmount is you are free to assign either a time-based Duration to it (as above) or a date-based Period.

    TemporalAmount intervalLength = Period.ofMonths(3);

In this case the result is just one group:

[2020-03-07T23:45:02.399-08:00, 2020-03-08T01:03:05.079-08:00, 2020-05-24T18:51:10.018-07:00, 2020-05-24T19:32:34.461-07:00, 2020-05-24T20:01:11.442-07:00]

Link

Oracle tutorial: Date Time explaining how to use java.time.

  • Related