I'm trying to write a SQL query to count the number of requests received every 15 minutes within the past hour. I want to return the last three 15 minute intervals.
My data looks like:
log_date, time
2022-11-18, 2022-11-18 17:00:00 UTC
2022-11-18, 2022-11-18 17:10:00 UTC
2022-11-18, 2022-11-18 17:20:00 UTC
2022-11-18, 2022-11-18 17:40:00 UTC
My query so far:
select log_date, FLOOR(unix_timestamp(time)) div (15 * 60) as intervals, count(*)
from data_log
where log_date >= date_sub(from_unixtime(unix_timestamp(), 'yyyy-MM-dd'), 1)
and requests_recieved = true
group by log_date, time
order by intervals desc limit 3
I want it to look like:
log_date, time, count
2022-11-18, 2022-11-18 17:00:00 UTC, 2
2022-11-18, 2022-11-18 17:15:00 UTC, 1
2022-11-18, 2022-11-18 17:30:00 UTC, 1
Update: I'm using SparkSQL 2.3.1
CodePudding user response:
Convert to time
every 15 minutes with the TIMESTAMPADD
function.
SELECT * FROM (
SELECT
log_date,
TIMESTAMPADD(
MINUTE,
hour(time) * 60 (minute(time) div 15 * 15),
date(time)) time15m,
COUNT(*) count
FROM t1
GROUP BY log_date, time15m
ORDER BY time15m DESC
LIMIT 3
) s
ORDER BY time15m;
CodePudding user response:
This is mostly about date/time fuctions:
- convert
log_time
to 15-minute interval by- truncate to nearest hour with date_format()
- add 0 to 3 15-minute intervals with date_add()
- Follow the same idea to calculate the last 15-minute interval in the past hour as start interval.
Step 1. create data_log
table and prep test data
create table data_log (
log_date date,
log_time timestamp,
requests_received boolean);
insert into data_log
values
('2022-11-18', '2022-11-18 17:55:00', true),
('2022-11-18', '2022-11-18 18:00:00', true),
('2022-11-18', '2022-11-18 18:10:00', true),
('2022-11-18', '2022-11-18 18:20:00', true),
('2022-11-18', '2022-11-18 18:40:00', true),
('2022-11-18', '2022-11-18 18:50:00', true);
Step 2. Query log entries in the past hour and aggregate to 15-minute interval:
with an_hour_ago as (
select date_add(current_timestamp, interval -1 hour) as an_hour_ago),
interval_an_hour_ago as (
select date_add(date_format(an_hour_ago, '%y-%m-%d %H:00:00'), interval 15 * floor(date_format(an_hour_ago, '%i') / 15) minute) as start_interval
from an_hour_ago)
select log_date,
date_add(date_format(log_time,'%y-%m-%d %H:00:00'), interval 15 * floor(date_format(log_time,'%i') / 15) minute ) as log_interval,
count(*) as count
from data_log d,
interval_an_hour_ago i
where d.requests_received = true
and d.log_time >= i.start_interval
group by 1, 2;
Result:
log_date |log_interval |count|
---------- ------------------- -----
2022-11-18|2022-11-18 18:00:00| 2|
2022-11-18|2022-11-18 18:15:00| 1|
2022-11-18|2022-11-18 18:30:00| 1|
2022-11-18|2022-11-18 18:45:00| 1|
CodePudding user response:
This query compatible with all MySQL versions.
SET time_zone = 'UTC';
SET @i=0;
SET @t_now = unix_timestamp(); /*Request time*/
SET @hr = 3600 * (@t_now div 3600); /*Request hour*/
SELECT log_date, from_unixtime(qr * 15 * 60 @hr) AS `time`, COUNT(qr) AS `count` FROM (
SELECT
log_date,
@i:=((@t_now - unix_timestamp(`time`)) div 60) AS m, /* minutes elapsed */
FLOOR(@i/15) as qr /*hour quarter*/
FROM data_log WHERE requests_recieved = true
) AS t1
WHERE m <= 60 /* Last hour only */
GROUP BY log_date, qr /* Group by each hour quarter */
ORDER BY log_date, qr
;
sample of results from my test data:
log_date |time |count ------------ --------------------- ----- 2022-11-19 | 2022-11-19 03:30:00 | 3 2022-11-19 | 2022-11-19 03:45:00 | 4