I have the following scenario
System | Subsystem & Filename | File Load Start Time | File Load End Time |
---|---|---|---|
Alpha | A1 transactiontxt | 2022-06-19 08:00:00 | 2022-06-19 08:00:02 |
Alpha | A2 userscsv | 2022-06-19 08:00:02 | 2022-06-19 08:00:05 |
Alpha | A2 employeescsv | 2022-06-19 08:00:05 | 2022-06-19 08:00:08 |
Alpha | A1 managerscsv | 2022-06-19 08:00:08 | 2022-06-19 08:00:16 |
Alpha | A3 customerscsv | 2022-06-19 08:00:01 | 2022-06-19 08:00:04 |
Gamma | A1 transactiontxt | 2022-06-19 10:00:48 | 2022-06-19 10:00:53 |
Gamma | A2 userscsv | 2022-06-19 10:00:53 | 2022-06-19 10:00:54 |
Gamma | A2 employeescsv | 2022-06-19 10:00:27 | 2022-06-19 10:00:30 |
Gamma | A1 managerscsv | 2022-06-19 10:00:11 | 2022-06-19 10:00:17 |
Gamma | A3 customerscsv | 2022-06-19 10:00:13 | 2022-06-19 10:00:14 |
I want to be able to group the summary statistics by System. The info needed is when the overall started (earliest time), when it ended (latest time), and the time it took for each subsystem to occur, in seconds. From the example above, the result should look as below:
System | Overall System Load Start Time | Overall System Load End Time | A1 Time Taken | A2 Time Taken | A3 Time Taken |
---|---|---|---|---|---|
Alpha | 2022-06-19 08:00:00 | 2022-06-19 08:00:16 | 00:00:10 | 00:00:06 | 00:00:03 |
Gamma | 2022-06-19 10:00:11 | 2022-06-19 10:00:54 | 00:00:11 | 00:00:04 | 00:00:01 |
I cannot find a way to do this in a query, I'm trying to do select subqueries in the select clause for each column, and at the end group by only System. But this is not possible because I'd have to use an aggregate function which is not being supported with case statements in subqueries in the select clause
My approach was something like
SELECT System,
min(StartTime) as 'File Load Start Time',
max(EndTime) as 'File Load End Time',
CASE WHEN SubSystem LIKE 'A1%' THEN SUM(DATEDIFF(s, min(StartTime), max(EndTime))) Else 0 END AS 'A1 Time Taken',
CASE WHEN SubSystem LIKE 'A2%' THEN SUM(DATEDIFF(s, min(StartTime), max(EndTime))) Else 0 END AS 'A2 Time Taken',
CASE WHEN SubSystem LIKE 'A3%' THEN SUM(DATEDIFF(s, min(StartTime), max(EndTime))) Else 0 END AS 'A3 Time Taken'
FROM TABLE GROUP BY SYSTEM
But this does not work because the case statements need to be in a group by clause as well, and I cannot aggregate them
CodePudding user response:
I am assuming your subsystems are finite and doesn't require dynamic columns (pivot/crosstab).
Folloing query should give you the desired output.
SELECT
system_name AS "System",
min(file_load_start_time) AS "Overall System Load Start Time",
max(file_load_end_time) AS "Overall System Load End Time",
(
SELECT
sec_to_time(sum(timestampdiff(SECOND, tsl_a1.file_load_start_time, tsl_a1.file_load_end_time)))
FROM
t_system_log tsl_a1
WHERE
tsl_a1.system_name = tsl.system_name
AND LEFT(tsl_a1.subsystem_filename,
2) = 'A1') AS "A1 Time Taken",
(
SELECT
sec_to_time(sum(timestampdiff(SECOND, tsl_a2.file_load_start_time, tsl_a2.file_load_end_time)))
FROM
t_system_log tsl_a2
WHERE
tsl_a2.system_name = tsl.system_name
AND LEFT(tsl_a2.subsystem_filename,
2) = 'A2') AS "A2 Time Taken",
(
SELECT
sec_to_time(sum(timestampdiff(SECOND, tsl_a3.file_load_start_time, tsl_a3.file_load_end_time)))
FROM
t_system_log tsl_a3
WHERE
tsl_a3.system_name = tsl.system_name
AND LEFT(tsl_a3.subsystem_filename,
2) = 'A3') AS "A3 Time Taken"
FROM
t_system_log tsl
GROUP BY
system_name;
In case, your subsystems are dynamic, you should use pivot query instead of subquery. Also subquery might impact on the performance.
CodePudding user response:
Your original query just needs a little bit of tweaking in order to work properly. I tested it in workbench. It works now.
SELECT System,
min(StartTime) as 'File Load Start Time',
max(EndTime) as 'File Load End Time',
sum( case when left(subsystem,2)='a1' then to_seconds(endtime)-to_seconds(starttime) else 0 end
) as 'A1 Time Taken',
sum( case when left(subsystem,2)='a2' then to_seconds(endtime)-to_seconds(starttime) else 0 end
) as 'A2 Time Taken',
sum( case when left(subsystem,2)='a3' then to_seconds(endtime)-to_seconds(starttime) else 0 end
) as 'A3 Time Taken'
FROM TABLE GROUP BY SYSTEM
;