Home > front end >  Using a GROUP BY along with an aggregate function to scope the applied query
Using a GROUP BY along with an aggregate function to scope the applied query

Time:11-08

Looking for help writing a mysql query.

I need to find the amount of time in seconds in total that there are between all columns on the child table that belong to a parent_id. The complication is that the difference in seconds needs to be scoped/grouped by the group_id.

Parent table

id
1
2

Child table

id parent_id group_id created_at
1 1 1 2022-01-01 00:00:00
2 1 1 2022-01-01 00:01:00
3 1 2 2022-01-01 00:02:00
4 1 2 2022-01-01 00:04:00
5 2 3 2022-01-01 00:10:00
6 2 3 2022-01-01 00:12:00

Actual result

Parent table

id child_elapsed_time_in_seconds
1 240
2 120

Desired result

Parent table

id child_elapsed_time_in_seconds
1 180
2 120

I know I likely need to use TIMESTAMPDIFF(SECOND, MIN(created_at), MAX(created_at)) along with a GROUP BY but im just not sure on the best way to achieve this.

Here is my current query.

SELECT
    id, (
        SELECT
            TIMESTAMPDIFF(SECOND, MIN(created_at), MAX(created_at))
        FROM
            `child`
        WHERE
            `parent`.`id` = `child`.`parent_id`
        LIMIT 1
    ) AS `child_elapsed_time_in_seconds`
FROM
    `parent`

@GMB's answer helped me figure out my final sql query.

SELECT
    id,
    (
        SELECT SUM(c.elapsed) AS child_elapsed_time_in_seconds
        FROM (
            SELECT group_id, TIMESTAMPDIFF(SECOND, min(created_at), max(created_at)) AS elapsed
            FROM child
            WHERE `parent_id` = `parent`.`id`
            GROUP BY group_id
        ) AS c
    ) AS `child_elapsed_time_in_seconds`
FROM
    `parent`

CodePudding user response:

We can do this with two levels of aggregation:

select parent_id, sum(elapsed) child_elapsed_time_in_seconds
from (
    select parent_id, group_id, 
        timestampdiff(second, min(created_at), max(created_at)) as elapsed
    from child_table
    group by parent_id, group_id
) c
group by parent_id

The subquery aggregates by parent id and by group, and compute the difference between the earliest and the latest timestamps. Then, the outer query aggregates by parent id only, and sums the timestamp diffs.

Note that we don't need parent_table to get the result that we want. But if you need it for some reason, you can bring it with a join:

select p.*, c.child_elapsed_time_in_seconds
from (    
    select parent_id, sum(elapsed) child_elapsed_time_in_seconds
    from (
        select parent_id, group_id, 
            timestampdiff(second, min(created_at), max(created_at)) as elapsed
        from child_table
        group by parent_id, group_id
    ) c
    group by parent_id
) c
inner join parent_table p on p.id = c.parent_id
  • Related