I have the following table;
| uuid | user_id | user_created | profile_edited |
|:---- |:------:| -----:|:------:|:------:|
| 1c5d134c | user_3 | 2022-11-10T19:09:05 00:00 | 2022-11-18T18:00:05 00:00
| 1c5d134b | user_3 | 2022-11-10T19:09:05 00:00 | 2022-11-15T18:00:05 00:00
| 1c5d134a | user_3 | 2022-11-10T19:09:05 00:00 | 2022-11-14T18:00:05 00:00
| 1c5d434a | user_1 | 2022-11-10T19:09:05 00:00 | 2022-11-13T19:09:05 00:00
| 1c8b424b | user_1 | 2022-11-10T19:09:05 00:00 | 2022-11-10T16:09:05 00:00
| 1c8b424c | user_2 | 2022-11-01T19:09:05 00:000 | 2022-11-19T19:09:05 00:00
| 1c8b424a | user_2 | 2022-11-01T19:09:05 00:000 | 2022-11-22T19:09:05 00:00
| 1c8b424b | user_2 | 2022-11-01T19:09:05 00:000 | 2022-11-24T19:09:05 00:00
| 1c5d434c | user_1 | 2022-11-10T19:09:05 00:00 | 2022-11-18T19:09:05 00:00
| 1c5d434e | user_1 | 2022-11-10T19:09:05 00:00 | 2022-11-16T19:09:05 00:00
| 1c5d434c | user_1 | 2022-11-10T19:09:05 00:00 | 2022-11-14T19:09:05 00:00
In the example above we can see the second and third profile edit times as below;
| user_id | second_edit | third_edit | difference
|:---- |:------:| -----:|:------:|:------:|
| user_1 | 2022-11-13T19:09:05 00:00 | 2022-11-14T19:09:05 00:00 | 24 hours
| user_2 | 2022-11-22T19:09:05 00:00 | 2022-11-24T19:09:05 00:00 |48 hours
| user_3 | 2022-11-15T18:00:05 00:00 | 2022-11-18T18:00:05 00:00 | 72 hours
And the resulting query should find the median between 24, 48, and 72.
CodePudding user response:
In POSTGRES you can do something like this.
first get the row_number for every uuid descending by the date edited, The rest is getting the difference.
You can use this and derive similar solutions for MySQL
WITH CTE AS(
SELECT
"uuid", "user_id", "user_created","profile_edited",
ROW_NUMBER() OVER (PARTITION BY "uuid" ORDER BY "profile_edited" DESC) rn
FROM TableB)
SELECT
"uuid",string_agg(DISTINCT "user_id", ','), MAX("user_created") "user_created",
AGE((SELECT "profile_edited" FROM CTE WHERE rn = 2 AND "uuid" = CTE."uuid" LIMIT 1)
, (SELECT "profile_edited" FROM CTE WHERE rn = 3 AND "uuid" = CTE."uuid" LIMIT 1)) AS difference
FROM CTE
GROUP BY "uuid"
uuid | string_agg | user_created | difference |
---|---|---|---|
1c5d434c | user_1 | 2022-11-10 19:09:05 | 2 days |
1c8b424a | user_1,user_2 | 2022-11-10 19:09:05 | 2 days |
SELECT 2
To your changed Question
WITH CTE AS(
SELECT
"user_id", "user_created","profile_edited",
ROW_NUMBER() OVER (PARTITION BY "user_id" ORDER BY "profile_edited" ASC) rn
FROM TableB),
CTE2 AS(
SELECT
"user_id"
,MAX("profile_edited") filter (where rn = 2 AND "user_id" = CTE."user_id") "second"
,MAX("profile_edited") filter (where rn = 3 AND "user_id" = CTE."user_id") "third",
abs(extract(epoch from
MAX("profile_edited") filter (where rn = 2 AND "user_id" = CTE."user_id") -
MAX("profile_edited") filter (where rn = 3 AND "user_id" = CTE."user_id")
)/3600)::INTEGER AS difference_hours
FROM CTE
GROUP BY "user_id")
SELECT * FROM CTE2
UNION ALL
SELECT 'Median',NULL,NULL,AVG(difference_hours)::INTEGER FROM CTE2
user_id | second | third | difference_hours |
---|---|---|---|
user_1 | 2022-11-13 19:09:05 | 2022-11-14 19:09:05 | 24 |
user_2 | 2022-11-22 19:09:05 | 2022-11-24 19:09:05 | 48 |
user_3 | 2022-11-15 18:00:05 | 2022-11-18 18:00:05 | 72 |
Median | null | null | 48 |
SELECT 4
CodePudding user response:
In postgres you can use the percentile_cont
ordered-set aggregate function to find the median as the following:
with edit_diff as
(
select round(extract('epoch' from max(profile_edited)-min(profile_edited))/3600) difference
from
(
select *,
row_number() over (partition by user_id order by profile_edited) rn
from table_name
) T
where rn in(2,3)
group by user_id
)
select percentile_cont(0.5) within group (order by difference) median
from edit_diff
See a demo.
OUTPUT: 48
For MySQL you can use row_number()
and count
functions to calculate the median as the following:
with edit_diff as
(
select timestampdiff(hour, min(profile_edited), max(profile_edited)) difference
from
(
select *,
row_number() over (partition by user_id order by profile_edited) rn
from table_name
) T
where rn in(2,3)
group by user_id
)
select round(avg(difference), 2) median
from
(
select difference,
row_number() over (order by difference) diffOrder,
count(*) over () cn
from edit_diff
) T
where diffOrder between floor((cn 1) / 2) and ceil((cn 1) / 2);