I have a table with weekly data acquisitions and a second table with daily data acquisitions. I need to make a union between the tables (INNER JOIN) through the id key and for the daily table values, aggregate them in an average per week.
So, my resulting table will also be with weekly dates, and for each week it should present the average value of the id in the previous 7 days (including the day of acquisition).
Data samples: DBfiddle
In this example provided, the join and group operation are returning the correct results. But he is taking the global average, not the average corresponding to the previous 7 days for each weekly acquisition date.
Does anyone have an idea of the best way to solve this?
CodePudding user response:
Subquery will be enough, but only if you are interested exactly in last 7 days for each row
select id, rating, aquisition_date ,
(select avg(value) from table2 a2 where a2.id = a1.id and
a2.aquisition_date between a1.aquisition_date - 6 and a1.aquisition_date) avg
from table1 a1
Results
id | rating | aquisition_date | avg |
---|---|---|---|
1 | 10 | 2022-04-18 | 5 |
1 | 8.2 | 2022-04-25 | 14.57142 |
In attached example you also provide time values (but due to table definition they are cuted off during insert)
CodePudding user response:
Short Answer:
You have to include the week into the JOIN
:
SELECT
table1.id
,EXTRACT(WEEK FROM table1.aquisition_date) AS week
,table1.rating AS table1_rating
,AVG(table2.value) as table2_value_avg
FROM
table1 table1
INNER JOIN
table2 table2
ON
table2.id = table1.id AND
EXTRACT(WEEK FROM table1.aquisition_date) = EXTRACT(WEEK FROM table2.aquisition_date)
GROUP BY
table1.id
,table1.rating
,table1.aquisition_date
;
This gives us
id week table1_rating table2_value_avg
1 17 8.2 20.0000000000000000
1 16 10 13.1428571428571429
Long Answer: How do we get there? Consider the inner join without the aggregation:
Your first attempt is
SELECT
table1.id
,EXTRACT(WEEK FROM table1.aquisition_date) AS week1
,EXTRACT(WEEK FROM table2.aquisition_date) AS week2
-- ,AVG(EXTRACT(WEEK FROM table2.aquisition_date)) AS week2
,table1.rating AS table1_rating
,table2.value AS table2_value
-- ,AVG(table2.value) as table2_value_avg
FROM
table1 table1
INNER JOIN
table2 table2
ON
table2.id = table1.id
This will match rows from both tables which do not belong together, e.g values from week 15 with ratings from week 16.
id week1 week2 table1_rating table2_value
1 16 15 10 2
1 16 15 10 4
[... 40 more rows]
Actually it matches every row from table1 with table2. That's why the average is the global average, even if the GROUP BY
works.
So the solution from above is removing those rows from the inner join where the weeks do not match.
N.B.: If you want to include data in your results where only one of the tables has data you need a FULL OUTER JOIN
but the idea is the same.
N.B.: (Thanks @shawnt00) (EXTRACT WEEK FROM ...) means ISO 8601 week. It starts on Monday and there might be other definitions of a week.
N.B.: (Thanks @shawnt00) The solution ignores the year. If data from more than one year is present, you need to match the year and week in the JOIN.