I've stripped this down as much as possible, so I hope it's easy to understand. Currently, I have two tables with website data, display
and traffic
. Both tables have the columns date
and site
.
display
has revenue
,
traffic
has pageviews
.
My end goal is to produce results with revenue and pageviews for each of our sites, grouped by week. So, four columns: week
, site
, pageviews
, revenue
- where week
is week(date) AS week
.
Here is my best attempt so far:
SELECT display.site,
Week(display.date) AS dateweek,
Sum(revenue) AS revenue,
tr.pageviews
FROM display
INNER JOIN (SELECT traffic.site,
Week(date) AS date_granularity,
Sum(pageviews) AS pageviews
FROM traffic
GROUP BY traffic.site,
date_granularity) AS tr
ON tr.date_granularity = Week(display.date)
WHERE display.site = 'Google'
GROUP BY dateweek,
display.site,
tr.pageviews
(Google is in place of the actual site name of our client)
I will take week 17 from the results grid as an example:
-------- ---------- -------------------- -----------
| site | dateweek | revenue | pageviews |
-------- ---------- -------------------- -----------
| Google | 17 | 1424.0631110000002 | 1151716 |
| Google | 17 | 1424.0631110000002 | 1169364 |
| Google | 17 | 1424.0631110000002 | 1427273 |
| Google | 17 | 1424.0631110000002 | 6776535 |
-------- ---------- -------------------- -----------
Doing queries from the individual tables with no joins, this is the correct and expected output for week 17:
-------- ---------- -------------------- -----------
| site | dateweek | revenue | pageviews |
-------- ---------- -------------------- -----------
| Google | 17 | 1424.0631110000002 | 1169364 |
-------- ---------- -------------------- -----------
It feels like I'm so close, as the row I expect I'm already getting, but there's some random extras. Any ideas? If there's anything that's unclear I'm happy to add more detail.
CodePudding user response:
If you only put one ON
condition, the JOIN
will match the condition only. In your case, you're matching just the WEEK
so that means regardless of any site from the subquery, as long as the WEEK
value match, it will return result. The fix is very simple, you just need to add 1 more condition in ON
, so:
SELECT display.site,
Week(display.date) AS dateweek,
Sum(revenue) AS revenue,
tr.pageviews
FROM display
INNER JOIN (SELECT traffic.site,
Week(date) AS date_granularity,
Sum(pageviews) AS pageviews
FROM traffic
GROUP BY traffic.site,
date_granularity) AS tr
ON tr.date_granularity = Week(display.date)
/*add another condition here*/
AND tr.site=display.site
WHERE display.site = 'Google'
GROUP BY dateweek,
display.site,
tr.pageviews