I'm having trouble with these two tables:
ad_spend
| date | campaign | spend |
| ------- | ------ | ----- |
| 1/1/2021| C | 500 |
| 1/1/2021| D | 500 |
| 1/1/2021| C | 500 |
| 1/1/2021| D | 500 |
| 1/2/2021| C | 500 |
| 1/2/2021| D | 500 |
| 1/2/2021| C | 500 |
| 1/3/2021| D | 500 |
...
signups_from_ad
| date | campaign | customer_id |
| ------- | ------ | ----------- |
| 1/1/2021| C | 1 |
| 1/1/2021| C | 2 |
| 1/1/2021| D | 3 |
| 1/1/2021| C | 4 |
| 1/2/2021| C | 5 |
| 1/2/2021| D | 6 |
| 1/2/2021| D | 7 |
| 1/3/2021| D | 8 |
...
I need to create one table from these two that would return the spend, the number of sign-ups, and the cost per sign-up by date and campaign and I'm at a loss of how to proceed. I tried to do joins, but it doesn't seem like they are the right solution. I know that it's possible to combine two tables that don't have a common column using union/union all & cross joins, but I feel like that won't be helpful here either. Thank you in advance for any suggestions.
EDIT: As suggested in the comments I re-tried this inner join:
Select distinct s.date, s.campaign, count(s.customer_id), sum(a.spend), sum(a.spend)/count(s.customer_id) as 'cost'
from signups_from_dig as s
inner join ad_spend as a
on a.date=s.date
group by s.date, s.campaign;
The code appears to double count entries, but I used "distinct" in front of count seemed to fix it partially. However, using distinct for sum doesn't seem to work (it makes it only count once) and I'm not sure how to fix that.
CodePudding user response:
Aggregate the tables by campaign and date and then join the results. In standard SQL that would be:
select date, campaign, sp.total_spent, si.total_signups
from
(
select date, campaign, total_spent
from ad_spend
group by date, campaign
) sp
full outer join
(
select date, campaign, count(*) as total_signups
from signups_from_ad
group by date, campaign
) si using (date, campaign)
order by date, campaign;
MySQL, however does not support full outer joins. If it is guaranteed for every date-campaign pair that has a signup to also have a ad_spend row, then just turn the full join into a left outer join. If this is not the case then get the distinct date-campaign pairs first and then left outer join the aggregated tables:
select date, campaign, sp.total_spent, si.total_signups
from
(
select date, campaign from ad_spend
union
select date, campaign from signups_from_ad
) pairs
left outer join
(
select date, campaign, total_spent
from ad_spend
group by date, campaign
) sp using (date, campaign)
full outer join
(
select date, campaign, count(*) as total_signups
from signups_from_ad
group by date, campaign
) si using (date, campaign)
order by date, campaign;
CodePudding user response:
If I understand your needs correctly, you need a query like this
SELECT
*,
spend/signups AS cost
FROM (
SELECT
s.date,
s.campaign,
COUNT(s.customer_id) AS signups,
(SELECT SUM(ad.spend) FROM ad_spend ad WHERE ad.date = s.date AND ad.campaign = s.campaign) AS spend
FROM signups_from_dig s
GROUP BY s.date, s.campaign
) t;
It produces
date | campaign | signups | spend | cost |
---|---|---|---|---|
1/1/2021 | C | 3 | 1000 | 333.3333 |
1/1/2021 | D | 1 | 1000 | 1000.0000 |
1/2/2021 | C | 1 | 1000 | 1000.0000 |
1/2/2021 | D | 2 | 500 | 250.0000 |
1/3/2021 | D | 1 | 500 | 500.0000 |
You can check a working demo here