I have a table of transactions from a marketplace. It has three fields: buyer_email, seller_email, date.
I would like to know who are the most active buyers and sellers, assuming that buyers can be sellers and that sellers can be buyers. By "most active" I mean the users that have made the most transactions in the last N days - whether they are buyers or sellers.
I wrote this query to get the most active buyers:
SELECT buyer_email, COUNT(buyer_email) AS number_of_purchases
FROM table
GROUP BY buyer_email
ORDER BY COUNT(buyer_email) DESC
The results look like this:
| buyer_email | number_of_purchases |
| -------------------------------------- | -------------------------- |
| [email protected] | 74 |
| [email protected] | 42 |
| [email protected] | 31 |
| [email protected] | 19 |
And I wrote another query to get the list of most active sellers:
SELECT seller_email, COUNT(seller_email) AS number_of_sales
FROM table
GROUP BY seller_email
ORDER BY COUNT(seller_email) DESC
The results of which look like this:
| seller_email | number_of_sales |
| ---------------------------------- | ---------------------- |
| [email protected] | 156 |
| [email protected] | 89 |
| [email protected] | 23 |
| [email protected] | 12 |
I would like to combine both query results to get something like this:
| user_email | number_of_sales | number_of_purchases | total |
| ------------------------ | ------------------- | ------------------- | -------- |
| [email protected] | 156 | 31 | 187 |
| [email protected] | 89 | 19 | 108 |
| [email protected] | 12 | 74 | 86 |
| [email protected] | 23 | 42 | 65 |
However, there are some things to take into account:
The cardinality of both sets, buyers and sellers, is not the same.
There are buyers that aren't sellers, and sellers that aren't buyers. The number_of_sales for the former would be 0, and the number_of_purchases for the latter would be 0 too. This is tricky, as the GROUP BY clause doesn't group by 0-sized groups.
What I've tried:
Using a JOIN statement ON seller_email = buyer_email, but this gives me as a results the rows where the seller and the buyer are the same in a given transaction - people who sell something to themselves.
Experimenting with UNION, but failing to get anything relevant.
I'm not sure if that's clear, but if anyone could help me achieve the aforementioned result, that would be great.
CodePudding user response:
Use FULL OUTER JOIN.
WITH
buyer_data AS (
SELECT
buyer_email AS email,
COUNT(buyer_email) AS number_of_purchases
FROM table
GROUP BY
buyer_email
ORDER BY
COUNT(buyer_email) DESC
),
seller_data AS (
SELECT
seller_email AS email,
COUNT(seller_email) AS number_of_sales
FROM table
GROUP BY
seller_email
ORDER BY
COUNT(seller_email) DESC
)
SELECT
COALESCE(buyer_data.email, seller_data.email) AS email,
COALESCE(number_of_purchases, 0) AS number_of_purchases,
COALESCE(number_of_sales, 0) AS number_of_sales,
COALESCE(number_of_purchases, 0) COALESCE(number_of_sales, 0) AS total
FROM buyer_data
FULL OUTER JOIN seller_data ON seller_data.email = buyer_data.email
CodePudding user response:
Consider below approach
select *, purchase sale as total from (
select buyer_email as user_email, 'purchase' type from your_table
union all select seller_email, 'sale' from your_table
)
pivot (count(*) for type in ('purchase', 'sale'))