I have a transaction table with transaction_id and customer_id and date of transaction as transaction_date. I want to find distribution of monthly frequency of customers for the year 2022. Any help would be appreciated as I am new to data science and analytics.
CodePudding user response:
Are you looking for something like this:
SELECT MONTH([date])
,COUNT(customer_id) -- customers per month
,COUNT(DISTINCT customer_id) -- unique customers per month
FROM [transaction]
WHERE YEAR([date]) = 2022
GROUP BY MONTH([date])
the above will give you the total of customers per months. And if you need to drill down and find the transactions for each customer per month:
SELECT customer_id
,MONTH([date])
,COUNT(transaction_id)
FROM [transaction]
WHERE YEAR([date]) = 2022
GROUP BY customer_id
CodePudding user response:
To find the distribution of monthly frequency of customers for the year 2022, you can use the following steps:
- Extract the year and month from the transaction_date column using the DATE_TRUNC or DATE_EXTRACT function.
- Filter the table to only include transactions that occurred in 2022 using a WHERE clause.
- Group the remaining transactions by customer_id and the extracted month.
- Use the COUNT() function to count the number of transactions per customer per month.
- Use the GROUP BY clause to group the results by the extracted month.
- Use the COUNT() function again to count the number of customers that had transactions in each month.
Sample code:
WITH transactions_2022 AS (
SELECT customer_id,
DATE_TRUNC('month', transaction_date) AS month,
COUNT(transaction_id) as frequency
FROM transactions
WHERE DATE_TRUNC('year', transaction_date) = '2022'
GROUP BY 1,2
)
SELECT month,
COUNT(DISTINCT customer_id) as customer_count,
SUM(frequency) as transaction_count
FROM transactions_2022
GROUP BY 1