Home > OS >  Retrieve Customers with a Monthly Order Frequency greater than 4
Retrieve Customers with a Monthly Order Frequency greater than 4

Time:05-21

I am trying to optimize the below query to help fetch all customers in the last three months who have a monthly order frequency 4 for the past three months.

Customer ID Feb Mar Apr
0001 4 5 6
0002 3 2 4
0003 4 2 3

In the above table, the customer with Customer ID 0001 should only be picked, as he consistently has 4 or more orders in a month.

Below is a query I have written, which pulls all customers with an average purchase frequency of 4 in the last 90 days, but not considering there is a consistent purchase of 4 or more last three months.

Query:

SELECT distinct lines.customer_id Customer_ID, (COUNT(lines.order_id)/90) PurchaseFrequency
from fct_customer_order_lines lines
LEFT JOIN product_table product
ON lines.entity_id= product.entity_id
AND lines.vendor_id= product.vendor_id
WHERE LOWER(product.country_code)= "IN"
AND lines.date >= DATE_SUB(CURRENT_DATE() , INTERVAL 90 DAY )
AND lines.date < CURRENT_DATE()
GROUP BY Customer_ID
HAVING PurchaseFrequency >=4;

I tried to use window functions, however not sure if it needs to be used in this case. Your quick response is highly appreciated!

CodePudding user response:

I would sum the orders per month instead of computing the avg and then retrieve those who have that sum greater than 4 in the last three months.

Also I think you should select your interval using "month(CURRENT_DATE()) - 3" instead of using a window of 90 days. Of course if needed you should handle the case of when current_date is jan-feb-mar and in that case go back to oct-nov-dec of the previous year.

I'm not familiar with Google BigQuery so I can't write your query but I hope this helps.

CodePudding user response:

So I've found the solution to this using WITH operator as below:

WITH filtered_orders AS (
   select
  distinct customer_id ID,
  extract(MONTH from date) Order_Month,
  count(order_id) CountofOrders
  from customer_order_lines` lines
  where EXTRACT(YEAR FROM date) = 2022 AND EXTRACT(MONTH FROM date) IN (2,3,4)
  group by ID, Order_Month
  having CountofOrders>=4)
select distinct ID
from filtered_orders 
group by ID
having count(Order_Month) =3;

Hope this helps!

CodePudding user response:

An option could be first count the orders by month and then filter users which have purchases on all months above your threshold:

WITH ORDERS_BY_MONTH AS (
    SELECT 
        DATE_TRUNC(lines.date, MONTH) PurchaseMonth,
        lines.customer_id Customer_ID, 
        COUNT(lines.order_id) PurchaseFrequency
    FROM fct_customer_order_lines lines
    LEFT JOIN product_table product
        ON lines.entity_id= product.entity_id
        AND lines.vendor_id= product.vendor_id
    WHERE LOWER(product.country_code)= "IN"
        AND lines.date >= DATE_SUB(CURRENT_DATE() , INTERVAL 90 DAY )
        AND lines.date < CURRENT_DATE()
    GROUP BY PurchaseMonth, Customer_ID
)
SELECT 
    Customer_ID,
    AVG(PurchaseFrequency) AvgPurchaseFrequency
FROM ORDERS_BY_MONTH
GROUP BY Customer_ID
HAVING COUNT(1) = COUNTIF(PurchaseFrequency >= 4)
  • Related