I need to calculate the median transaction amount for each customer in the past 52 weeks, but percentile_cont returns NULL if there's no transaction for a particular customer. In such a case, I have to replace NULL with zero, I acheived this by using a CASE clause in sql, however I am using PERCENTILE_CONT twice for this purpose which makes the query slow for a huge list of customers to process. is there a better way to use the PERCENTILE_CONT only once inside the CASE clause?
SELECT DISTINCT customer,
CASE WHEN
PERCENTILE_CONT(0.5)
WITHIN GROUP (ORDER BY
transamt) OVER
(PARTITION BY
customer) IS NOT NULL THEN
PERCENTILE_CONT(0.5) WITHIN
GROUP (ORDER BY transamt)
OVER (PARTITION BY
customer)
ELSE 0
END AS median_amt
FROM trans_table
WHERE trans_date BETWEEN DATEADD(WEEK, -52, GETDATE() ) AND GETDATE()
CodePudding user response:
According to "...percentile_cont returns NULL if there's no transaction for a particular customer...", it could be a data issue or by design there could be entries for a customer without any transactions could have trans_date
in the past 52 weeks but transamt
is null. If that's the case, maybe this work for you by changing order by transamt
to the following:
select distinct
customer,
percentile_cont(0.5) within group
(order by case when transamt is null then 0 else transamt end)
over (partition by customer) as median_amt
from trans_table;
Second guess: if percentile_cont() return NULL then show 0. Using coalesce().
select distinct customer,
coalesce(percentile_cont(0.5) within group (order by transamt) over (partition by customer),0) as median_amt
from trans_table;