Home > Software design >  SQL Server - How can I query to return products only when their sales exceeds a certain percentage?
SQL Server - How can I query to return products only when their sales exceeds a certain percentage?

Time:10-17

The basic requirement is this: We capture sales by day of week and product. If more than half* of the day's sales came from one product, we want to capture that. Else we show "none".

So image we sell shoes, pants and shirts. On Monday, we sold $100 of each. So it was a three way split, and each category accounted for 33.3% of sales. We show "none". On Tuesday though, half of our sales came from shoes, and on Wednesday, 80% from shirts. So we want to see that.

The query below returns the desired result, but I'm not a fan of a queries within queries within queries. They can be inefficient and hard to read, and I feel like there's a cleaner way. Can this be improved upon?

*The requirement for half will be a parameter (@threshold here). In some cases, we might want to show only when it's 75% or more of sales. Obviously that parameter has to be >= 50%.

declare @sales as table (day_of_week varchar(16), product varchar(8), sales_amt int)
insert into @sales values ('monday', 'shoes', 100)
insert into @sales values ('monday', 'pants', 100)
insert into @sales values ('monday', 'shirts', 100)
insert into @sales values ('tuesday', 'shoes', 500)
insert into @sales values ('tuesday', 'pants', 300)
insert into @sales values ('tuesday', 'shirts', 200)
insert into @sales values ('wednesday', 'shoes', 100)
insert into @sales values ('wednesday', 'pants', 100)
insert into @sales values ('wednesday', 'shirts', 800)

declare @threshold as decimal(3,2) = 0.5

select day_of_week, case when pct_of_day >= @threshold then product else 'none' end half_of_sales from (
    select day_of_week, product, pct_of_day, row_number() over (partition by day_of_week order by pct_of_day desc) _rn
    from (
        select day_of_week, product, sum(sales_amt) * 1.0 / sum(sum(sales_amt)) over (partition by day_of_week) pct_of_day
        from @sales
        group by day_of_week, product
    ) x
) z 
where _rn = 1

CodePudding user response:

maybe a little easier to read?

DECLARE @threshold AS decimal(3, 2) = 0.5;

WITH ssum
AS (SELECT
  day_of_week,
  SUM(sales_amt) sa
FROM @sales
GROUP BY day_of_week)

SELECT
  s.day_of_week,
  MAX(CASE WHEN s.sales_amt * 1.0 / ssum.sa >= @threshold THEN s.product ELSE 'none' END) threshold
FROM ssum
INNER JOIN @sales AS s
  ON ssum.day_of_week = s.day_of_week
GROUP BY s.day_of_week

CodePudding user response:

Firstly, you can place the nested queries in CTEs, which can make them easier to read. It won't make them more efficient, but then nested queries are not necessarily inefficient in themselves, not sure why you think so

Second, the query could be optimized, because the row-numbering is equally valid on the non-percentaged sum(sales_amt) value, so it can be on the same level as the windowed sum over

declare @threshold as decimal(3,2) = 0.5;

with GroupedSales as (
    select
      day_of_week,
      product,
      sum(sales_amt) * 1.0 / sum(sum(sales_amt)) over (partition by day_of_week) pct_of_day,
      row_number() over (partition by day_of_week order by sum(sales_amt) desc) _rn
    from @sales
    group by
      day_of_week,
      product
)
select
  day_of_week,
  case when pct_of_day >= @threshold
    then product
    else 'none'
  end half_of_sales
from GroupedSales
where _rn = 1;
  • Related