Home > OS >  Calculate lift for every row and compare it with the average lift of region and year
Calculate lift for every row and compare it with the average lift of region and year

Time:07-02

I have a table that looks like the one below:

Shop Year Region Waste Avg Waste (Year,Region) Lift Column_I_want_To_Calculate (apply case when statements) CASE WHEN Lift > Avg(Lift) OVER (PARTITION BY YEAR, REGION) THEN 1 ELSE 0 END
a 2021 CA 10 15 =>(10 20)/2 0.67 => 10/15 0.67 < (0.67 1.34)/2 = 1.005 THEN 0
b 2021 CA 20 15=> (10 20)/2 1.34 => 20/15 1.34 > (0.67 1.34)/2 = 1.005 THEN 1
c 2021 FL 8 8 => 8/1 8/8 8 = 8 THEN 0
d 2020 LA 25 22 => (25 19)/2 0.88 => 25/22 0.88 > (0.88 0.87)/2 = 0.875 THEN 1
e 2020 LA 19 22 => (25 19)/2 0.87 => 19/22 0.87 < (0.88 0.87)/2 = 0.875 THEN 0
f 2019 NY 35 35 35/35 35 = 35 THEN 0

So far I have calculated the columns Shop, Year, Region, Waste, Avg Waste (Year, Region), Lift. I want to calculate the one marked as Column_I_want_To_Calculate.

Briefly, it computes the average lift per Region and Year and compares Shops' Lift with the Average Lift of all shops in the same Region and Year. Then assigns the value 1 or 0 in case of a greater than statement.

So far I have tried (PostgreSQL),

SELECT  shop
        ,year
        ,region
        ,waste
        ,AVG(waste) over (partition by year, region) as "Avg Waste (Year,Region)"
        ,waste/avg(waste) over (partition by year, region) AS Lift,
        ,CASE WHEN waste/avg(waste) over (partition by year, region) > 
           (SELECT tab2.avg_lift 
            FROM (
              SELECT tab1.year, tab1.region, AVG(tab1.lift) OVER (PARTITION BY tab1.year, tab1.region) avg_lift
              FROM (
                  SELECT year, region, waste/ avg(waste) over (partition by year, region) AS lift
                  FROM main_table
                  GROUP BY year,region,waste
                  ORDER BY lift DESC
              ) tab1
              GROUP BY tab1.year, tab1.region, tab1.lift
           ) tab2
        ) THEN 1 ELSE 0 END AS "Column_I_want_To_Calculate"
FROM main_table
GROUP BY shop,
         year,
         nomos,
         waste
;

However, the code above throws the exception

postgresql error: more than one row returned by a subquery used as an expression

CodePudding user response:

This one returns the required output based on your input:

SELECT  
        region
    ,   shop
    ,   waste
    ,   round(AVG(waste) OVER w,2) AS avg_waste
    ,   round(waste / AVG(waste) OVER w,2) AS lift
    ,   CASE
            WHEN waste > AVG(waste) OVER w THEN 1
            ELSE 0
        END AS above_average
FROM    i
WINDOW  w AS (PARTITION BY year, region)
ORDER BY
    1,2,3;
  • Related