Home > Software engineering >  Postgres SQL Grouping By column value inside of Case statment
Postgres SQL Grouping By column value inside of Case statment

Time:11-17

t1:

AccountName Date Amount
A1 2022-06-30 2
A2 2022-06-30 1
A3 2022-06-30
A1 2022-07-31 4
A2 2022-07-31 5
A3 2022-07-31

I want to do a transformation on this table such that I fill in the "Amount" column of all rows with account name 'A3' and lets say that for each month group the 'A3' -"Amount" value is equal to (the 'A1' 'Amount' column the 'A2' 'Amount' column), so the expected result table is:

AccountName Date Amount
A1 2022-06-30 2
A2 2022-06-30 1
A3 2022-06-30 3
A1 2022-07-31 4
A2 2022-07-31 5
A3 2022-07-31 9

The only way I can think of solving this is using multiple CTE's to separate each 'Date' value and using a case statements with multiple selects to get these values the using a union at the end:

with d1 as (
select * 
from t1 
WHERE Date = '2022-06-30'),

c1 as (
SELECT 
"AccountName", 
"Date",
Case WHEN "AccountName" = 'A3' 
THEN (SELECT "Amount" FROM t1 WHERE "AccountName" = 'A1')   
(SELECT "Amount" FROM t1 WHERE AccountName = 'A2')
ELSE "Amount" END AS "Amount"
FROM d1),

d2 as (
select * 
from t1 
WHERE Date = '2022-07-31'),

c2 as (
SELECT 
"AccountName", 
"Date",
Case WHEN "AccountName" = 'A3' 
THEN (SELECT "Amount" FROM t1 WHERE "AccountName" = 'A1')   
(SELECT "Amount" FROM t1 WHERE AccountName = 'A2')
ELSE "Amount" END AS "Amount"
FROM d2) 

SELECT * FROM c1
Union
SELECT * FROM c2

Is a better way of doing this? As i have multiple row calculations based on other row values and on top of that multiple Distinct 'Date' values (24) for which i would have to create separate CTE's for. This would result in an extremely long sql script for me. Is there maybe a way to group by every 'Date' value in the date column to avoid making multiple CTE's for each 'Date' Value? Additionally is there a better way to construct the sums values for the 'Amount' values for all 'A3' rows rather that using multiple selects in side each 'CASE WHEN'? Thanks!

CodePudding user response:

You can use a window function for this - no need to hardcode the dates:

SELECT 
  "AccountName",
  "Date",
  (CASE WHEN "AccountName" = 'A3'
    THEN SUM("Amount") FILTER (WHERE "AccountName" IN ('A1', 'A2')) OVER (PARTITION BY "Date")
    ELSE "Amount"
  END) AS "Amount"
FROM t1

An equivalent query using subqueries would be

SELECT 
  "AccountName",
  "Date",
  (CASE WHEN "AccountName" = 'A3'
    THEN (
      SELECT SUM("Amount")
      FROM t1
      WHERE "Date" = outer."Date"
        AND "AccountName" IN ('A1', 'A2')
    )
    ELSE "Amount"
  END) AS "Amount"
FROM t1 outer

or, assuming that the amounts of A1 and A2 are never NULL:

SELECT 
  "AccountName",
  "Date",
  (CASE WHEN "AccountName" = 'A3'
    THEN (
      SELECT "Amount"
      FROM t1
      WHERE "Date" = outer."Date"
        AND "AccountName" = 'A1'
    )   (
      SELECT "Amount"
      FROM t1
      WHERE "Date" = outer."Date"
        AND "AccountName" = 'A2'
    )
    ELSE "Amount"
  END) AS "Amount"
FROM t1 outer

CodePudding user response:

This is how I would do it I think. The amount of rows without one becomes the sum of amounts in the same calendar month. I already combined month and year into a string, but it's probably way more efficient to compare the year and month value seperately but I like how this looks in joins.

update t1 t1update
set amount = (
 select sum(amount) from t1 
 where 
   extract(year from t1update.date date) || '-' || extract(month from t1update.date = 
   extract(year from t1.date date) || '-' || extract(month from t1.date)
)
where t1update.amount = '' or  t1update.amount is null
  • Related