Home > Software engineering >  SQL request with multiple joins condition on a list of transactions (postgreSQL)
SQL request with multiple joins condition on a list of transactions (postgreSQL)

Time:10-02

I have a transactions table with a this structure of data, showing the most important fields to keep in mind. It's a list of transactions, where type=1 transactions are account deposit, while type=2 are account withdrawals.

login | type | value | date.     
1234  |   1  | 100   | 25/09/2021
1234. |   2. | 250.  | 26/09/2021
4321. |   2. | 234.  | 13/09/2021
4321. |   1. | 342.  | 14/08/2021
...
...

What I'm trying to get is the list of the accounts, where their balance during the period is bigger than > some amount, X, and no active deposits after 16/09/2021.

SELECT  t.date, 
a.login, t.account, sum(d.value) as deposits, sum(w.value)  as withdrawals,  sum(d.value) - sum(w.value) as balance
FROM b.transactions AS t 
    INNER JOIN b.accounts as a ON t.account=a.id
    INNER JOIN b.transactions AS d ON t.account = d.account and d.date=t.date and d.type=t.type
    INNER JOIN b.transactions AS w ON t.account = w.account and w.date=t.date and w.type=t.type
WHERE 
d.value - w.value > 5000 and 
d.type = '1' and 
w.type = '2' and 
d.date<='2021-09-29' and d.date>='2021-09-29' and
w.date<='2021-09-29' and w.date>='2021-09-29' and 
t.date<='2021-09-29' and t.date>='2021-09-29' and 
t.account not in 
(
SELECT  t.account
FROM b.transactions AS t 
where t.type  in (1) and 
t.date>='2021-09-16' and 
t.value>0
group by t.account

)

I'm getting some output, but it looks like seriously underestimated (at least 10x- fold)... Can not find the mistake, where should I look? Thanks in advance..

CodePudding user response:

You may be overthinking this — I don't think it is necessary to join the same table three times:

SELECT t.date, 
       a.login,
       t.account,
       sum(CASE WHEN t.type = '1'
                THEN t.value
                WHEN t.type = '2'
                THEN -t.value
           END) AS balance
FROM b.transactions AS t 
    INNER JOIN b.accounts AS a
        ON t.account = a.id
WHERE t.date BETWEEN '2021-09-29' AND '2021-09-29' 
GROUP BY t.account
HAVING sum(CASE WHEN t.type = '1'
                THEN t.value
                WHEN t.type = '2'
                THEN -t.value
           END) > 5000;

CodePudding user response:

First, to compute the running balance you can use a window function. For example:

select
  login, date, type, value,
  sum(
    case when type = 1 then value when type = 2 then -value end
  ) over(partition by login order by date) as running_balance
from t
order by login, date;

Result:

 login  date                      type  value  running_balance 
 ------ ------------------------- ----- ------ --------------- 
 1234   2021-09-25T00:00:00.000Z  1     100    100             
 1234   2021-09-26T00:00:00.000Z  2     250    -150            
 4321   2021-09-13T00:00:00.000Z  2     234    -234            
 4321   2021-09-14T00:00:00.000Z  1     342    108             

See running example at DB Fiddle - Simple Running Balance.

Then you can use the logic above to compute more complex values so you can use them to filter data, as in:

select distinct login
from (
  select
    login, date, type, value,
    sum(
      case when type = 1 then value when type = 2 then -value end
    ) over(partition by login order by date) as running_balance,
    sum(
      case when type = 1 and date > date '2021-09-16' then 1 else 0 end
    ) over(partition by login order by date) as extra_deposits
  from t
) x
where running_balance >= 100 and extra_deposits = 0

Result:

login
-----
4321

See running example at DB Fiddle - Full Query.

  • Related