Home > database >  SQL Inner join with sum and null value
SQL Inner join with sum and null value

Time:07-29

The table below is an extract of a larger set of data enter image description here

In my scenario Column 2 is null when is the "parent" record (Column 1 = AB1 and Column 2 is NULL) and as you can see the following 2 "child" records under Column 2 have AB1 as identifier which matches the AB1 from Column 1, what I want to do is to sum the values on Column 3 when Column 2 has the same identifier (AB1), up to this point the sum = 29 (for this case I can do a SUM and group by AB1). My issue arises when I need to add the value of 10 in Column 3 when column 2 is NULL and Column 1 is AB1 (parent identifier). The common identifier is AB1 but for the parent record the identifier is in Column 1 instead of Column 2. I need a SQL that return a total sum of 39.

Edit:

Thanks for the prompt responses, my apologies I think my question was not clear enough. I am using MS SQL Server Management Studio The goal for the query to sum the amounts on Column 3 by grouping by the records on Column 2 that have the same identifier (AB1) and then find that same identifier on Column 1 (AB1) and also add that value to the total sum. The query below is doing the group by Column 2 correctly because for example if I have 10 records with the identifier AB1 it is returning one row with the sum of the amounts on Column 3, the issue is that I also need to add to that sum when the identifier AB1 is also in Column 1.

select t1.Column1 , round(sum (t1.Column3),2) as Total from table t1, table t2 where
and t1.Column2 = t2. Column1 group by t1. Column2

Basically this table stores transactions and the initial transaction “parent” is in Column 1 (AB1) and all other transactions “children” linked to the parent transaction have that identifier (AB1) but in Column 2. Column 1 is a unique identifier and does not repeat and then is the “parent” transaction it is NULL on Column 2 but that identifier (AB1) can be repeated multiple times in Column 2 depending all the “children” transactions that are linked to the “parent”.

CodePudding user response:

select sum(Column3)
from TheTable
where 'AB1' in (Column1, Column2);

will sum the value of Column3 for the parent (Column1 = 'AB1') and the children (Column2 = 'AB1').

If the parent-child hierarchy has more than two levels, and you want to sum Column3 for grandchildren, grand-grandchildren, and so on, you can use a hierarchical query (also known as a recursive query). The exact syntax depends on your database, this is for PostgreSQL:

with recursive Hier(Column1, Column2, Column3) as
(
  select Column1, Column2, Column3
  from TheTable
  where Column1 = 'AB1'
union all
  select t.Column1, t.Column2, t.Column3
  from TheTable t
  join Hier h on t.Column2 = h.Column1
)
select sum(Column3)
from Hier;

CodePudding user response:

Oracle
The WITH clause is here just to generate sample data and, as such, it is not the part of the answer.
I don't know what is the expected result, but the Totals could be calculated using Union All (without Inner Join)

WITH
    tbl AS
        (
            Select 'AB1' "COL_1", Null  "COL_2", 10 "COL_3" From Dual Union All
            Select 'CD2' "COL_1", 'AB1' "COL_2", 15 "COL_3" From Dual Union All
            Select 'EF3' "COL_1", 'AB1' "COL_2", 14 "COL_3" From Dual 
        )
SELECT
    ID, Sum(TOTAL) "TOTAL"
FROM
    (
        SELECT COL_1 "ID", Sum(COL_3)  "TOTAL" FROM tbl GROUP BY COL_1 UNION ALL
        SELECT COL_2 "ID", Sum(COL_3)  "TOTAL" FROM tbl GROUP BY COL_2
    )
WHERE ID Is Not Null
GROUP BY ID
ORDER BY ID
--
--  R e s u l t
--  
ID       TOTAL
--- ----------
AB1         39 
CD2         15 
EF3         14

It is a Sum() Group By aggregation, but the same result gives Sum() analytic function with DISTINCT keyword.

SELECT DISTINCT
    ID, Sum(TOTAL) OVER(PARTITION BY ID ORDER BY ID) "TOTAL"
FROM
    (
        SELECT COL_1 "ID", Sum(COL_3)  "TOTAL" FROM tbl GROUP BY COL_1 UNION ALL
        SELECT COL_2 "ID", Sum(COL_3)  "TOTAL" FROM tbl GROUP BY COL_2
    )
WHERE ID Is Not Null
--
--  R e s u l t
--  
ID       TOTAL
--- ----------
AB1         39 
CD2         15 
EF3         14

And if you need Inner Join then the answer is below. Note that there is only ID which actually has children. That is because of the Inner Join. Regards...

SELECT 
    t1.COL_1 "ID", 
    Max(t1.COL_3)   Sum(t2.COL_3) "TOTAL"
FROM
    tbl t1
INNER JOIN
    tbl t2 ON (t2.COL_2 = t1.COL_1)
GROUP BY t1.COL_1
ORDER BY t1.COL_1
--
--  R e s u l t
--  
ID       TOTAL
--- ----------
AB1         39 

CodePudding user response:

You can spilt the two sets of data then union them together. From there it will be a simple sum group by.

To do this we simply saying take Column1 as the Parent if Column2 IS null take Column2 as the Parent if Column2 IS not null

Select Column1 as Parent, Column3 
from TheTable
where Column2 IS null
Union
Select Column2 as Parent, Column3
from TheTable
where Column2 IS not null

From there you can use this as a cte

WITH data AS
(
Select Column1 as Parent, Column3 
from TheTable
where Column2 IS  null
Union
Select Column2 as Parent, Column3
from TheTable
where Column2 IS not null)

Select Parent,  Sum(Column3)
from Data 
Group by Parent

Result will be

Parent  SumColumn3
AB1     39
  • Related