I have one tables like below with some sample data:
Main table
center | document | documentitem | profit |
---|---|---|---|
NL100 | 100 | 00010 | 5 |
NL101 | 100 | 00010 | 10 |
NL100 | 100 | 00010 | 0 |
NL102 | 100 | 00010 | 5 |
NL200 | 200 | 00010 | 1 |
NL300 | 300 | 00010 | 6 |
NL300 | 300 | 00010 | 7 |
NL300 | 300 | 00010 | 4 |
I need to create a new column based on the following logic: The new column should be called scenario and is made up out of 3 scenario's.
- scenario 1: The document and documentitem columns keep only one record for center. If this is the case the value for Scenario should be set equal to the value of Center (e.g. for document documentitem = 200 00010 where there is only one value for Center.) In this scenario NL200 should be returned as value.
- scenario 2: The document and documentitem columns keep multiple values for Center, but they are all the same. (E.g. for Document Documentitem = 300 00010 where there are three values for center but they are all the same.) In this scenario only one of the values should be kept (doesn't matter which one, can use any aggregate function really)
- Scenario 3: this is the most tricky one. The document and documentitem column keep multiple values for Center, but they are not the same. The value that should be kept in the final table is based on the value for the Spend column. The profit should equal 0. So NL100 should be kept as final value.
The output table should look like this:
center | document | documentitem | profit |
---|---|---|---|
NL100 | 100 | 00010 | 0 |
NL200 | 200 | 00010 | 1 |
NL300 | 300 | 00010 | 4 |
Hope someone can guide me in the right direction. Thanks!
CodePudding user response:
Try the following:
Select T.center, T.document, T.documentitem, T.profit,
Case
When D.mn=D.mx And D.cn=1 Then 'Scenario 1'
When D.mn=D.mx And D.cn>1 Then 'Scenario 2'
When D.mn<>D.mx And D.mnp=0 Then 'Scenario 3'
When D.mn<> D.mx And D.mnp<>0 Then 'Undefined Scenario'
End As Scenario
From
mytable T Join
(
Select document, documentitem, COUNT(*) cn, MIN(center) mn, MAX(center) mx, MIN(profit) mnp
From mytable
Group By document, documentitem
) D
On T.document=D.document And
T.documentitem=D.documentitem And
T.profit=D.mnp
The idea is to find the MIN, MAX and COUNT
for the 'center' value, and MIN(profit)
grouped by 'document, documentitem'.
Scenario 1 occurs when MIN = MAX and COUNT = 1
.
Scenario 2 occurs when MIN = MAX and COUNT > 1
. (multiple center entries but with the same value).
Scenario 3 occurs when MIN <> MAX and MIN(profit)= 0
.
Undefined Scenario occurs when MIN <> MAX and MIN(profit) <> 0
.
See demo from db<>fiddle.
CodePudding user response:
Scenario 1. Use the COUNT
function with GROUP BY
and HAVING
clauses to get the count of center
values for the particular document
and documentitem
values and to filter rows by value of count equal to 1.
SELECT t.*
FROM t
JOIN (
SELECT
document,
documentitem
FROM t
GROUP BY document, documentitem
HAVING COUNT(center) = 1
) t2 ON t2.document = t.document
AND t2.documentitem = t.documentitem
Scenario 2. The difference with the scenario 1 is in conditions in HAVING
clause. As it's said there should be multiple rows (e.g. COUNT(center) > 1
) with the same center
value (e.g. COUNT(DISTINCT center) = 1
. Also the outer query uses the MIN
aggragation function to get one of the values of profit
and the required GROUP BY
clause.
SELECT
t.center,
t.document,
t.documentitem,
MIN(t.profit)
FROM t
JOIN (
SELECT
document,
documentitem
FROM t
GROUP BY document, documentitem
HAVING COUNT(DISTINCT center) = 1 AND COUNT(center) > 1
) t2 ON t2.document = t.document
AND t2.documentitem = t.documentitem
GROUP BY t.center, t.document, t.documentitem
Scanario 3. The difference with scenario 1 is in HAVING
clause condition. As it's said there should be multiple rows for the same document
and documentitem
values but with different center
values (e.g. COUNT(DISTINCT center) > 1
and the row with profit
equal to 0 should be returned (e.g. t.profit = 0
in ON
clause conditions of JOIN
)
SELECT
t.center,
t.document,
t.documentitem,
t.profit
FROM t
JOIN (
SELECT
document,
documentitem
FROM t
GROUP BY document, documentitem
HAVING COUNT(DISTINCT center) > 1
) t2 ON t2.document = t.document
AND t2.documentitem = t.documentitem
AND t.profit = 0
Finally use UNION ALL
operator to get all of the queries results together
SELECT t.*, 'scenario 1' scenario
FROM t
JOIN (
SELECT
document,
documentitem
FROM t
GROUP BY document, documentitem
HAVING COUNT(center) = 1
) t2 ON t2.document = t.document
AND t2.documentitem = t.documentitem
UNION ALL
SELECT
t.center,
t.document,
t.documentitem,
MIN(t.profit),
'scenario 2'
FROM t
JOIN (
SELECT
document,
documentitem
FROM t
GROUP BY document, documentitem
HAVING COUNT(DISTINCT center) = 1 AND COUNT(center) > 1
) t2 ON t2.document = t.document
AND t2.documentitem = t.documentitem
GROUP BY t.center, t.document, t.documentitem
UNION ALL
SELECT
t.center,
t.document,
t.documentitem,
t.profit,
'scenario 3'
FROM t
JOIN (
SELECT
document,
documentitem
FROM t
GROUP BY document, documentitem
HAVING COUNT(DISTINCT center) > 1
) t2 ON t2.document = t.document
AND t2.documentitem = t.documentitem
AND t.profit = 0
Output
center | document | documentitem | profit | scenario |
---|---|---|---|---|
NL200 | 200 | 00010 | 1 | scenario 1 |
NL300 | 300 | 00010 | 4 | scenario 2 |
NL100 | 100 | 00010 | 0 | scenario 3 |
db<>fiddle here