Home > Enterprise >  check for same quantity after applying GROUP BY (tricky scenario)
check for same quantity after applying GROUP BY (tricky scenario)

Time:02-23

I am trying to check whether the output for a specific account number and for the different organizations is the same and show the discrepancies in Oracle

Imagine i have a table called myTable and it contains 2 fields: accountnumber and org_id

There exist only 3 org_id: 81, 281 and 404

image

Dataset:

|accountnumber|org_id|
|4435354      |81    |
|4435354      |281   |
|4435354      |404   |
|3333333      |81    |
|3333333      |281   |
|4444444      |81    |
|4444444      |81    |
|4444444      |281   |
|4444444      |404   |

I want to find all the different accounts that dont have the exact amount for each org_id

For example:

account 4435354 has 1 row for org_id 81, 1 row for org_281 and 1 row for org_id 404, so in this case it is correct

account 3333333 has 1 row for org_id 81, 1 row for org_281 and none for 404 so there exist a discrepancy

account 4444444 has 2 rows for org_id 81, 1 row for org_281 and 1 row for org_id 404, so there exist a discrepancy

DESIRED OUTPUT:

ACCOUNTNUMBER| ORG_ID|COUNT(*)
|3333333      |  81   |1
|3333333      |  281  |1
|3333333      |  404  |0
|4444444      |  81   |2
|4444444      |  281  |1
|4444444      |  404  |1

How can i achieve something like that in Oracle?

CodePudding user response:

You already asked almost the same question here: Find the discrepancies in sql (tricky situation) (And you still haven't chosen a right answer there)

Just add count(*) or count(distinct org_id): DBFiddle

select
  accountnumber
  ,count(org_id) cnt
  ,count(distinct org_id) cnt_distinct
  ,listagg(org_id,',')within group(order by org_id) orgs
  ,listagg(distinct org_id,',')within group(order by org_id) orgs_distinct
from mytable
group by accountnumber;

Results:

ACCOUNTNUMBER        CNT CNT_DISTINCT ORGS          ORGS_DISTINCT
------------- ---------- ------------ ------------- --------------
      3333333          2            2 81,281        81,281
      4435354          3            3 81,281,404    81,281,404
      4444444          4            3 81,81,281,404 81,281,404

And if you really need all rows separately: DBFiddle

select
  accountnumber
  ,org_id
  ,count(org_id)          over(partition by accountnumber) cnt
  ,count(distinct org_id) over(partition by accountnumber) cnt_distinct
  ,listagg(org_id,',')within group(order by org_id)
     over(partition by accountnumber)
       as orgs
  ,listagg(distinct org_id,',')within group(order by org_id) 
     over(partition by accountnumber)
       as orgs_distinct
from mytable;

Results:

ACCOUNTNUMBER     ORG_ID        CNT CNT_DISTINCT ORGS          ORGS_DISTINCT
------------- ---------- ---------- ------------ ------------- --------------
      3333333         81          2            2 81,281        81,281
      3333333        281          2            2 81,281        81,281
      4435354         81          3            3 81,281,404    81,281,404
      4435354        281          3            3 81,281,404    81,281,404
      4435354        404          3            3 81,281,404    81,281,404
      4444444         81          4            3 81,81,281,404 81,281,404
      4444444         81          4            3 81,81,281,404 81,281,404
      4444444        281          4            3 81,81,281,404 81,281,404
      4444444        404          4            3 81,81,281,404 81,281,404

Listagg documentation:

For a specified measure, LISTAGG orders data within each group specified in the ORDER BY clause and then concatenates the values of the measure column.

Update: you can just add a predicate to filter rows you don't want to output: DBFiddle 3

select *
from (
    select
      accountnumber
      ,org_id
      ,count(org_id)          over(partition by accountnumber) cnt
      ,count(distinct org_id) over(partition by accountnumber) cnt_distinct
      ,listagg(org_id,',')within group(order by org_id)
         over(partition by accountnumber)
           as orgs
      ,listagg(distinct org_id,',')within group(order by org_id) 
         over(partition by accountnumber)
           as orgs_distinct
    from mytable) v
where cnt<>3;

Results:

ACCOUNTNUMBER     ORG_ID        CNT CNT_DISTINCT ORGS          ORGS_DISTINCT
------------- ---------- ---------- ------------ ------------- --------------
      3333333         81          2            2 81,281        81,281
      3333333        281          2            2 81,281        81,281
      4444444         81          4            3 81,81,281,404 81,281,404
      4444444         81          4            3 81,81,281,404 81,281,404
      4444444        281          4            3 81,81,281,404 81,281,404
      4444444        404          4            3 81,81,281,404 81,281,404

CodePudding user response:

You can use:

WITH expected_orgs (org_id) AS (
  SELECT  81 FROM DUAL UNION ALL
  SELECT 281 FROM DUAL UNION ALL
  SELECT 404 FROM DUAL
)
SELECT accountnumber,
       org_id,
       cnt
FROM   (
  SELECT t.accountnumber,
         e.org_id,
         COUNT(t.org_id) AS cnt,
         MIN(COUNT(t.org_id)) OVER (PARTITION BY t.accountnumber) AS min_cnt,
         MAX(COUNT(t.org_id)) OVER (PARTITION BY t.accountnumber) AS max_cnt
  FROM   expected_orgs e
         LEFT OUTER JOIN table_name t
         PARTITION BY (t.accountnumber)
         ON (e.org_id = t.org_id)
  GROUP BY
         t.accountnumber,
         e.org_id
)
WHERE  min_cnt < 1
OR     max_cnt > 1;

Which, for the sample data:

CREATE TABLE table_name (accountnumber, org_id) AS
SELECT 4435354,  81 FROM DUAL UNION ALL
SELECT 4435354, 281 FROM DUAL UNION ALL
SELECT 4435354, 404 FROM DUAL UNION ALL
SELECT 3333333,  81 FROM DUAL UNION ALL
SELECT 3333333, 281 FROM DUAL UNION ALL
SELECT 4444444,  81 FROM DUAL UNION ALL
SELECT 4444444,  81 FROM DUAL UNION ALL
SELECT 4444444, 281 FROM DUAL UNION ALL
SELECT 4444444, 404 FROM DUAL;

Outputs:

ACCOUNTNUMBER ORG_ID CNT
3333333 81 1
3333333 281 1
3333333 404 0
4444444 81 2
4444444 281 1
4444444 404 1

db<>fiddle here

  • Related