Home > Mobile >  How to select column values of a group based on another column value
How to select column values of a group based on another column value

Time:11-03

I got a table like this one

CaseID NAME ADDRESS ZIP ROLE
1 Joe address_1 zip_1 role_1
1 John address_2 zip_2 role_1
1 Jane address_3 zip_3 role_1
1 Bill address_4 zip_4 role_1
1 Bill address_5 zip_5 role_2
2 Bob address_6 zip_6 role_1
2 Shawn address_7 zip_7 role_1

I would like to group by the name and CaseID, making a list of the roles in each group. That part is easy. The tricky part is that as you can see for Bill, we have two different addresses and zip. I tried to keep only one with a Max or Min aggregation function inside the group, but there might be inconsistency in the resulting address, keeping zip of one row and the address of the other raw. How can I fetch the zip and address of the same row (which ever) in a group and listing all the roles. I'd like a result like

CaseID NAME ADDRESS ZIP ROLE
1 Joe address_1 zip_1 role_1
1 John address_2 zip_2 role_1
1 Jane address_3 zip_3 role_1
1 Bill address_4 zip_4 role_1, role_2
2 Bob address_6 zip_6 role_1
2 Shawn address_7 zip_7 role_1

or

CaseID NAME ADDRESS ZIP ROLE
1 Joe address_1 zip_1 role_1
1 John address_2 zip_2 role_1
1 Jane address_3 zip_3 role_1
1 Bill address_5 zip_5 role_1, role_2
2 Bob address_6 zip_6 role_1
2 Shawn address_7 zip_7 role_1

CodePudding user response:

In Oracle that's a good use case for the keep syntax to aggregate functions:

select caseid, name, 
    min(address) keep(dense_rank first order by address) address,
    min(zip)     keep(dense_rank first order by address) zip,
    listagg(role, ', ') within group (order by role) role
from mytable
group by caseid, name

The order by clause of keep lets you consistently "pick" a row in the group - that whose address comes first when sorted against others in the group ; we can repeat the expression across columns to get the zip of the same row.

Note that the query would benefit a more stable ordering criteria (such as order by address_id maybe?). Else, from which row should the zip be kept when there are duplicate addresses?

Demo on DB Fiddle

CodePudding user response:

Here's one option; read comments within code.

Sample data:

SQL> with test (caseid, name, address, zip, role) as
  2    (select 1, 'Joe' , 'address_1', 'zip_1', 'role_1' from dual union all
  3     select 1, 'John', 'address_2', 'zip_2', 'role_1' from dual union all
  4     select 1, 'Bill', 'address_4', 'zip_4', 'role_1' from dual union all
  5     select 1, 'Bill', 'address_5', 'zip_5', 'role_2' from dual union all
  6     select 2, 'Bob' , 'address_6', 'zip_6', 'role_1' from dual
  7    ),

Query begins here:

  8  temp as
  9    -- RN will later be used in correlated subqueries to fetch
 10    -- ADDRESS and ZIP values that belong to the same row
 11    (select caseid, name, address, zip, role,
 12       row_number() over (partition by caseid, name order by address) rn
 13     from test
 14    )
 15  -- finally, fetch what you can from the TEMP CTE; use correlated subqueries
 16  -- for the rest and aggregate ROLE values
 17  select a.caseid,
 18    a.name,
 19    --
 20    (select b.address
 21     from temp b
 22     where b.caseid = a.caseid
 23       and b.name = a.name
 24       and b.rn = 1
 25    ) address,
 26    --
 27    (select b.zip
 28     from temp b
 29     where b.caseid = a.caseid
 30       and b.name = a.name
 31       and b.rn = 1
 32    ) zip,
 33    --
 34    listagg(role, ', ') within group (order by role) role
 35  from temp a
 36  group by a.caseid, a.name
 37  /

Result:

    CASEID NAME ADDRESS   ZIP   ROLE
---------- ---- --------- ----- ---------------
         1 Joe  address_1 zip_1 role_1
         1 Bill address_4 zip_4 role_1, role_2
         1 John address_2 zip_2 role_1
         2 Bob  address_6 zip_6 role_1

SQL>

CodePudding user response:

S.t. like

select 
   caseid, name, 
   min(address) keep (dense_rank first order by address, zip), 
   min(zip) keep (dense_rank first order by address, zip), 
   listagg(role)
from <crappy table> 
group by caseid, name
  • Related