how to get individual-clinic-month that are excluded from SQL query-CodePudding

I have the following dataset:

individual  |  clinic_1   | clinic_2    | month      | address_recorded | address_code
1           |  A          | B           | 01-01-2016 | 01-02-1999       | C01
1           |  A          | A           | 01-01-2016 | 01-02-2003       | C02
1           |  A          | A           | 01-01-2016 | 01-02-2001       | C06
1           |  A          | X           | 01-01-2016 | 01-02-2000       | C03
2           |  C          | B           | 01-04-2016 | 01-02-1999       | D04
2           |  C          | A           | 01-04-2016 | 01-02-2001       | D05
2           |  C          | X           | 01-04-2016 | 01-02-2000       | D06

I would like to get:

individual  |  clinic_1   | month      | address_code
1           |  A          | 01-01-2016 | C02
2           |  C          | 01-04-2016 | D05

Criteria:

For unique set of individual-clinic_1-month with clinic_1 = clinic_2, select the most recent date in which address was recorded within clinic_1
For unique set of individual-clinic_1-month with NO instances where clinic_1 = clinic_2, select the most recent date in which address was recorded across clinics

I thought about doing:

with cte_1
as
(
select * from table
where clinic_1 = clinic_2
)
,cte_2
as
(
select row_number () over (Partition by clinic_1, individual, month order by clinic_1, individual, month, address_recorded desc) as number, *
from cte_1
)
select individual, clinic_1, month, address_code from cte_2 where number = 1

But I don't know how to get those individual-clinic_1-month for which there are no instances where clinic_1=clinic_2, any ideas?

CodePudding user response：

You can Union two select queries; one to select all records where clinic_1=clinic_2 and another one to select all records where clinic_1<>clinic_2 and clinic_1 not in the results set of the first query.

Both queries are grouped by [individual],[clinic_1], [mnth] to find all of the required data rows for each [clinic_1] - [mnth] entry.

Check the following:

with cte as
(SELECT  [individual] ,[clinic_1],[clinic_2],[mnth],max([address_recorded]) as m
  FROM [MyData] where [clinic_1]=[clinic_2]
  group by [individual],[clinic_1],[clinic_2] ,[mnth] 
),
cte2 as
(SELECT  [MyData].[individual] ,[MyData].[clinic_1],'' as [clinic_2],[MyData].[mnth],max([MyData].[address_recorded]) as m
  FROM [MyData] 
  Left Join cte on cte.individual=MyData.individual
  and cte.mnth=MyData.mnth
  where [MyData].[clinic_1]<>[MyData].[clinic_2] and cte.individual IS NULL
  group by [MyData].[individual],[MyData].[clinic_1], [MyData].[mnth] 
),
D as
(SELECT * FROM cte
UNION
SELECT * FROM cte2)
, LastQr as(
select [MyData].individual, [MyData].clinic_1,[MyData].mnth,[MyData].address_code from D
INNER JOIN [MyData]
ON D.individual=MyData.individual and D.clinic_1=MyData.clinic_1 and D.mnth=MyData.mnth and D.m=MyData.address_recorded
and D.clinic_2=MyData.clinic_2
UNION
select [MyData].individual, [MyData].clinic_1,[MyData].mnth,[MyData].address_code from D
INNER JOIN [MyData]
ON D.individual=MyData.individual and D.clinic_1=MyData.clinic_1 and D.mnth=MyData.mnth and D.m=MyData.address_recorded
and D.clinic_2='')
,
SelFirstFromDublicate as
(select LastQr.individual, LastQr.clinic_1,LastQr.mnth,LastQr.address_code, 
row_number() OVER(PARTITION BY LastQr.individual, LastQr.clinic_1,LastQr.mnth ORDER BY LastQr.individual, LastQr.clinic_1,LastQr.mnth)
as rn from LastQr) 

select * from SelFirstFromDublicate where rn=1

See the results from dbfiddle.uk.