I am trying to grab a record that meets multiple conditions. See below.
DATA:
PERSON_ID | DOCTOR | DATE_OF_SERVICE |
---|---|---|
1234 | Dr. Smith | 2022-01-01 |
1234 | 2022-01-01 | |
1234 | Dr. Jane | 2022-03-01 |
1234 | 2022-06-01 |
DESIRED OUTPUT:
ID | DOCTOR | DATE_OF_SERVICE |
---|---|---|
1234 | Dr. Smith | 2022-01-01 |
1234 | Dr. Jane | 2022-03-01 |
1234 | 2022-06-01 |
Basically, if a person_id has the same date_of_service but one record has a doctor populated and the other doesn't, take the record where the doctor is not null.
BUT - if there is only one record where there is no doctor listed, then it is okay to keep.
Is this doable? Any help would be greatly helpful
CodePudding user response:
This is a good use for the qualify
clause in Snowflake:
create or replace table T1 as
select
COLUMN1::string as "PERSON_ID",
COLUMN2::string as "DOCTOR",
COLUMN3::string as "DATE_OF_SERVICE"
from (values
('1234','Dr. Smith','2022-01-01'),
('1234',null,'2022-01-01'),
('1234','Dr. Jane','2022-03-01'),
('1234',null,'2022-06-01')
);
select * from T1
qualify (count(*)) over (partition by person_id, date_of_service) = 1
or doctor is not null;
PERSON_ID | DOCTOR | DATE_OF_SERVICE |
---|---|---|
1234 | Dr. Smith | 2022-01-01 |
1234 | Dr. Jane | 2022-03-01 |
1234 | null | 2022-06-01 |
The qualify clause works like a where clause for window functions. The count(*) over...
part "qualifies" rows where there's only one row for a specific person_id and date_of_service. The doctor is not null
part is self-evident and allows those rows to qualify.
CodePudding user response:
Use NOT EXISTS
:
SELECT t1.*
FROM tablename t1
WHERE t1.doctor IS NOT NULL
OR NOT EXISTS (
SELECT *
FROM tablename t2
WHERE t2.person_id = t1.person_id
AND t2.date_of_service = t1.date_of_service
AND t2.doctor IS NOT NULL
);
See the demo.