The only columns I want are
COUNT(DISTINCT PATIENT_ID)
YEAR
IN or Out
How do I define "Name" which is columns FIRST_NAME LAST_NAME
I have a bunch of names that I need to compare to a data set on snowflake and separate them as IN if that match or OUT if they dont
SELECT
COUNT(DISTINCT PATIENT_ID),
FIRST_NAME,
LAST_NAME,
CONCAT(FIRST_NAME,' ', LAST_NAME) as Name,
(CASE WHEN (
(Name LIKE '%First Name1%')
OR (Name LIKE '%First Name2%')
OR (Name LIKE '%First Name3%')
OR (Name LIKE '%First Name4%')
)
THEN 'IN'
ELSE 'OUT' END) as "IN or Out",
YEAR(SERVICE_DATE) Year
from "dataset" dx
WHERE
(dx.primary_specialty_desc like ('%GASTRO%')
AND dx.diagnosis_code in ('K7581' , 'K740' , 'K760' , 'K746')
AND dx.SERVICE_DATE >= '2017-01-01')
or ( dx.primary_specialty_desc like('%HEP%')
AND dx.diagnosis_code in ('K7581' , 'K740' , 'K760' , 'K746')
AND dx.SERVICE_DATE >= '2017-01-01')
GROUP BY
Year,
FIRST_NAME,
LAST_NAME,
SERVICE_DATE,
(CASE WHEN (
(Name LIKE '%First Name1%')
OR (Name LIKE '%First Name2%')
OR (Name LIKE '%First Name3%')
OR (Name LIKE '%First Name4%')
)
THEN 'IN'
ELSE 'OUT' END)
CodePudding user response:
It will be better to use subquery in this case like
SELECT COUNT(DISTINCT PATIENT_ID),
FIRST_NAME,
LAST_NAME,
(CASE WHEN (
(Name LIKE '%First Name1%')
OR (Name LIKE '%First Name2%')
OR (Name LIKE '%First Name3%')
OR (Name LIKE '%First Name4%')
THEN 'IN'
ELSE 'OUT' END as IN_OR_OUT)
), Year FROM (
SELECT
PATIENT_ID,
FIRST_NAME,
LAST_NAME,
CONCAT(FIRST_NAME,' ', LAST_NAME) as Name,
YEAR(SERVICE_DATE) Year
from "dataset" dx
WHERE
(dx.primary_specialty_desc like ('%GASTRO%')
AND dx.diagnosis_code in ('K7581' , 'K740' , 'K760' , 'K746')
AND dx.SERVICE_DATE >= '2017-01-01')
or ( dx.primary_specialty_desc like('%HEP%')
AND dx.diagnosis_code in ('K7581' , 'K740' , 'K760' , 'K746')
AND dx.SERVICE_DATE >= '2017-01-01')
) AS t
GROUP BY
FIRST_NAME,
LAST_NAME,
Year,
IN_OR_OUT
Just check the typo. You can check the performance with EXPLAIN plan and decide
CodePudding user response:
You have a kind of transaction table where the patient_id is stored along with the first and last names. This looks redundant on first gance, but we know that names can change, when people get married for instance, so this information may be needed.
Now, there may be patient #1, who is recorded as Ann Smith four times and later as Ann Miller five times. One name may be "IN", one may be "OUT". Or both are "IN" or both are "OUT". You just want to consider up to two entries (#1/'IN' and #1/'OUT') per year. So getting these distinct rows should be the first step in my opinion.
You can then use this data set to do your aggregation and count.
select service_year, in_or_out, count(*)
from
(
select distinct
year(service_date) as service_year,
patient_id,
case when concat(first_name,' ', last_name) like '%First Name1%'
or concat(first_name,' ', last_name) like '%First Name2%'
or concat(first_name,' ', last_name) like '%First Name3%'
or concat(first_name,' ', last_name) like '%First Name4%'
then 'IN' else 'OUT'
end as in_or_out
from "dataset"
where diagnosis_code in ('K7581' , 'K740' , 'K760' , 'K746')
and service_date >= date '2017-01-01'
and
(
primary_specialty_desc like '%GASTRO%'
or
primary_specialty_desc like '%HEP%'
)
) distinct_data_set
group by service_year, in_or_out
order by service_year, in_or_out;