Write an SQL query to report the patient_id, patient_name all conditions of patients who have Type I Diabetes. Type I Diabetes always starts with DIAB1 prefix.
-------------- ---------
| Column Name | Type |
-------------- ---------
| patient_id | int |
| patient_name | varchar |
| conditions | varchar |
-------------- ---------
This table contains information of the patients in the hospital.
patient_id
is the primary key for this table. conditions
contains 0 or more code separated by spaces.
So this was my solution:
SELECT *
FROM Patients
WHERE conditions LIKE 'DIAB1%' OR conditions LIKE '%DIAB1%' ;
It worked correctly for all these conditions
patient_id | patient_name | conditions |
---|---|---|
1 | Daniel | YFEV COUGH |
2 | Alice | |
3 | Bob | DIAB100 MYOP |
4 | George | ACNE DIAB100 |
except for this condition
patient_id | patient_name | conditions |
---|---|---|
1 | Daniel | SADIAB100 |
And in the solution it was shown that there is a space after 1st % which would give you the correct answer:
correct query:
SELECT *
FROM Patients
WHERE conditions LIKE 'DIAB1%' OR conditions LIKE '% DIAB1%' ;
So, can someone please explain why this query works for that particular condition (SADIAB100) and not the 1st query
CodePudding user response:
- creating a regex pattern to find the keyword 'DIAB1'
- filtering on cases where it matches the above
with main as (
select
*, REGEXP_LIKE(conditions,'DIAB1') as is_relevant_patient
from <table_name>
)
select * from main where is_relevant_patient
CodePudding user response:
WHERE conditions LIKE 'DIAB1%' OR conditions LIKE '% DIAB1%'
The problem this is trying to address is when a condition contains the keyword (DIAB1) - while you only want to match on the beginning of the keyword.
The naive approach fails, because it matches on "SADIAB100":
WHERE conditions LIKE '%DIAB1%'
So the workaround is to search for the keyword:
- either at the beginning of the whole string (ie at the beginning of the first condition) ; that's what
LIKE 'DIAB1%'
does - or after another condition, in which case it is preceded by a space, so
' DIAB1%'
Hence:
WHERE conditions LIKE 'DIAB1%' OR conditions LIKE '% DIAB1%'
A slightly neater expression is:
WHERE CONCAT(' ', conditions) LIKE '% DIAB1%'
Bottom line: if you are using a relational database, you should not be storing multiple values in a single row.
Instead of a CSV-like format, you should have a separate table to store the conditions, with each value on a separate row, allowing you to leverage the powerful set-based features that your product offers.