I created the following snippet to let you know the issue I have
The query is used to get all the rows where I have the most recent day
, study_id
and status
of a candidate from that table.
As you see from the example I'm getting only 2 rows but I should have more also including the rows having site_id === null
.
I should have included also the row
('2020-01-02 00:00:00 00', 'TBX1', NULL, 'INCOMPLETE', 120, 100)
The final result I'm expecting
study_id | site_id | status | day | day | study_id | site_id | status | total | current |
---|---|---|---|---|---|---|---|---|---|
TBX1 | SWE-1 | INCOMPLETE | 2020-01-02 00:00:00 00 | 2020-01-02 00:00:00 00 | TBX1 | SWE-1 | INCOMPLETE | 37 | 31 |
TBX1 | SWE-2 | INCOMPLETE | 2020-01-02 00:00:00 00 | 2020-01-02 00:00:00 00 | TBX1 | SWE-2 | INCOMPLETE | 43 | 41 |
TBX1 | null | INCOMPLETE | 2020-01-02 00:00:00 00 | 2020-01-02 00:00:00 00 | TBX1 | null | INCOMPLETE | 120 | 100 |
So, therefore, is included the row also wth site_id null
CodePudding user response:
Is it "take last from each group" problem ?
select distinct on (study_id, site_id,status)
candidates.* from candidates
WHERE
"study_id" in('TBX1')
AND "status" in('INCOMPLETE')
order by study_id, site_id, status , day desc
CodePudding user response:
You can't directly compare NULL. NULL = NULL check using = operator doesn't yield TRUE or FALSE but NULL.
Instead you can check if they are both NULL. ie:
SELECT
--sum("current")
*
FROM (
SELECT
"study_id",
"site_id",
"status",
max("day") AS "day"
FROM
"candidates"
WHERE
"study_id" in('TBX1')
AND "status" in('INCOMPLETE')
GROUP BY
"study_id",
"site_id",
"status") AS "latest"
INNER JOIN "candidates" ON "latest"."day" = "candidates"."day"
AND "latest"."study_id" = "candidates"."study_id"
AND ("latest"."site_id" = "candidates"."site_id" or
("latest"."site_id" is null and "candidates"."site_id" is null))
AND "latest"."status" = "candidates"."status";