Partition by, dense rank-CodePudding

I have the following table describing patients visits: Each patient has a visit_id for which he sees a given physician. I am trying to extract the value of the visit_id for which saw his 3rd physician. (3rd physician and not 3rd visit)

patient	visit_id	physician
a	1	id_1
a	2	id_2
a	3	id_1
a	4	id_3
b	5	id_1
b	6	id_2
c	7	id_1
c	8	id_2
c	9	id_3

so the result would be:

patient	visit_id
a	4
c	9

Any suggestions?

CodePudding user response：

The following statement returns your result. The innermost subquery eliminates multiple visits to the same physician, then row_number() counts the visits and the outermost select gets the third physician.

select patient, visit 
  from (select patient, visit, row_number() over (partition by patient order by visit) rn
          from  (  select patient, min(visit) as visit 
                     from tab 
                 group by patient, physician
                ) t1
       ) t2
 where t2.rn = 3

Result:

patient	visit_id
a	4
c	9

See db<>fiddle

CodePudding user response：

You can group by patient and physician to remove "duplicate" physicians and use min for visit_id:

-- test data
WITH dataset (patient, visit_id, physician) AS (
    VALUES ('a', 1, 'id_1'),
        ('a', 2, 'id_2'),
        ('a', 3, 'id_1'),
        ('a', 4, 'id_3'),
        ('b', 5, 'id_1'),
        ('b', 6, 'id_2'),
        ('c', 7, 'id_1'),
        ('c', 8, 'id_2'),
        ('c', 9, 'id_3')
)

-- query
select patient, visit_id
from (
        select *,
            row_number() over (partition by patient order by visit_id) rnk
        from (
                select patient,
                    min(visit_id) visit_id,
                    physician
                from dataset
                group by patient, physician
            )
    )
where rnk = 3

Output:

patient	visit_id
a	4
c	9

Note that this query uses presto syntax (as your question has presto tag).