Home > Software design >  count, filter, over together
count, filter, over together

Time:12-17

Can you please tell me how can I get following result?

Table I have is:

create table as tbl 
(
event_id integer, 
payor varchar
)

payor and event_id both have duplicates. an event may contain more than 2 payor as well.

What I need is:
event_id column,
payor column,
a column that shows count of those events that are funded by more than 1 funder.

I want to keep the all the events in the result. So first the query should do count of event_id by partition by event_id then look at that partitioned set and see whether the payor is different in that set for event_id and then give the result as count those events that have more than 1 payor.

below first 2 columns are part of my table and 3rd columns is what i want For e.g.

event_id payor count how count is calculated
2323 a 2 count is 2 because 2323 has 2 unique payors a and b
2323 b 2 count is 2 because 2323 has 2 unique payors a and b
5432 b 1 count is 1 because 54332 has 1 unique payor b
3423 c 1 count is 1 because 3423 has 1 unique payor c
211 y 1 count is 1 because 211 has 1 unique payors y
211 y 1 count is 1 because 211 has 1 unique payors y
600 t 2 count is 2 because 600 has 2 unique payors t, o
600 0 2 count is 2 because 600 has 2 unique payors t, o
600 t 2 count is 2 because 600 has 2 unique payors t, o

I have something like this below, but I do not know how to write something to look into each partition set and the payor. Following does not work

select event_id, payor,
(count(event_id) over(partition by event_id order by event_id) filter (where (count(payor) over(partition by event_id order by event_id)) >2))
from tbl

any help is appreciated

CodePudding user response:

I think this could be what you are looking for:

SELECT tbl.event_id, tbl.payor, tbl_count.payor_count
FROM tbl
JOIN (
    SELECT DISTINCT ON (event_id, payor) event_id, payor, COUNT(*) payor_count
    FROM tbl
    GROUP BY event_id, payor
) tbl_count ON tbl_count.event_id = tbl.event_id AND tbl_count.payor = tbl.payor
WHERE tbl_count.payor_count > 1
  1. grouped by event_id and payor to count their unique pair quantity
  2. joined to same table so that events would not be lost, grouped
  3. added where condition to filter by quantity

CodePudding user response:

hi @Lemon and @Working pickle

eventually i was able to get what i need by using following query. i used dense_rank and paritioned that by event_id ordered by payor in cte then i got max of dense_rank for each event_id.

thanks for your help on this.

with cte as 
(
select 
event_id, 
payor,
dense_rank() over(partition by event_id order by payor) as drnk
FROM tbl
order by  1)
    
select event_id, payor,
    max(drnk) over(partition by event_id order by event_id)
FROM cte
  • Related