Home > Blockchain >  How do I calculate percentile based on frequency of occurrence in a postgres table?
How do I calculate percentile based on frequency of occurrence in a postgres table?

Time:12-17

I have a table called votes. It includes three columns: voter_id, candidate_id, and is_citizen (bool).

Voters can vote many times. Each time a voter votes for a candidate, it adds an entry into the table, populating the voter_id, the candidate_id, and is_citizen indicating whether or not the voter is a citizen. Voters may appear many times in the table, if they voted for many candidates. Candidates may appear many times in the table if many people voted for them. Each voter_candidate pairing must be unique.

Given a candidate_id, I want to figure out what percent rank that candidate is based on how many times they appear in the table. For example, let's say we have three total candidates: candidate_ids: 1, 2, and 3. Candidate 1 got 5 votes from citizens, candidate 2 got 7, and candidate 3 got 20. Note: I don't want to factor in any votes from noncitizens.

Given candidate_id 2, it should return .5, as candidate 2 was in the 50th percentile by frequency of occurrence (not by total number of votes)

I've been workshopping it, and this is as far as I've gotten, but it's still giving me errors :(

SELECT
  candidate_id,
  PERCENT_RANK() WITHIN GROUP (ORDER BY COUNT(*) DESC)
FROM votes
GROUP BY candidate_id
HAVING candidate_id = <candidate_id>;

CodePudding user response:

with candidate_rank as (
select candidate_id,
       percent_rank() over (order by count(*) desc) pct_rank
  from votes
 where is_citizen = 1
 group by candidate_id)
select candidate_id, pct_rank
  from candidate_rank
 where candidate_id = <candidate_id>;
  • Related