group by inside group by?

Let's say I have a table with the columns emp_no, year_that_got_paid, gender and amount_paid. I want to group by year_that_got_paid and gender, so that I can see, per year (1990, 1991, ...), how many people of each gender got paid. However, if the same emp_no appears twice in the same year, it is counted as one more person of that gender, even though it is actually the same person.

Note: I simplified the table for the question. It actually has more columns, and the primary key does not prevent duplicate employee IDs within the same year.

Therefore, I thought, maybe a kind of

select count(*) 
from tableX 
group by year_that_got_paid, gender

but then I would need a second group by, or some kind of "distinct", to count each combination of emp_no and year_that_got_paid only once. Grouping by all 3 fields doesn't work either, as it would create a lot of rows saying 1, 2 or however many times the employee received salary that year.

CodePudding user response：

Just use COUNT(DISTINCT emp_no) instead of COUNT(*), so that each employee is counted only once per year and gender.
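
As a minimal sketch of that suggestion, assuming the table and column names from the question (tableX, emp_no, year_that_got_paid, gender, amount_paid). The column types, the gender values and the sample rows are made up for illustration:

```sql
-- Hypothetical sample data; types and values are assumptions.
create table tableX (
  emp_no             integer,
  year_that_got_paid integer,
  gender             varchar(10),
  amount_paid        decimal(10, 2)
);

insert into tableX (emp_no, year_that_got_paid, gender, amount_paid) values
  (1, 1990, 'male',   1000),
  (1, 1990, 'male',    500),  -- same employee paid twice in 1990
  (2, 1990, 'female',  700),
  (3, 1991, 'female',  900);

-- count(distinct emp_no) counts each employee once per (year, gender) group,
-- no matter how many payment rows they have in that group.
select year_that_got_paid, gender, count(distinct emp_no) as people_paid
from tableX
group by year_that_got_paid, gender
order by year_that_got_paid, gender;
```

With these rows, 1990 comes out as one female and one male, and 1991 as one female. A plain count(*) would have reported two males for 1990.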

CodePudding user response：

Your table stores the money people earned. People can switch genders, so you store the gender along with each payment. And they can earn multiple amounts in a year.

So, if Bobby (Roberta / Robert) earned 1000$ as a woman, 500$ as a man, and 500$ as a woman again in 2020, you want this to count as one man in 2020 (having earned 500$) and one woman in 2020 (having earned 1500$).

In order to achieve this, first aggregate per year, employee number and gender, then count and sum those rows per year:

select
  year_that_got_paid,
  count(case when gender = 'male' then 1 end) as male_count,
  count(case when gender = 'female' then 1 end) as female_count,
  sum(case when gender = 'male' then sum_amount_paid else 0 end) as male_paid,
  sum(case when gender = 'female' then sum_amount_paid else 0 end) as female_paid
from
(
  -- one row per employee, gender and year (names as in the question)
  select year_that_got_paid, emp_no, gender, sum(amount_paid) as sum_amount_paid
  from tableX
  group by year_that_got_paid, emp_no, gender
) per_emp_and_their_gender
group by year_that_got_paid
order by year_that_got_paid;
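
If only the per-year counts and totals are needed, the same result can probably be had without the subquery, by counting distinct employee numbers per gender. A sketch, assuming the question's tableX and emp_no, the gender values 'male' and 'female' as in the query above, an emp_no that is never null, and made-up rows for the Bobby example:

```sql
-- Hypothetical sample data for the Bobby example; types are assumptions.
create table tableX (
  emp_no             integer,
  year_that_got_paid integer,
  gender             varchar(10),
  amount_paid        decimal(10, 2)
);

insert into tableX (emp_no, year_that_got_paid, gender, amount_paid) values
  (1, 2020, 'female', 1000),
  (1, 2020, 'male',    500),
  (1, 2020, 'female',  500);

-- The case expression yields emp_no only for rows of the given gender
-- (null otherwise), and count(distinct ...) ignores nulls, so each
-- employee is counted at most once per gender and year.
select
  year_that_got_paid,
  count(distinct case when gender = 'male' then emp_no end) as male_count,
  count(distinct case when gender = 'female' then emp_no end) as female_count,
  sum(case when gender = 'male' then amount_paid else 0 end) as male_paid,
  sum(case when gender = 'female' then amount_paid else 0 end) as female_paid
from tableX
group by year_that_got_paid
order by year_that_got_paid;
```

For the three rows above this gives a single row for 2020 with one man (500 paid) and one woman (1500 paid), matching the Bobby example.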