Home > Back-end >  Counting values greater than each value in a Pandas series following groupby
Counting values greater than each value in a Pandas series following groupby

Time:06-16

I am trying to find an elegant way to return a series that for each element in some dataframe column, is the count of instances for the column overall which are greater than/less than that element following some grouping. For instance, the overall data frame would look something like:

group values
a 1
a 2
a 3
b 4
b 5
b 6

And the output would be something like:

group values Count within group
a 1 0
a 2 1
a 3 2
b 4 0
b 5 1
b 6 2

Somehow, this feels like it ought to be simple to do - in Excel it would be a fairly straightforward Countifs() function - but for the life of me I can't figure out how to do it in an elegant way in Pandas. The closest I seem to be able to get is with the pandas Series methods .ge(), .le(), etc. Like, a count of the number of elements less than each element of a series can be found with something like:

series.apply(lambda x: sum(series.lt(x)))

But I can't figure out how to apply this logic to each group in the dataframe without breaking the dataframe up working on each group separately. It seems to me there should be someway to apply this to a groupyby - something like:

df.groupby('group').transform(lambda x: sum(x.lt(x)))

I can't figure out how to pass both the series and each element in the series to the lambda function.

CodePudding user response:

df = pd.DataFrame({'group': {0: 'a', 1: 'a', 2: 'a', 3: 'a', 4: 'a', 5: 'b', 6: 'b', 7: 'b', 8: 'b', 9: 'b'}, 'value': {0: 2, 1: 4, 2: 2, 3: 3, 4: 5, 5: 1, 6: 2, 7: 4, 8: 1, 9: 5}})
# You may want other methods of rank, but it's not clear from your question.
df['count_in_group'] = df.groupby('group').rank('min').sub(1)
print(df.sort_values(['group', 'value']))
...

  group  value  count_in_group
0     a      2             0.0
2     a      2             0.0
3     a      3             2.0
1     a      4             3.0
4     a      5             4.0
5     b      1             0.0
8     b      1             0.0
6     b      2             2.0
7     b      4             3.0
9     b      5             4.0
  • Related