Home > Mobile >  Pandas groupby multiple columns exclusively
Pandas groupby multiple columns exclusively

Time:02-17

I have the DataFrame below and want to find the count of y and n for each column:

ID var1 var2
1 y
2 n y
3 y n
4 y n
5 y

the result would be like this:

var1_N var2_N
y 3 2
n 1 2

I used transform function but was wondering there is a better way to get the results. Thanks!

CodePudding user response:

You can just do value_counts on all columns you need to count using apply method, the results will be automatically joined on the index (or var value in your case):

df.filter(like='var').apply(lambda s: s.value_counts())

   var1  var2
y     3     2
n     1     2

Or use pd.value_counts directly:

df.filter(like='var').apply(pd.value_counts)

   var1  var2
y     3     2
n     1     2

CodePudding user response:

You can use melt to flatten your dataframe then use value_counts and unstack the variable column:

>>> df.melt('ID').value_counts(['variable', 'value']).unstack('variable')

variable  var1  var2
value               
n            1     2
y            3     2

You can remove index and column names: by appending ``

Output:

>>> df.melt('ID').value_counts(['variable', 'value']).unstack('variable') \
      .rename_axis(index=None, columns=None)

   var1  var2
n     1     2
y     3     2
  • Related