value_counts not working in groupby apply

Time:08-25

I am using .apply(pd.Series.value_counts, axis=0) to count the values in two pandas columns ['a','b'].

However, when I try to use it after grouping on column 'Group', I get the error:

TypeError: value_counts() got an unexpected keyword argument 'axis'

It works when grouping in a for loop, but not with a groupby apply.

Here is the code, with the working for loop over the groups and the failing groupby.apply:

import pandas as pd
import numpy as np

# Example dataframe
df = pd.DataFrame(
    {
        "a": [1, 1, 2, 3, 3, 4, 5, 1, 1, 1, 4, 4, 4, 5, 6, 6, 6, 6, 3],
        'b': [3, 4, 5, 5, 5, 2, 1, 3, 4, 4, 4, 5, 6, 6, 4, 3, 6, 6, 3],
        "Group": ['g1', 'g1', 'g1', 'g2', 'g2', 'g1', 'g2', 'g1', 'g1', 'g2','g2', 'g2', 'g2', 'g2', 'g1','g1', 'g2', 'g2', 'g2'],
    }
)


# Grouping and applying with a for loop.
lst = []
for key, grp in df.groupby('Group'):
    df_ = grp[['a', 'b']].apply(pd.Series.value_counts, axis=0)
    df_['Group'] = key
    lst.append(df_)
print('this works', pd.concat(lst), sep='\n')

# With df.groupby(...).apply it doesn't work.
df.groupby('Group')[['a', 'b']].apply(pd.Series.value_counts, axis=0)

Output: the expected result from the for loop, followed by the traceback from the groupby apply

     a    b Group
1  4.0  NaN    g1
2  1.0  1.0    g1
3  NaN  3.0    g1
4  1.0  3.0    g1
5  NaN  1.0    g1
6  2.0  NaN    g1
1  1.0  1.0    g2
3  3.0  1.0    g2
4  3.0  2.0    g2
5  2.0  3.0    g2
6  2.0  4.0    g2

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/pandas/core/groupby/groupby.py in apply(self, func, *args, **kwargs)
   1274             try:
-> 1275                 result = self._python_apply_general(f, self._selected_obj)
   1276             except TypeError:

11 frames
TypeError: value_counts() got an unexpected keyword argument 'axis'

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/pandas/core/groupby/groupby.py in f(g)
   1257                 def f(g):
   1258                     with np.errstate(all="ignore"):
-> 1259                         return func(g, *args, **kwargs)
   1260 
   1261             elif hasattr(nanops, "nan" + func):

TypeError: value_counts() got an unexpected keyword argument 'axis'


CodePudding user response:

Here's what happened.

In the for loop you call apply on a DataFrame (each group's grp[['a','b']]). DataFrame.apply has its own axis parameter, so axis=0 is consumed by apply itself and never reaches pd.Series.value_counts.

In the second case you call the apply method of a DataFrameGroupBy instance, which has a different signature. It takes the function as its first parameter and forwards all other arguments to that function. So axis=0 is passed on to pd.Series.value_counts, and since Series.value_counts has no axis parameter in its signature, you get the TypeError. To fix it, wrap the DataFrame.apply call in a lambda so that axis=0 is consumed by DataFrame.apply instead of being forwarded to value_counts.
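
The forwarding can be seen with a small sketch. The frame and the record_kwargs helper are made up for illustration and are not part of the question's code; the helper only records which keyword arguments reach the applied function.

```python
import pandas as pd

# Tiny made-up frame, only for illustration.
df = pd.DataFrame({"a": [1, 1, 2, 3], "Group": ["g1", "g1", "g2", "g2"]})

seen_by_frame_apply = []
seen_by_groupby_apply = []


def record_kwargs(log):
    # Hypothetical helper: builds a function that logs the keyword names it receives.
    def func(obj, **kwargs):
        log.append(sorted(kwargs))
        return len(obj)
    return func


# DataFrame.apply keeps axis for itself, so func sees no keyword arguments.
df[["a"]].apply(record_kwargs(seen_by_frame_apply), axis=0)

# GroupBy.apply has no axis parameter of its own and forwards it to func.
df.groupby("Group")[["a"]].apply(record_kwargs(seen_by_groupby_apply), axis=0)

print(seen_by_frame_apply)    # keyword names seen under DataFrame.apply
print(seen_by_groupby_apply)  # keyword names seen under GroupBy.apply
```

Every entry logged under DataFrame.apply should be empty, while every entry logged under the groupby apply should contain 'axis', which is exactly the argument that value_counts rejects.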

(
    df
    .groupby('Group')[['a','b']]
    .apply(lambda x: x.apply(pd.Series.value_counts, axis=0))
    .fillna(0)
    .astype(int)
)
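
As a rough check that this matches the for loop, here is a self-contained sketch on a smaller, made-up frame. The count_columns helper and the variable names are hypothetical, and the group key is kept as an outer index level instead of a 'Group' column so the two results can be compared directly.

```python
import pandas as pd

# Small made-up frame, only for illustration.
df = pd.DataFrame(
    {
        "a": [10, 10, 20, 30],
        "b": [10, 20, 20, 30],
        "Group": ["g1", "g1", "g2", "g2"],
    }
)


def count_columns(frame):
    # Hypothetical helper: per-column value counts for one group.
    # axis=0 is consumed here by DataFrame.apply, not by value_counts.
    return frame.apply(pd.Series.value_counts, axis=0)


# Workaround: let groupby apply call the helper once per group.
via_groupby = (
    df.groupby("Group")[["a", "b"]]
    .apply(count_columns)
    .fillna(0)
    .astype(int)
    .sort_index()
)

# Reference: explicit loop over the groups, keyed by group name.
via_loop = (
    pd.concat({key: count_columns(grp[["a", "b"]]) for key, grp in df.groupby("Group")})
    .fillna(0)
    .astype(int)
    .sort_index()
)

# Compare index entries and values; index names may differ between the two.
same = (
    list(via_groupby.index) == list(via_loop.index)
    and via_groupby.to_numpy().tolist() == via_loop.to_numpy().tolist()
)
print(via_groupby)
print(same)
```

Missing combinations show up as NaN before fillna(0), which is why the counts are cast back to int afterwards.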

P.S.
See also GroupBy.apply vs DataFrame.apply
