I am using .apply(pd.Series.value_counts, axis=0) to count the values in two pandas columns, ['a','b'].
However, when I try to use it after grouping on the column 'Group', I get the error:
TypeError: value_counts() got an unexpected keyword argument 'axis'
It works when I group in a for loop, but not with groupby.apply.
Here is the code, first with the working for loop and then with the failing groupby.apply:
import pandas as pd
import numpy as np
# example dataframe
df = pd.DataFrame(
    {
        "a": [1, 1, 2, 3, 3, 4, 5, 1, 1, 1, 4, 4, 4, 5, 6, 6, 6, 6, 3],
        "b": [3, 4, 5, 5, 5, 2, 1, 3, 4, 4, 4, 5, 6, 6, 4, 3, 6, 6, 3],
        "Group": ['g1', 'g1', 'g1', 'g2', 'g2', 'g1', 'g2', 'g1', 'g1', 'g2',
                  'g2', 'g2', 'g2', 'g2', 'g1', 'g1', 'g2', 'g2', 'g2'],
    }
)
# grouping and applying with a for loop
lst = []
for key, grp in df.groupby('Group'):
    df_ = grp[['a', 'b']].apply(pd.Series.value_counts, axis=0)
    df_['Group'] = key
    lst.append(df_)
print('this works', pd.concat(lst), sep='\n')
# with df.groupby(...).apply it doesn't work
df.groupby('Group')[['a', 'b']].apply(pd.Series.value_counts, axis=0)
OUTPUT, with the expected result from the for loop:
     a    b Group
1  4.0  NaN    g1
2  1.0  1.0    g1
3  NaN  3.0    g1
4  1.0  3.0    g1
5  NaN  1.0    g1
6  2.0  NaN    g1
1  1.0  1.0    g2
3  3.0  1.0    g2
4  3.0  2.0    g2
5  2.0  3.0    g2
6  2.0  4.0    g2
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/pandas/core/groupby/groupby.py in apply(self, func, *args, **kwargs)
1274 try:
-> 1275 result = self._python_apply_general(f, self._selected_obj)
1276 except TypeError:
11 frames
TypeError: value_counts() got an unexpected keyword argument 'axis'
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/pandas/core/groupby/groupby.py in f(g)
1257 def f(g):
1258 with np.errstate(all="ignore"):
-> 1259 return func(g, *args, **kwargs)
1260
1261 elif hasattr(nanops, "nan" + func):
TypeError: value_counts() got an unexpected keyword argument 'axis'
CodePudding user response:
Here's what happened.
In the for loop you were applying pd.Series.value_counts to a DataFrame. In that case apply is DataFrame.apply, which has its own axis parameter and uses it to decide that the function is called once per column.
In the second case you are calling a different apply, the one that belongs to the DataFrameGroupBy instance. This method has a different signature: it accepts the function as its first parameter, and all other arguments are passed on to that function. So axis goes to pd.Series.value_counts. Since Series.value_counts has no axis parameter in its signature, you get the error.
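To see the forwarding for yourself, here is a small sketch. The probe function and the tiny DataFrame are made up for illustration and are not part of the original question; probe only records whatever keyword arguments it is called with.

```python
import pandas as pd

# Small made-up frame, only to illustrate the call path.
df = pd.DataFrame(
    {"a": [1, 1, 2, 3], "b": [3, 4, 5, 5], "Group": ["g1", "g1", "g2", "g2"]}
)

seen_kwargs = []  # what each call to probe actually received


def probe(group_frame, **kwargs):
    """Hypothetical helper: record the keyword arguments, return the group size."""
    seen_kwargs.append(kwargs)
    return len(group_frame)


# GroupBy.apply does not consume axis itself, it hands it to probe.
sizes = df.groupby("Group")[["a", "b"]].apply(probe, axis=0)
print(seen_kwargs)

# Calling value_counts the way GroupBy.apply ends up calling it fails,
# because value_counts does not accept an axis keyword.
raised = False
try:
    pd.Series.value_counts(df[["a", "b"]], axis=0)
except TypeError as exc:
    raised = True
    print(exc)
```

Every recorded dictionary contains axis, which is exactly the argument that value_counts then rejects.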
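To fix it, wrap the inner call in a lambda so that axis is passed to DataFrame.apply instead of being forwarded to value_counts:
(
    df
    .groupby('Group')[['a', 'b']]
    .apply(lambda x: x.apply(pd.Series.value_counts, axis=0))
    .fillna(0)
    .astype(int)
)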
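If you would rather avoid the nested apply, one possible alternative (a sketch, not something from the question) is to call value_counts on each grouped column and join the results. The level name "value" and the variable names below are my own choices.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "a": [1, 1, 2, 3, 3, 4, 5, 1, 1, 1, 4, 4, 4, 5, 6, 6, 6, 6, 3],
        "b": [3, 4, 5, 5, 5, 2, 1, 3, 4, 4, 4, 5, 6, 6, 4, 3, 6, 6, 3],
        "Group": ["g1", "g1", "g1", "g2", "g2", "g1", "g2", "g1", "g1", "g2",
                  "g2", "g2", "g2", "g2", "g1", "g1", "g2", "g2", "g2"],
    }
)

# Count each column per group; give both results the same index level names
# so that they line up when joined side by side.
per_column = {
    col: df.groupby("Group")[col].value_counts().rename_axis(["Group", "value"])
    for col in ["a", "b"]
}

# The dict keys become the column names; missing combinations become 0.
counts = pd.concat(per_column, axis=1).fillna(0).astype(int).sort_index()
print(counts)
```

The result is indexed by (Group, value) rather than carrying Group as a column, so call counts.reset_index() if you prefer the layout from the for loop.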
P.S.
See also GroupBy.apply vs DataFrame.apply