I have the DataFrame below and want to find the count of y
and n
for each column:
ID | var1 | var2 |
---|---|---|
1 | y | |
2 | n | y |
3 | y | n |
4 | y | n |
5 | y |
the result would be like this:
var1_N | var2_N | |
---|---|---|
y | 3 | 2 |
n | 1 | 2 |
I used transform
function but was wondering there is a better way to get the results.
Thanks!
CodePudding user response:
You can just do value_counts
on all columns you need to count using apply
method, the results will be automatically joined on the index (or var value in your case):
df.filter(like='var').apply(lambda s: s.value_counts())
var1 var2
y 3 2
n 1 2
Or use pd.value_counts
directly:
df.filter(like='var').apply(pd.value_counts)
var1 var2
y 3 2
n 1 2
CodePudding user response:
You can use melt
to flatten your dataframe then use value_counts
and unstack
the variable column:
>>> df.melt('ID').value_counts(['variable', 'value']).unstack('variable')
variable var1 var2
value
n 1 2
y 3 2
You can remove index and column names: by appending ``
Output:
>>> df.melt('ID').value_counts(['variable', 'value']).unstack('variable') \
.rename_axis(index=None, columns=None)
var1 var2
n 1 2
y 3 2