I have a Dataframe built in this way:
year a b c d e
2020 True False True False True
2020 False False True False True
2021 False False True False True
and I'd like to have generate a Dataframe of the type:
year col count_true count_false
2020 a 1 1
2020 b 0 2
...
2021 a 0 1
How could i do that? I'm pretty sure I should use pivoting but I'm unable to find the right way.
CodePudding user response:
You could melt it first in order to unpivot, then reconstruct the pivot table in the way you want.
import pandas as pd
df = pd.DataFrame({'year': {0: 2020, 1: 2020, 2: 2021},
'a': {0: True, 1: False, 2: False},
'b': {0: False, 1: False, 2: False},
'c': {0: True, 1: True, 2: True},
'd': {0: False, 1: False, 2: False},
'e': {0: True, 1: True, 2: True}})
df.melt(id_vars='year',
var_name='col').pivot_table(index=['year','col'],
columns='value',
aggfunc=lambda x:len(x), fill_value=0).add_prefix('count_')
Output
value count_False count_True
year col
2020 a 1 1
b 2 0
c 0 2
d 2 0
e 0 2
2021 a 1 0
b 1 0
c 0 1
d 1 0
e 0 1
CodePudding user response:
You need to melt
(=unpivot) and groupby.agg
:
(df
.melt(id_vars='year')
.groupby(['year', 'variable'], as_index=False)
.agg(**{'count_true': ('value', 'sum'),
'count_false': ('value', lambda x: (~x).sum()),
})
)
output:
year variable count_true count_false
0 2020 a 1 1
1 2020 b 0 2
2 2020 c 2 0
3 2020 d 0 2
4 2020 e 2 0
5 2021 a 0 1
6 2021 b 0 1
7 2021 c 1 0
8 2021 d 0 1
9 2021 e 1 0