I have a standard dataframe like the one below:
       Id Type  Speed Efficiency Durability
0   Id001    A     OK         OK      nonOK
1   Id002    A  nonOK         OK      nonOK
2   Id003    B  nonOK      nonOK      nonOK
3   Id004    B  nonOK      nonOK         OK
4   Id005    A  nonOK      nonOK         OK
5   Id006    A     OK         OK         OK
6   Id007    A     OK      nonOK         OK
7   Id008    B  nonOK      nonOK         OK
8   Id009    C     OK         OK         OK
9   Id010    B     OK         OK      nonOK
10  Id011    C     OK      nonOK         OK
11  Id012    C     OK      nonOK         OK
12  Id013    C  nonOK         OK         OK
13  Id014    C  nonOK      nonOK         OK
14  Id015    C  nonOK      nonOK         OK
I'm trying to get this kind of output, with one row per Type and Test value and the counts for each column:
  Type   Test  Speed  Efficiency  Durability
0    A     OK      3           3           3
1    A  nonOK      2           2           2
2    B     OK      1           1           2
3    B  nonOK      3           3           2
4    C     OK      3           2           6
5    C  nonOK      3           4           0
I tried df.groupby('Type').agg('count'), but it doesn't give the expected output.
Is it possible to do this kind of transformation with pandas?
CodePudding user response:
You can also do this with pandas method chaining:
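import pandas as pd

# Reshape to long form, count each (Type, Test, variable) combination,
# then pivot the counts back into one column per original variable.
(pd.melt(df, id_vars='Type', value_vars=['Speed', 'Efficiency', 'Durability'], value_name='Test')
   .groupby(['Type', 'Test', 'variable'])
   .size()
   .reset_index()  # the unnamed size column is labelled 0
   .pivot(index=['Type', 'Test'], columns='variable', values=0)
   .reset_index())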
variable Type   Test  Durability  Efficiency  Speed
0           A     OK         3.0         3.0    3.0
1           A  nonOK         2.0         2.0    2.0
2           B     OK         2.0         1.0    1.0
3           B  nonOK         2.0         3.0    3.0
4           C     OK         6.0         2.0    3.0
5           C  nonOK         NaN         4.0    3.0
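Note that the output above has NaN for the C / nonOK / Durability cell and float counts, because that combination never occurs in the data. A minimal sketch of one way to get integer counts with 0 instead, assuming the same melt-and-count idea but swapping pivot for unstack with fill_value (the names cols and out are just for illustration, and the frame is rebuilt from the question's data so the snippet runs on its own):

```python
import pandas as pd

# Rebuild the question's data so the example is self-contained.
rows = [
    ("Id001", "A", "OK", "OK", "nonOK"), ("Id002", "A", "nonOK", "OK", "nonOK"),
    ("Id003", "B", "nonOK", "nonOK", "nonOK"), ("Id004", "B", "nonOK", "nonOK", "OK"),
    ("Id005", "A", "nonOK", "nonOK", "OK"), ("Id006", "A", "OK", "OK", "OK"),
    ("Id007", "A", "OK", "nonOK", "OK"), ("Id008", "B", "nonOK", "nonOK", "OK"),
    ("Id009", "C", "OK", "OK", "OK"), ("Id010", "B", "OK", "OK", "nonOK"),
    ("Id011", "C", "OK", "nonOK", "OK"), ("Id012", "C", "OK", "nonOK", "OK"),
    ("Id013", "C", "nonOK", "OK", "OK"), ("Id014", "C", "nonOK", "nonOK", "OK"),
    ("Id015", "C", "nonOK", "nonOK", "OK"),
]
df = pd.DataFrame(rows, columns=["Id", "Type", "Speed", "Efficiency", "Durability"])

cols = ["Speed", "Efficiency", "Durability"]  # illustrative name, not from the answer
out = (
    df.melt(id_vars="Type", value_vars=cols, value_name="Test")
      .groupby(["Type", "Test", "variable"])
      .size()
      .unstack("variable", fill_value=0)  # absent combinations become 0, counts stay integers
      .reindex(columns=cols)              # restore the original column order
      .rename_axis(columns=None)          # drop the leftover "variable" label
      .reset_index()
)
print(out)
```

On the question's data this should give 0 rather than NaN for C / nonOK / Durability, with the columns in the order Type, Test, Speed, Efficiency, Durability.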
CodePudding user response:
You could try as follows:
out = (df.groupby('Type')
         .agg({col: 'value_counts' for col in df.columns[2:]})
         .fillna(0).astype(int)
         .sort_index().reset_index()
         .rename(columns={'level_1': 'Test'}))
print(out)
  Type   Test  Speed  Efficiency  Durability
0    A     OK      3           3           3
1    A  nonOK      2           2           2
2    B     OK      1           1           2
3    B  nonOK      3           3           2
4    C     OK      3           2           6
5    C  nonOK      3           4           0
CodePudding user response:
This should work, not sure if there's a simpler way:
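li = []
for col in ['Speed', 'Efficiency', 'Durability']:
    # Count OK / nonOK per Type for this column
    x = df.groupby('Type')[col].value_counts()
    # Give the second index level a common name so the pieces line up
    x.index = x.index.rename('Test', level=1)
    li.append(x)
pd.concat(li, axis=1).fillna(0).astype(int)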
            Speed  Efficiency  Durability
Type Test
A    OK         3           3           3
     nonOK      2           2           2
B    nonOK      3           3           2
     OK         1           1           2
C    OK         3           2           6
     nonOK      3           4           0
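On the "simpler way" question, one possible alternative is to melt the frame and let pd.crosstab do the counting, which fills missing combinations with 0 by itself. This is only a sketch: long_df, cols, counts and the "Metric" column label are names made up here, and the frame is rebuilt from the question's data so it runs standalone.

```python
import pandas as pd

# Rebuild the question's data so the example is self-contained.
rows = [
    ("Id001", "A", "OK", "OK", "nonOK"), ("Id002", "A", "nonOK", "OK", "nonOK"),
    ("Id003", "B", "nonOK", "nonOK", "nonOK"), ("Id004", "B", "nonOK", "nonOK", "OK"),
    ("Id005", "A", "nonOK", "nonOK", "OK"), ("Id006", "A", "OK", "OK", "OK"),
    ("Id007", "A", "OK", "nonOK", "OK"), ("Id008", "B", "nonOK", "nonOK", "OK"),
    ("Id009", "C", "OK", "OK", "OK"), ("Id010", "B", "OK", "OK", "nonOK"),
    ("Id011", "C", "OK", "nonOK", "OK"), ("Id012", "C", "OK", "nonOK", "OK"),
    ("Id013", "C", "nonOK", "OK", "OK"), ("Id014", "C", "nonOK", "nonOK", "OK"),
    ("Id015", "C", "nonOK", "nonOK", "OK"),
]
df = pd.DataFrame(rows, columns=["Id", "Type", "Speed", "Efficiency", "Durability"])

cols = ["Speed", "Efficiency", "Durability"]  # illustrative name

# Long form: one row per (Type, Metric, Test) observation.
long_df = df.melt(id_vars="Type", value_vars=cols, var_name="Metric", value_name="Test")

# Cross-tabulate (Type, Test) against Metric; unseen cells are counted as 0.
counts = (
    pd.crosstab([long_df["Type"], long_df["Test"]], long_df["Metric"])
      .reindex(columns=cols)       # crosstab sorts columns alphabetically, so restore the order
      .rename_axis(columns=None)   # drop the "Metric" label on the columns
      .reset_index()
)
print(counts)
```

Like the loop above, this only produces rows for (Type, Test) pairs that appear in at least one column, which is enough for the data in the question.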