Home > OS >  how to group by dataframe and move categories to columns
how to group by dataframe and move categories to columns

Time:06-22

lst = [
 ['s001','b1','typeA'],['s002','b1','typeB'],['s003','b1','typeC'],['s004','b1','typeD'],
 ['s005','b1','typeA'],['s006','b1','typeB'],['s007','b1','typeC'],['s008','b1','typeD'],
 ['s009','b2','typeA'],['s010','b2','typeB'],['s011','b2','typeC']
 ] 


df=pd.DataFrame(lst,columns=['sn','setting','status'])
      sn setting status
0   s001      b1  typeA
1   s002      b1  typeB
2   s003      b1  typeC
3   s004      b1  typeD
4   s005      b1  typeA
5   s006      b1  typeB
6   s007      b1  typeC
7   s008      b1  typeD
8   s009      b2  typeA
9   s010      b2  typeB
10  s011      b2  typeC

(each row on sn is unique)

I can use group by to get the info.

df.groupby(['setting','status']).size().reset_index()
setting status  0
0      b1  typeA  2
1      b1  typeB  2
2      b1  typeC  2
3      b1  typeD  2
4      b2  typeA  1
5      b2  typeB  1
6      b2  typeC  1

But I prefer to group them by setting column and count total & each status number, like bellow format:

setting total tppeA typeB  typeC typeD
b1       8     2     2      2    2 
b2       3     1     1      1    0

(typeA to typeD are known type names, but a given dataset would not always have all those 4 unique types in it).

But I don't know how to convert them to columns (for total column, I can plus 4 types status)

CodePudding user response:

Let us do

out = pd.crosstab(df.setting,df.status,margins = True,margins_name = 'Total').drop(['Total']) # reset_index()
Out[97]: 
status   typeA  typeB  typeC  typeD  Total
setting                                   
b1           2      2      2      2      8
b2           1      1      1      0      3

CodePudding user response:

use pivot

df2=df.groupby(['setting','status']).size().reset_index().pivot_table(index='setting', columns='status', values=0).fillna(0)
df2['total'] = df2.sum(axis=1)
df2

status  typeA   typeB   typeC   typeD   total
setting                     
b1      2.0     2.0     2.0     2.0     8.0
b2      1.0     1.0     1.0     0.0     3.0
  • Related