Home > database >  Group by sum and transpose in pandas
Group by sum and transpose in pandas

Time:06-23

I have a data frame as shown below.

id     type           d_count    d_amt    c_count    c_amt
1      restaurant     1          10       NaN        NaN
1      restaurant     3          8        NaN        NaN
1      fees           NaN        NaN      2          100
1      health         1          20       NaN        NaN
1      restaurant     NaN        NaN      2          15
1      restaurant     1          10       NaN        NaN
1      restaurant     NaN        NaN      1          3
2      wage           NaN        NaN      1          10
2      wage           NaN        NaN      3          30
2      fees           NaN        NaN      2          100
2      health         1          20       NaN        NaN
2      fees           NaN        NaN      2          15
2      fees           1          10       NaN        NaN
2      fees           1          3        NaN        NaN

From the above I would like to prepare below data frames.

Explanation: For each customer calculate their different type total amounts

Expected output:

df1:

id   restaurant_d_amt_sum   health_d_amt_sum   fees_d_amt_sum
1    28                     20                 NaN
2    NaN                    20                 13

df2:

id    fees_c_amt_sum      restaurant_c_amt_sum      wage_c_amt_sum
1     100                 18                        NaN
2     115                 NaN                       40

CodePudding user response:

df.groupby(['id','type'])['d_amt'].sum().unstack().add_suffix('_d_amt_sum').reset_index()
type id     fees_d_amt_sum  health_d_amt_sum    restaurant_d_amt_sum    wage_d_amt_sum
0    1            0.0             20.0            28.0                      NaN
1    2            13.0            20.0            NaN                       0.0
df.groupby(['id','type'])['c_amt'].sum().unstack().add_suffix('_c_amt_sum').reset_index()
type id     fees_c_amt_sum  health_c_amt_sum    restaurant_c_amt_sum    wage_c_amt_sum
0     1            100.0              0.0             18.0                      NaN
1     2            115.0              0.0             NaN                       40.0

CodePudding user response:

df.pivot_table(columns='type', index='id', aggfunc='sum')

Output:

      c_amt                         c_count                        d_amt                        d_count
type   fees health restaurant  wage    fees health restaurant wage  fees health restaurant wage    fees health restaurant wage
id
1     100.0    0.0       18.0   NaN     2.0    0.0        3.0  NaN   0.0   20.0       28.0  NaN     0.0    1.0        5.0  NaN
2     115.0    0.0        NaN  40.0     4.0    0.0        NaN  4.0  13.0   20.0        NaN  0.0     2.0    1.0        NaN  0.0
  • Related