Home > Back-end >  How to convert the dummy variable columns in to several columns?
How to convert the dummy variable columns in to several columns?

Time:04-22

I know how to unstack rows into columns, but how to deal with the following table?

date dummy avg lable
1-19 1 20 l1
1-19 0 40 l1
1-27 1 100 l2
1-27 0 140 l2

Expecting tables:

date avg_t avg_c lable
1-19 20 40 l1
1-27 100 140 l2

The "avg" is 20 when "dummy" equals to 1 and it is renamed to "avg_t" as a column. Similar as the column "avg_c".

I tried:

df.groupby(['dummy','avg']).size().unstack

It does not work.

CodePudding user response:

You could also use pivot_wider from janitor:.

# pip install pyjanitor
import pandas as pd
import janitor

df.pivot_wider(index = ['date', 'lable'], names_from = 'dummy')

Out[19]: 
   date lable  avg_0  avg_1
0  1-19    l1     40     20
1  1-27    l2    140    100

if you want the t and c:

(df.assign(dummy = df['dummy'].map({0:'c', 1:'t'})).
    pivot_wider(index = ['date', 'lable'], names_from = 'dummy'))

Out[20]: 
   date lable  avg_c  avg_t
0  1-19    l1     40     20
1  1-27    l2    140    100

CodePudding user response:

Create new column by DataFrame.assign with Series.map and then use DataFrame.pivot with DataFrame.add_prefix:

df = (df.assign(new = df['dummy'].map({0:'c', 1:'t'}))
        .pivot(['date','lable'], 'new', 'avg')
        .add_prefix('avg_')
        .reset_index()
        .rename_axis(None, axis=1))

print (df)
   date lable  avg_c  avg_t
0  1-19    l1     40     20
1  1-27    l2    140    100
  • Related