Insert missing columns into a df in sequential order

Time:06-19

I have a df like this:

id month
1 1
1 3
1 4
1 6

I want to transform it into this:

id 1 2 3 4 5 6
1 1 0 1 1 0 1

I've tried using this code:

ndf = df[['id']].join(pd.get_dummies(
    df['month'])).groupby('id').max()

But the result looks like this:

id 1 3 4 6
1 1 1 1 1

How can I insert the missing columns in between (2 and 5) even though they're not in the data?

CodePudding user response:

You can use pd.crosstab instead, then create new columns using pd.RangeIndex based on the min and max month, and finally use DataFrame.reindex (and optionally DataFrame.reset_index afterwards):

import pandas as pd

# RangeIndex excludes its stop value, so add 1 to include the max month
new_cols = pd.RangeIndex(df['month'].min(), df['month'].max() + 1)

res = (
    pd.crosstab(df['id'], df['month'])
    .reindex(columns=new_cols, fill_value=0)
    .reset_index()
)

Output:

>>> res

   id  1  2  3  4  5  6
0   1  1  0  1  1  0  1
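
If you would rather keep the get_dummies approach from the question, the same reindex step should work there too. Below is a minimal, self-contained sketch; the sample df is rebuilt from the question, and the name all_months is just an illustrative choice. Unlike crosstab, which counts occurrences, taking max() after get_dummies stays at 0/1 even if an (id, month) pair is repeated.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({"id": [1, 1, 1, 1], "month": [1, 3, 4, 6]})

# Every month from the smallest to the largest observed value.
# range() excludes its stop value, hence the + 1.
all_months = list(range(int(df["month"].min()), int(df["month"].max()) + 1))

ndf = (
    df[["id"]]
    .join(pd.get_dummies(df["month"]))          # one indicator column per observed month
    .groupby("id")
    .max()                                      # collapse to one row per id
    .reindex(columns=all_months, fill_value=0)  # add the missing months as 0
    .astype(int)                                # indicators may be bool; show them as 0/1
    .reset_index()
)

print(ndf)
```

If the full range of months is known in advance (say 1 to 12), you could pass that range to reindex instead of deriving it from the data.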