pandas rename columns with method chaining

Time:10-22

I have a DataFrame, did some feature engineering on it, and would now like to change the column names. I know how to do this with a separate assignment, but I would like to do it inside the method chain. I tried the commented-out rename line below, but it doesn't work. How can I write it so that it works?

import numpy as np
import pandas as pd

df = pd.DataFrame({'ID':[1,2,2,3,3,3], 'date': ['2021-10-12','2021-10-16','2021-10-15','2021-10-10','2021-10-19','2021-10-01'],
                   'location':['up','up','down','up','up','down'],
                   'code':[False, False, False, True, False, False]})

df = (df
     .assign(date = lambda x: pd.to_datetime(x.date))
     .assign(entries_per_ID = lambda x: x.groupby('ID').ID.transform('size'))
     .pivot_table(values=['entries_per_ID'], index=['ID','date','code'],
                   columns=['location'], aggfunc=np.max)
     .reset_index()
     #.rename(columns=lambda x: dict(zip(x.columns, ['_'.join(col).strip() if col[1]!='' else col[0] for col in x.columns.values])))
     )

Assigning to df.columns afterwards works, but that's not how I would like to write it.

df.columns = ['_'.join(col).strip() if col[1]!='' else col[0] for col in df.columns.values ]

CodePudding user response:

To set df.columns in a chain, use set_axis(..., axis=1):

df.set_axis(['_'.join(col).strip() if col[1] else col[0] for col in df.columns], axis=1)
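As a hedged aside on why rename does not fit here: as far as I know, a callable passed to rename(columns=...) is applied to each individual label (for MultiIndex columns, to the labels within each level), never to the frame as a whole, so it cannot collapse two levels into one. set_axis instead replaces the entire axis with a new list of labels. A minimal sketch on a made-up two-level frame (the name toy and its values are illustrative only, not taken from the question):

```python
import pandas as pd

# Toy frame with two-level columns, similar in shape to a pivot_table result.
toy = pd.DataFrame(
    [[1, 2.0, 3.0]],
    columns=pd.MultiIndex.from_tuples(
        [('ID', ''), ('entries_per_ID', 'down'), ('entries_per_ID', 'up')]
    ),
)

# rename maps the callable over individual labels, so the columns stay two-level.
renamed = toy.rename(columns=str.upper)

# set_axis swaps in a complete new list of labels and returns a new frame;
# toy itself is left unchanged.
flat = toy.set_axis(
    ['_'.join(col).strip() if col[1] else col[0] for col in toy.columns],
    axis=1,
)

print(renamed.columns.nlevels)
print(list(flat.columns))
```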

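In this case, the new names depend on the columns produced by the pipeline, which the original df does not have yet, so wrap set_axis in pipe to get access to the intermediate result: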

df = (df
     .assign(date = lambda x: pd.to_datetime(x.date))
     .assign(entries_per_ID = lambda x: x.groupby('ID').ID.transform('size'))
     .pivot_table(values=['entries_per_ID'], index=['ID','date','code'],
                   columns=['location'], aggfunc=np.max)
     .reset_index()
     .pipe(lambda x: x.set_axis(['_'.join(col).strip() if col[1] else col[0] for col in x.columns], axis=1))
     )

#    ID       date   code  entries_per_ID_down  entries_per_ID_up
# 0   1 2021-10-12  False                  NaN                1.0
# 1   2 2021-10-15  False                  2.0                NaN
# 2   2 2021-10-16  False                  NaN                2.0
# 3   3 2021-10-01  False                  3.0                NaN
# 4   3 2021-10-10   True                  NaN                3.0
# 5   3 2021-10-19  False                  NaN                3.0
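If the list comprehension feels too dense inside the chain, one option is to move it into a small named function and hand that function to pipe. A sketch, assuming a hypothetical helper called flatten_columns (my own name, not a pandas function) and a made-up frame standing in for the result of reset_index():

```python
import pandas as pd


def flatten_columns(frame, sep='_'):
    """Return a new frame whose two-level column labels are joined into strings.

    Hypothetical helper for illustration; not part of pandas.
    Assumes every column label is a tuple of strings.
    """
    new_names = [
        sep.join(col).strip() if col[1] else col[0]
        for col in frame.columns
    ]
    return frame.set_axis(new_names, axis=1)


# Made-up stand-in for the frame produced by pivot_table(...).reset_index().
wide = pd.DataFrame(
    [[1, float('nan'), 1.0], [2, 2.0, float('nan')]],
    columns=pd.MultiIndex.from_tuples(
        [('ID', ''), ('entries_per_ID', 'down'), ('entries_per_ID', 'up')]
    ),
)

# pipe passes the intermediate frame to the helper as its first argument.
result = wide.pipe(flatten_columns)
print(list(result.columns))
```

In the chain above, this would mean ending with .pipe(flatten_columns) instead of the inline lambda.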