Pandas - Rename columns by removing text before delimiter

Time:09-13

Given below are the column names of my DataFrame:

['user_id', 'Week 36~Sep-05 - Sep-11',
 'Week 35~Aug-29 - Sep-04', 'Week 34~Aug-22 - Aug-28']

I would like to remove the text before the delimiter (~), when it is present in the column label, and get the column names below:

['user_id', 'Sep-05 - Sep-11', 'Aug-29 - Sep-04', 'Aug-22 - Aug-28']

I tried the below but it failed

[col.split('~')[1] for col in df.columns]

Error : IndexError: list index out of range

CodePudding user response:

I would use str.replace here:

df.columns = df.columns.str.replace(r'.*~', '', regex=True)

output:

Index(['user_id', 'Sep-05 - Sep-11', 'Aug-29 - Sep-04', 'Aug-22 - Aug-28'], dtype='object')

input:

Index(['user_id', 'Week 36~Sep-05 - Sep-11', 'Week 35~Aug-29 - Sep-04',
       'Week 34~Aug-22 - Aug-28'],
      dtype='object')

Your approach would work with -1 indexing:

df.columns = [col.split('~')[-1] for col in df.columns]
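For reference, both variants can be checked side by side on a small frame built from the column names in the question. This is only a minimal sketch: the empty DataFrame is a stand-in for the real data, and the names via_replace and via_split are made up for illustration.

```python
import pandas as pd

cols = ['user_id', 'Week 36~Sep-05 - Sep-11',
        'Week 35~Aug-29 - Sep-04', 'Week 34~Aug-22 - Aug-28']
df = pd.DataFrame(columns=cols)  # empty stand-in frame; only the labels matter

# Variant 1: the greedy regex removes everything up to and including the last '~'
via_replace = df.columns.str.replace(r'.*~', '', regex=True)

# Variant 2: split on '~' and keep the last piece (the whole label if no '~')
via_split = [col.split('~')[-1] for col in df.columns]

print(list(via_replace))
print(via_split)
```

Both variants keep the text after the last ~, so a label containing more than one ~ would lose everything before the final one.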

CodePudding user response:

This happens because you are also splitting user_id on ~. That label contains no ~, so the split returns a single-element list and there is nothing at index 1. Try this:

df.columns = [col.split('~')[1] if col.startswith('Week') else col for col in df.columns]
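The cause of the error can be seen directly, and the guard can test for the delimiter itself instead of the 'Week' prefix. This is a sketch on a plain list of the labels from the question, assuming that labels without ~ should be left unchanged; new_cols is an illustrative name.

```python
cols = ['user_id', 'Week 36~Sep-05 - Sep-11',
        'Week 35~Aug-29 - Sep-04', 'Week 34~Aug-22 - Aug-28']

# A label without '~' splits into a single-element list, so index 1 does not exist
print('user_id'.split('~'))  # ['user_id']

# Split only when the delimiter is present; maxsplit=1 keeps any later '~' intact
new_cols = [col.split('~', 1)[1] if '~' in col else col for col in cols]
print(new_cols)
```

Checking for '~' in col avoids relying on every delimited label starting with 'Week'.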