Given below are the column names of my DataFrame:
['user_id', 'Week 36~Sep-05 - Sep-11',
'Week 35~Aug-29 - Sep-04', 'Week 34~Aug-22 - Aug-28']
I would like to remove the text before the delimiter (~), where the column label contains one, and get the column names below:
['user_id', 'Sep-05 - Sep-11', 'Aug-29 - Sep-04', 'Aug-22 - Aug-28']
I tried the code below, but it failed:
[col.split('~')[1] for col in df.columns]
Error: IndexError: list index out of range
CodePudding user response:
I would use str.replace here:
df.columns = df.columns.str.replace(r'.*~', '', regex=True)
output:
Index(['user_id', 'Sep-05 - Sep-11', 'Aug-29 - Sep-04', 'Aug-22 - Aug-28'], dtype='object')
input:
Index(['user_id', 'Week 36~Sep-05 - Sep-11', 'Week 35~Aug-29 - Sep-04',
'Week 34~Aug-22 - Aug-28'],
dtype='object')
Your approach would also work with -1 indexing, because the last element of the split always exists:
df.columns = [col.split('~')[-1] for col in df.columns]
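To check both variants side by side, here is a minimal sketch. It assumes an empty DataFrame built only from the column labels in the question; the names regex_cols and split_cols are just for illustration.

```python
import pandas as pd

# Column labels taken from the question; the frame itself is empty
cols = ['user_id', 'Week 36~Sep-05 - Sep-11',
        'Week 35~Aug-29 - Sep-04', 'Week 34~Aug-22 - Aug-28']
df = pd.DataFrame(columns=cols)

# Regex variant: remove everything up to and including the last '~'
regex_cols = df.columns.str.replace(r'.*~', '', regex=True)

# Split variant: [-1] is the last piece, i.e. the whole label when there is no '~'
split_cols = [col.split('~')[-1] for col in df.columns]

print(list(regex_cols))
print(split_cols)
```

Both variants keep the text after the last ~ in a label, so they should agree even if a label happened to contain more than one delimiter.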
CodePudding user response:
This happens because user_id contains no ~, so split('~') returns a one-element list and there is nothing at index 1.
Try this, which only splits the labels that start with 'Week':
df.columns = [col.split('~')[1] if col.startswith('Week') else col for col in df.columns]
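The check above relies on every delimited label starting with 'Week'. If that is not guaranteed, one possible variation (my assumption, not part of the answer above) is to test for the delimiter itself. The sketch below works on a plain list of the labels from the question rather than on a real DataFrame:

```python
# Column labels taken from the question
cols = ['user_id', 'Week 36~Sep-05 - Sep-11',
        'Week 35~Aug-29 - Sep-04', 'Week 34~Aug-22 - Aug-28']

# Why the original attempt fails: no '~' means a one-element list
parts = 'user_id'.split('~')
print(parts)  # ['user_id'], so parts[1] raises IndexError

# Split only when the delimiter is present; maxsplit=1 keeps
# everything after the first '~' in one piece
new_cols = [col.split('~', 1)[1] if '~' in col else col for col in cols]
print(new_cols)
```

Note that this keeps the text after the first ~, which only differs from the -1 indexing approach if a label contains several delimiters.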