I converted two categorical variables, Name and Year, into dummy variables, which left me with 433 columns. Is there a way to strip the "Name_" and "Year_" prefixes from the column names without renaming each one by hand?
Everything I've found so far renames the columns manually. Can this be done the way one would strip certain keywords from a string, or URLs from a block of text?
CodePudding user response:
Using a regex:
df.columns = df.columns.str.replace('^(Name|Year)_', '', regex=True)
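If you want to check what the pattern matches before touching the DataFrame, here is a small sketch that applies the same regex to a plain list with Python's re module. The column names are made up for illustration; only the "Name_" and "Year_" prefixes come from the question.

```python
import re

# Hypothetical column names, similar to what dummy encoding might produce
columns = ["Name_Alice", "Name_Bob", "Year_2019", "Year_2020", "Price", "Full_Name_x"]

# ^ anchors the match at the start, so only a leading "Name_" or "Year_" is removed
pattern = re.compile(r"^(Name|Year)_")
cleaned = [pattern.sub("", c) for c in columns]

print(cleaned)
# ['Alice', 'Bob', '2019', '2020', 'Price', 'Full_Name_x']
```

Because of the ^ anchor, "Full_Name_x" is left alone even though it contains "Name_" in the middle. Columns without either prefix pass through unchanged.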
CodePudding user response:
Might be more concise if you use a regex, but this should work:
out = df.rename(columns=lambda x: x[5:] if x.startswith(("Name_", "Year_")) else x)
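Note that the x[5:] slice only works because both prefixes happen to be five characters long. If your prefixes differ in length, a small helper along these lines might be safer. This is just a sketch: strip_prefix and PREFIXES are names I made up, and the sample columns are hypothetical.

```python
PREFIXES = ("Name_", "Year_")  # the two dummy prefixes from the question

def strip_prefix(name, prefixes=PREFIXES):
    """Return name without the first matching prefix, or unchanged if none match."""
    for prefix in prefixes:
        if name.startswith(prefix):
            # Slice by the prefix's own length instead of a hard-coded 5
            return name[len(prefix):]
    return name

# Hypothetical column names for illustration
columns = ["Name_Alice", "Year_2020", "Price"]
stripped = [strip_prefix(c) for c in columns]

print(stripped)
# ['Alice', '2020', 'Price']
```

Since df.rename accepts a function for columns, you should be able to pass the helper directly: out = df.rename(columns=strip_prefix). One thing to watch for: if a Name value and a Year value are identical after stripping, you will end up with duplicate column names.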