I want to replace some characters in column names at scale. I was able to do it on the columns with str.replace(). However, I would like to know whether I could do this through lambda functions, so that I can fold it into my other pandas workflow instead of running it as a separate step.
dat.columns = (
dat.columns
.str.replace(r"park_1_city", "us1state")
.str.replace(r"park_2_city", "us2state")
.str.replace(r"park_3_city", "us3state")
.str.replace(r"us1tree", "us1garden")
.str.replace(r"us2tree", "us2garden")
.str.replace(r"us3tree", "us3garden")
)
CodePudding user response:
Simply do:
your_function = lambda col: col # Or whatever you would like to do with the names
dat.columns = [your_function(col) for col in dat.columns]
You can also use any normal function, instead of a lambda, of course.
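To keep the renaming inside a method chain, as the question asks, one option is to pass the callable to DataFrame.rename. The following is a minimal sketch under assumptions: the names replacements and rename_col are hypothetical, and the sample frame only reuses a few column names from the question.

```python
import pandas as pd

# Hypothetical mapping of literal substrings, taken from the question.
replacements = {
    "park_1_city": "us1state",
    "park_2_city": "us2state",
    "park_3_city": "us3state",
    "us1tree": "us1garden",
    "us2tree": "us2garden",
    "us3tree": "us3garden",
}

def rename_col(col):
    # Apply every literal replacement in turn to a single column name.
    for old, new in replacements.items():
        col = col.replace(old, new)
    return col

# Small illustrative frame; "other" shows that unmatched names pass through.
dat = pd.DataFrame(columns=["park_1_city", "us2tree", "other"])

# rename accepts a callable, so this step can sit inside a longer chain.
result = dat.rename(columns=lambda col: rename_col(col))
print(list(result.columns))
```

Because rename returns a new DataFrame, it can be followed by further chained calls such as assign or query without a separate assignment to dat.columns.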
CodePudding user response:
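Use a dictionary to replace the substrings. Here (\d+) captures one or more digits and \1 inserts the captured value into the replacement. The columns are converted with to_series() because Series.replace accepts a dictionary of patterns:
import pandas as pd

dat = pd.DataFrame(columns=['park_1_city','park_2_city','park_3_city',
                            'us1tree','us2tree','us30tree'])
# (\d+) captures the digits; \1 puts them back in the new name
d = {r"park_(\d+)_city": r"us\1state", r"us(\d+)tree": r"us\1garden"}
dat.columns = dat.columns.to_series().replace(d, regex=True)
print(dat)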
Empty DataFrame
Columns: [us1state, us2state, us3state, us1garden, us2garden, us30garden]
Index: []
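The regex approach can also be wrapped in a function and passed to DataFrame.rename, which may fit better into a chained workflow. This is only a sketch under assumptions: regex_rename and patterns are hypothetical names, the patterns are written with \d+ (one or more digits), and the sample columns are illustrative.

```python
import re
import pandas as pd

# Patterns in the spirit of this answer; (\d+) captures the digits and
# \1 reinserts them in the replacement.
patterns = {
    r"park_(\d+)_city": r"us\1state",
    r"us(\d+)tree": r"us\1garden",
}

def regex_rename(col):
    # Run each regex substitution over one column name.
    for pattern, repl in patterns.items():
        col = re.sub(pattern, repl, col)
    return col

dat = pd.DataFrame(columns=["park_1_city", "us30tree", "id"])

# rename takes the function directly, so no lambda wrapper is required.
renamed = dat.rename(columns=regex_rename)
print(list(renamed.columns))
```

A plain function and a lambda behave the same here; the function form is just easier to reuse and test on its own.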