I have column names such as:
ab_sells_cde_104_frm, ab_sells_cde_105_frm, ab_sells_cde_120_frm
I also have a dict like:
dict_2 = {'115': 'SERVICES',
'116': 'HEA',
'117': 'HOME',
'104': 'AI'}
I want to replace the three digits in the column headers using this dict, but I am unsure how. I have tried:
df.replace(dict_2, regex=True)
This doesn't do anything. How can I do this?
CodePudding user response:
You can use a simple list comprehension with re.sub:
import re
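# Replace each run of digits with its dict value, keeping it as-is if there is no match in the dict
df.columns = [re.sub(r'\d+', lambda m: dict_2.get(m.group(0), m.group(0)), s)
              for s in df.columns]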
output:
ab_sells_cde_AI_frm ab_sells_cde_105_frm ab_sells_cde_120_frm
0 NA NA NA
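As for why the original attempt appeared to do nothing: DataFrame.replace works on the cell values and returns a new DataFrame, so the column labels are never touched. A minimal sketch of the difference, assuming a small frame with one made-up row of values; the helper name swap_code and the use of df.rename with a function are illustrative choices, not something from the question:

```python
import re
import pandas as pd

dict_2 = {'115': 'SERVICES', '116': 'HEA', '117': 'HOME', '104': 'AI'}
cols = ['ab_sells_cde_104_frm', 'ab_sells_cde_105_frm', 'ab_sells_cde_120_frm']

# One made-up row of string values, only there to show what replace() acts on
df = pd.DataFrame([['104', 'x', 'y']], columns=cols)

# replace() rewrites cell values and returns a new frame; the headers stay the same
replaced = df.replace(dict_2, regex=True)

def swap_code(name):
    # Hypothetical helper: swap each digit run for its dict value, if there is one
    return re.sub(r'\d+', lambda m: dict_2.get(m.group(0), m.group(0)), name)

# rename() with a function is applied to every column label
renamed = df.rename(columns=swap_code)
```

Here replaced still has the original headers, while renamed has ab_sells_cde_AI_frm as its first column.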
CodePudding user response:
Use Index.to_series to convert the columns to a Series, then Series.replace:
import pandas as pd

c = 'ab_sells_cde_104_frm, ab_sells_cde_105_frm, ab_sells_cde_120_frm'.split(', ')
df = pd.DataFrame(columns=c)
dict_2 = {'115': 'SERVICES',
'116': 'HEA',
'117': 'HOME',
'104': 'AI'}
df.columns = df.columns.to_series().replace(dict_2, regex=True)
print(df)
Empty DataFrame
Columns: [ab_sells_cde_AI_frm, ab_sells_cde_105_frm, ab_sells_cde_120_frm]
Index: []
Or, if you need to replace via a callback, use str.replace:
df.columns = df.columns.str.replace(r'\d+', lambda x: dict_2.get(x.group(0), x.group(0)), regex=True)
print(df)
Empty DataFrame
Columns: [ab_sells_cde_AI_frm, ab_sells_cde_105_frm, ab_sells_cde_120_frm]
Index: []
To replace only integers of length 3:
df.columns = df.columns.str.replace(r'\d{3}', lambda x: dict_2.get(x.group(0), x.group(0)), regex=True)
print(df)
Empty DataFrame
Columns: [ab_sells_cde_AI_frm, ab_sells_cde_105_frm, ab_sells_cde_120_frm]
Index: []
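One caveat with a bare \d{3} pattern: it also matches the first three digits of a longer number. If that matters, a possible sketch is to add lookarounds so that only a standalone three-digit run is matched. The column name containing 1045 and the names CODE and swap_code are made up for illustration:

```python
import re

dict_2 = {'115': 'SERVICES', '116': 'HEA', '117': 'HOME', '104': 'AI'}

# Exactly three digits, with no digit directly before or after them
CODE = re.compile(r'(?<!\d)\d{3}(?!\d)')

def swap_code(name):
    # Hypothetical helper: replace a standalone 3-digit code if it is in the dict
    return CODE.sub(lambda m: dict_2.get(m.group(0), m.group(0)), name)

# The second name is invented to show the difference between the two patterns
names = ['ab_sells_cde_104_frm', 'ab_sells_cde_1045_frm', 'ab_sells_cde_105_frm']

strict = [swap_code(n) for n in names]
loose = [re.sub(r'\d{3}', lambda m: dict_2.get(m.group(0), m.group(0)), n)
         for n in names]
```

With these inputs, loose turns the invented name into ab_sells_cde_AI5_frm, while strict leaves it unchanged. The same compiled pattern can be passed to df.columns.str.replace or used in the list comprehension from the first answer.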