How can I replace part of a column name with a dict in Python?

I have column names such as:

ab_sells_cde_104_frm, ab_sells_cde_105_frm, ab_sells_cde_120_frm

I also have a dict like:

dict_2 = {'115': 'SERVICES',
 '116': 'HEA',
 '117': 'HOME',
 '104': 'AI'}

I wish to replace the 3 digits in the column headers using this dict. I am unsure how; I have tried:

df.replace(dict_2, regex=True)

This doesn't do anything. How can I do this?

CodePudding user response：

You can use a simple list comprehension with re.sub:

import re
df.columns = [re.sub(r'\d+', lambda m: dict_2.get(m.group(0), m.group(0)), s)
              for s in df.columns]

output:

  ab_sells_cde_AI_frm ab_sells_cde_105_frm ab_sells_cde_120_frm
0                  NA                   NA                   NA
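If you would rather not assign to df.columns directly, the same re.sub callback can probably be passed to DataFrame.rename, which returns a new frame. This is only a minimal sketch: the helper name map_code and the one-row sample frame are made up for illustration.

```python
import re
import pandas as pd

dict_2 = {'115': 'SERVICES', '116': 'HEA', '117': 'HOME', '104': 'AI'}

# Hypothetical sample frame using the column names from the question.
df = pd.DataFrame(
    [[1, 2, 3]],
    columns=['ab_sells_cde_104_frm', 'ab_sells_cde_105_frm', 'ab_sells_cde_120_frm'],
)

def map_code(name):
    # Swap each run of digits for its dict value; leave unknown codes as they are.
    return re.sub(r'\d+', lambda m: dict_2.get(m.group(0), m.group(0)), name)

# rename calls map_code on every column label and returns a new DataFrame.
renamed = df.rename(columns=map_code)
print(list(renamed.columns))
```

Codes that are missing from the dict (105 and 120 here) are left unchanged, and df itself is not modified.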

CodePudding user response：

Use Index.to_series to convert the columns to a Series, then Series.replace:

import pandas as pd

c = 'ab_sells_cde_104_frm, ab_sells_cde_105_frm, ab_sells_cde_120_frm'.split(', ')
df = pd.DataFrame(columns=c)

dict_2 = {'115': 'SERVICES',
 '116': 'HEA',
 '117': 'HOME',
 '104': 'AI'} 

df.columns = df.columns.to_series().replace(dict_2, regex=True)
print (df)
Empty DataFrame
Columns: [ab_sells_cde_AI_frm, ab_sells_cde_105_frm, ab_sells_cde_120_frm]
Index: []
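A likely reason the original df.replace call seemed to do nothing is that DataFrame.replace works on the cell values and not on the column labels. The sketch below illustrates this; the one-cell frame and the value 'x104' are invented for the demonstration.

```python
import pandas as pd

dict_2 = {'104': 'AI'}

# Hypothetical frame: the code 104 appears in the label and in the cell value.
df = pd.DataFrame({'ab_sells_cde_104_frm': ['x104']})

# DataFrame.replace rewrites the values; the column label is not touched.
replaced = df.replace(dict_2, regex=True)

print(list(replaced.columns))
print(replaced.iloc[0, 0])
```

Converting the columns with to_series turns the labels into values, which is why Series.replace can then act on them.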

Or, if you need to replace via a callback, use str.replace:

df.columns = df.columns.str.replace(r'\d+', lambda x: dict_2.get(x.group(0), x.group(0)), regex=True)

print (df)
Empty DataFrame
Columns: [ab_sells_cde_AI_frm, ab_sells_cde_105_frm, ab_sells_cde_120_frm]
Index: []

To replace only integers of length 3:

df.columns = df.columns.str.replace(r'\d{3}', lambda x: dict_2.get(x.group(0), x.group(0)), regex=True)

print (df)
Empty DataFrame
Columns: [ab_sells_cde_AI_frm, ab_sells_cde_105_frm, ab_sells_cde_120_frm]
Index: []
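One caveat with a three-digit pattern: it also matches the first three digits of a longer number. If that could occur in your headers, lookarounds are one way to require exactly three digits. A rough sketch using plain re follows; the name ab_sells_cde_1045_frm and the helper swap are hypothetical, added only to show the difference.

```python
import re

dict_2 = {'115': 'SERVICES', '116': 'HEA', '117': 'HOME', '104': 'AI'}

def swap(m):
    # Return the mapped value, or the matched digits if the code is unknown.
    return dict_2.get(m.group(0), m.group(0))

# The second name is a made-up example containing a four-digit number.
names = ['ab_sells_cde_104_frm', 'ab_sells_cde_1045_frm']

# Unanchored pattern: matches '104' inside '1045' as well.
plain = [re.sub(r'\d{3}', swap, s) for s in names]

# Lookarounds: the three digits must not touch another digit on either side.
strict = [re.sub(r'(?<!\d)\d{3}(?!\d)', swap, s) for s in names]

print(plain)
print(strict)
```

The same anchored pattern should also be usable as the first argument to df.columns.str.replace with regex=True.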