Home > Back-end >  Renaming some part of columns of dataframe with values from another dataframe
Renaming some part of columns of dataframe with values from another dataframe

Time:02-26

I want to change the column names from another DataFrame.

There are some similar questions in stackoverflow, but I need advanced version of it.

data1 = {
  "ABC-123_afd": [420, 380, 390],
  "LFK-402_ote": [50, 40, 45],
  "BPM-299_qbm": [50, 40, 45],
}

data2 = {
  "ID": ['ABC-123', 'LFK-402', 'BPM-299'],
  "NewID": ['IQU', 'EUW', 'NMS']
}

data1_df=pd.DataFrame(data1)
#   ABC-123_afd  LFK-402_ote  BPM-299_qbm
#0          420           50           50
#1          380           40           40
#2          390           45           45

data2_df=pd.DataFrame(data2)
#        ID NewID
#0  ABC-123   IQU
#1  LFK-402   EUW
#2  BPM-299   NMS

I want to make the final result as below:

data_final_df
#   IQU_afd  EUW_ote  NMS_qbm
#0      420       50       50
#1      380       40       40
#2      390       45       45

I tried the code in Renaming columns of dataframe with values from another dataframe.

It ran without error, but there were no changes. I think column names in data 1 are not perfectly matched to the value in the data2 value.

How can I change some part of the column name from another pandas DataFrame?

CodePudding user response:

We could create a mapping from "ID" to "NewID" and use it to modify column names:

mapping = dict(zip(data2['ID'], data2['NewID']))
data1_df.columns = [mapping[x]   '_'   y for x, y in data1_df.columns.str.split('_')]
print(data1_df)

or

s = data1_df.columns.str.split('_')
data1_df.columns = s.str[0].map(mapping)   '_'   s.str[1]

or use the DataFrame data2_df:

s = data1_df.columns.str.split('_')
data1_df.columns = s.str[0].map(data2_df.set_index('ID')['NewID'])   '_'   s.str[1]

Output:

   IQU_afd  EUW_ote  NMS_qbm
0      420       50       50
1      380       40       40
2      390       45       45

CodePudding user response:

One option is to use replace:

mapping = dict(zip(data2['ID'], data2['NewID']))
s = pd.Series(data1_df.columns)
data1_df.columns = s.replace(regex = mapping)
data1_df

   IQU_afd  EUW_ote  NMS_qbm
0      420       50       50
1      380       40       40
2      390       45       45
  • Related