I want to change the column names from another DataFrame.
There are some similar questions in stackoverflow, but I need advanced version of it.
data1 = {
"ABC-123_afd": [420, 380, 390],
"LFK-402_ote": [50, 40, 45],
"BPM-299_qbm": [50, 40, 45],
}
data2 = {
"ID": ['ABC-123', 'LFK-402', 'BPM-299'],
"NewID": ['IQU', 'EUW', 'NMS']
}
data1_df=pd.DataFrame(data1)
# ABC-123_afd LFK-402_ote BPM-299_qbm
#0 420 50 50
#1 380 40 40
#2 390 45 45
data2_df=pd.DataFrame(data2)
# ID NewID
#0 ABC-123 IQU
#1 LFK-402 EUW
#2 BPM-299 NMS
I want to make the final result as below:
data_final_df
# IQU_afd EUW_ote NMS_qbm
#0 420 50 50
#1 380 40 40
#2 390 45 45
I tried the code in Renaming columns of dataframe with values from another dataframe.
It ran without error, but there were no changes. I think column names in data 1 are not perfectly matched to the value in the data2 value.
How can I change some part of the column name from another pandas DataFrame?
CodePudding user response:
We could create a mapping from "ID" to "NewID" and use it to modify column names:
mapping = dict(zip(data2['ID'], data2['NewID']))
data1_df.columns = [mapping[x] '_' y for x, y in data1_df.columns.str.split('_')]
print(data1_df)
or
s = data1_df.columns.str.split('_')
data1_df.columns = s.str[0].map(mapping) '_' s.str[1]
or use the DataFrame data2_df
:
s = data1_df.columns.str.split('_')
data1_df.columns = s.str[0].map(data2_df.set_index('ID')['NewID']) '_' s.str[1]
Output:
IQU_afd EUW_ote NMS_qbm
0 420 50 50
1 380 40 40
2 390 45 45
CodePudding user response:
One option is to use replace
:
mapping = dict(zip(data2['ID'], data2['NewID']))
s = pd.Series(data1_df.columns)
data1_df.columns = s.replace(regex = mapping)
data1_df
IQU_afd EUW_ote NMS_qbm
0 420 50 50
1 380 40 40
2 390 45 45