I have two dataframe like:
import pandas as pd
df1 = pd.DataFrame({'Airplanes' : ['U-2','B-52,P-51', 'F-16', 'MiG-21,F-16;A-10', 'P-51','A-10;P-51' ],
'Company' : ['Air_1', 'Air_3', 'Air_2','Air_1', 'Air_7', 'Air_3']})
------------------------------
Airplanes Company
0 U-2 Air_1
1 B-52,P-51 Air_3
2 F-16 Air_2
3 MiG-21,F-16;A-10 Air_1
4 P-51 Air_7
5 A-10;P-51 Air_3
-------------------------------
df2 = pd.DataFrame({'Model' : ['U-2','B-52', 'F-16', 'MiG-21', 'P-51','A-10' ],
'Description' : ['Strong', 'Huge', 'Quick','Light', 'Silent', 'Comfortable']})
------------------------------
Model Description
0 U-2 Strong
1 B-52 Huge
2 F-16 Quick
3 MiG-21 Light
4 P-51 Silent
5 A-10 Comfortable
------------------------------
I would like to insert the information of df2 inside df1. In particular, the Description column must appear in df1, respecting the separators of the df1 column ['Airplanes']. So in this case the output should be:
---------------------------------------------------------
Airplanes Company Description
0 U-2 Air_1 Srong
1 B-52,P-51 Air_3 Huge,Silent
2 F-16 Air_2 Quick
3 MiG-21,F-16;A-10 Air_1 Light,Quick;Comfortable
4 P-51 Air_7 Silent
5 A-10;P-51 Air_3 Comfortable;Silent
--------------------------------------------------------
How can I do?
CodePudding user response:
You can use a regex and str.replace
:
mapper = df2.set_index('Model')['Description'].to_dict()
regex = '|'.join(df2['Model'])
# 'U-2|B-52|F-16|MiG-21|P-51|A-10'
df1['Description'] = df1['Airplanes'].str.replace(regex, lambda m: mapper.get(m.group()))
output:
Airplanes Company Description
0 U-2 Air_1 Strong
1 B-52,P-51 Air_3 Huge,Silent
2 F-16 Air_2 Quick
3 MiG-21,F-16;A-10 Air_1 Light,Quick;Comfortable
4 P-51 Air_7 Silent
5 A-10;P-51 Air_3 Comfortable;Silent
CodePudding user response:
You could do split
with explode
then map
df1['new'] = df1.Airplanes.str.split('[,|;]').explode().map(df2.set_index('Model')['Description']).groupby(level=0).agg(','.join)
df1
Out[62]:
Airplanes Company new
0 U-2 Air_1 Strong
1 B-52,P-51 Air_3 Huge,Silent
2 F-16 Air_2 Quick
3 MiG-21,F-16;A-10 Air_1 Light,Quick,Comfortable
4 P-51 Air_7 Silent
5 A-10;P-51 Air_3 Comfortable,Silent
CodePudding user response:
d=dict(df2.values)
df1['description']=df1['Airplanes'].str.split('[,|;]').apply(lambda x: ','.join([d[i] for i in x]))
df1
Airplanes Company description
0 U-2 Air_1 Strong
1 B-52,P-51 Air_3 Huge,Silent
2 F-16 Air_2 Quick
3 MiG-21,F-16,A-10 Air_1 Light,Quick,Comfortable
4 P-51 Air_7 Silent
5 A-10,P-51 Air_3 Comfortable,Silent```