Home > Software engineering >  Join two df and their columns
Join two df and their columns

Time:10-21

I have two dataframe like:

import pandas as pd

df1 = pd.DataFrame({'Airplanes' : ['U-2','B-52,P-51', 'F-16', 'MiG-21,F-16;A-10', 'P-51','A-10;P-51' ],
                    'Company' : ['Air_1', 'Air_3', 'Air_2','Air_1', 'Air_7', 'Air_3']})
------------------------------
          Airplanes Company
0               U-2   Air_1
1         B-52,P-51   Air_3
2              F-16   Air_2
3  MiG-21,F-16;A-10   Air_1
4              P-51   Air_7
5         A-10;P-51   Air_3
-------------------------------

df2 = pd.DataFrame({'Model' : ['U-2','B-52', 'F-16', 'MiG-21', 'P-51','A-10' ],
                    'Description' : ['Strong', 'Huge', 'Quick','Light', 'Silent', 'Comfortable']})
------------------------------
     Model  Description
0     U-2       Strong
1    B-52         Huge
2    F-16        Quick
3  MiG-21        Light
4    P-51       Silent
5    A-10  Comfortable
------------------------------

I would like to insert the information of df2 inside df1. In particular, the Description column must appear in df1, respecting the separators of the df1 column ['Airplanes']. So in this case the output should be:

---------------------------------------------------------
          Airplanes Company                 Description
0               U-2   Air_1                       Srong
1         B-52,P-51   Air_3                 Huge,Silent
2              F-16   Air_2                       Quick
3  MiG-21,F-16;A-10   Air_1     Light,Quick;Comfortable
4              P-51   Air_7                      Silent
5         A-10;P-51   Air_3          Comfortable;Silent
--------------------------------------------------------

How can I do?

CodePudding user response:

You can use a regex and str.replace:

mapper = df2.set_index('Model')['Description'].to_dict()
regex = '|'.join(df2['Model'])
# 'U-2|B-52|F-16|MiG-21|P-51|A-10'

df1['Description'] = df1['Airplanes'].str.replace(regex, lambda m: mapper.get(m.group()))

output:

          Airplanes Company              Description
0               U-2   Air_1                   Strong
1         B-52,P-51   Air_3              Huge,Silent
2              F-16   Air_2                    Quick
3  MiG-21,F-16;A-10   Air_1  Light,Quick;Comfortable
4              P-51   Air_7                   Silent
5         A-10;P-51   Air_3       Comfortable;Silent

CodePudding user response:

You could do split with explode then map

df1['new'] = df1.Airplanes.str.split('[,|;]').explode().map(df2.set_index('Model')['Description']).groupby(level=0).agg(','.join)
df1
Out[62]: 
          Airplanes Company                      new
0               U-2   Air_1                   Strong
1         B-52,P-51   Air_3              Huge,Silent
2              F-16   Air_2                    Quick
3  MiG-21,F-16;A-10   Air_1  Light,Quick,Comfortable
4              P-51   Air_7                   Silent
5         A-10;P-51   Air_3       Comfortable,Silent

CodePudding user response:

d=dict(df2.values)

df1['description']=df1['Airplanes'].str.split('[,|;]').apply(lambda x: ','.join([d[i] for i in x]))
df1

    Airplanes   Company     description
0   U-2               Air_1     Strong
1   B-52,P-51         Air_3     Huge,Silent
2   F-16              Air_2     Quick
3   MiG-21,F-16,A-10  Air_1     Light,Quick,Comfortable
4   P-51              Air_7     Silent
5   A-10,P-51         Air_3     Comfortable,Silent```
  • Related