I have 3 dataframes (df1, df2 & df3), the main one (df1) and two additional ones which contain 1 column amongst others that I want to bring over to the main dataframe.
Sample dfs:
df1 = {'Col_f': ['Georgia', 'Nevada', 'New York', 'Texas', 'Arizona'],
'Col_g': ['SUV', 'Coupe', 'Wagon', 'Crossover', 'Sedan']}
df1 = pd.DataFrame(df1)
df2 = {'Col_g': ['SUV', '4x4', 'Wagon', 'Truck', 'Sedan'],
'Objective': ['15%', '13%', '55%', '40.4%', '2.48%']}
df2 = pd.DataFrame(df2)
df3 = {'Col_f': ['Georgia', 'California', 'Pennsylvania', 'Texas', 'Arizona'],
'Objective': ['15%', '13%', '55%', '40.4%', '2.48%']}
df3 = pd.DataFrame(df3)
I am using the following code:
df1_new = pd.merge(df1, df2, on = 'Col_g', how = 'left')
Which returns the following df:
df_new = {'Col_f': ['Georgia', 'California', 'Pennsylvania', 'Texas', 'Arizona'],
'Col_g': ['SUV', 'Coupe', 'Wagon', 'Crossover', 'Sedan'],
'Objective': ['15%', ' ', '55%', ' ', '2.48%']}
Then for the two empty strings for the "Objective" column I want to continue with a second merge (or an excel vlookup which I guess are the same), to fill in those empty ones.
Code that I thought would be used for the second merge:
df1_newer = pd.merge(df_new, df3, on = 'Col_f', how = 'left')
Desired final output.
df_newer = {'Col_f': ['Georgia', 'California', 'Pennsylvania', 'Texas', 'Arizona'],
'Col_g': ['SUV', '4x4', 'Wagon', 'Truck', 'Sedan'],
'Objective': ['15%', '13%', '55%', '40.4%', '2.48%']}
Any suggestions would be more than appreciated!
CodePudding user response:
After merging the three dataframes, you can use mask
or np.where
to conditionally assign value
df1_new = pd.merge(df1, df2, on='Col_g', how='left')['Objective']
df1_newer = pd.merge(df1, df3, on='Col_f', how='left')['Objective']
print(df1_new)
0 NaN
1 NaN
2 55%
3 NaN
4 2.48%
Name: Objective, dtype: object
print(df1_newer)
0 15%
1 NaN
2 NaN
3 40.4%
4 2.48%
Name: Objective, dtype: object
df1['Objective'] = df1_new.mask(df1_new.isna(), df1_newer)
#or
import numpy as np
df1['Objective'] = np.where(df1_new.isna(), df1_newer, df1_new)
print(df1)
Col_f Col_g Objective
0 Georgia Minivan 15%
1 Nevada Coupe NaN
2 New York Wagon 55%
3 Texas Crossover 40.4%
4 Arizona Sedan 2.48%