I have two Dataframes, and would like to create a new column in DataFrame 1 based on DataFrame 2 values.
But I dont want to join the two dataframes per say and make one big dataframe, but rather use the second Dataframe simply as a look-up.
#Main Dataframe:
df1 = pd.DataFrame({'Size':["Big", "Medium", "Small"], 'Sold_Quantity':[10, 6, 40]})
#Lookup Dataframe
df2 = pd.DataFrame({'Size':["Big", "Medium", "Small"], 'Sold_Quantiy_Score_Mean':[10, 20, 30]})
#Create column in Dataframe 1 based on lookup dataframe values:
df1['New_Column'] = when df1['Size'] = df2['Size'] and df1['Sold_Quantity'] < df2['Sold_Quantiy_Score_Mean'] then 'Below Average Sales' else 'Above Average Sales!' end
CodePudding user response:
One approach, is to use np.where
:
import pandas as pd
import numpy as np
df1 = pd.DataFrame({'Size': ["Big", "Medium", "Small"], 'Sold_Quantity': [10, 6, 40]})
df2 = pd.DataFrame({'Size': ["Big", "Medium", "Small"], 'Sold_Quantiy_Score_Mean': [10, 20, 30]})
condition = (df1['Size'] == df2['Size']) & (df1['Sold_Quantity'] < df2['Sold_Quantiy_Score_Mean'])
df1['New_Column'] = np.where(condition, 'Below Average Sales', 'Above Average Sales!')
print(df1)
Output
Size Sold_Quantity New_Column
0 Big 10 Above Average Sales!
1 Medium 6 Below Average Sales
2 Small 40 Above Average Sales!
CodePudding user response:
Given that df2
is sort of like a lookup based on Size, it would make sense if your Size column was its index:
import pandas as pd
import numpy as np
df1 = pd.DataFrame({'Size': ["Big", "Medium", "Small"], 'Sold_Quantity': [10, 6, 40]})
df2 = pd.DataFrame({'Size': ["Big", "Medium", "Small"], 'Sold_Quantiy_Score_Mean': [10, 20, 30]})
lookup = df2.set_index("Size")
You can then map the Sizes in df1
to their mean and compare each with the sold quantity:
is_below_mean = df1["Sold_Quantity"] < df1["Size"].map(lookup["Sold_Quantiy_Score_Mean"])
and finally map the boolean values to the respective strings using np.where
df1["New_Column"] = np.where(is_below_mean, 'Below Average Sales', 'Above Average Sales!')
df1:
Size Sold_Quantity New_Column
0 Big 10 Above Average Sales!
1 Medium 6 Below Average Sales
2 Small 40 Above Average Sales!