I have 2 dataframes:
df1
Time Apples Pears Grapes Peachs
10:00 3 5 5 2
11:00 1 0 2 9
12:00 20 2 7 3
df2
Class Item Factor
A Apples 3
A Peaches 2
A Pears 5
B NaN 4
I want to create a new column in df2 called Total
which item at 10:00 in df1, multiplied by the Factor for that item - this should only be done if they are in Class A.
This is how the final df should look
df2
Class Item Factor Total
A Apples 3 9
A Peaches 2 4
A Pears 5 25
B NaN 4
This is what I tried:
df2['Total'] = df1.setIndex().cols.isin((df2.Item) and (df2.Class==A)) * df2.Factor
CodePudding user response:
IIUC this will get you the desired output (This does not include the np.nan from df2 where it == b, but I don't think you wanted that)
df_melt = df1.melt(id_vars = ['Time'])
df_melt.columns = ['Time', 'Item', 'Count']
df2 = df2.loc[df2['Class'] == 'A']
df_merge = pd.merge(df2, df_melt)
df_merge['Total'] = df_merge['Factor'] * df_merge['Count']
df_merge
CodePudding user response:
You could use df1 as a lookup table:
df2['Total'] = df2[df2['Class']=='A'].apply(
lambda x, **kwd: df1.set_index('Time').loc[kwd['tgt_time'],x['Item']]*x['Factor'],
axis=1, tgt_time='10:00')
Class Item Factor Total
0 A Apples 3 9.0
1 A Peaches 2 4.0
2 A Pears 5 25.0
3 B NaN 4 NaN
(To make this work I had to change the column name from Peachs to Peaches in df1)