Home > Software engineering >  Create new column based on other columns from a different dataframe
Create new column based on other columns from a different dataframe

Time:06-02

I have 2 dataframes:

df1

Time   Apples   Pears   Grapes   Peachs
10:00    3       5        5        2
11:00    1       0        2        9
12:00    20      2        7        3

df2

Class   Item   Factor  
A       Apples   3
A       Peaches  2
A       Pears    5
B       NaN      4

I want to create a new column in df2 called Total which item at 10:00 in df1, multiplied by the Factor for that item - this should only be done if they are in Class A.

This is how the final df should look

df2

Class   Item   Factor   Total 
A       Apples   3        9
A       Peaches  2        4
A       Pears    5        25
B       NaN      4

This is what I tried:

df2['Total'] = df1.setIndex().cols.isin((df2.Item) and (df2.Class==A)) * df2.Factor

CodePudding user response:

IIUC this will get you the desired output (This does not include the np.nan from df2 where it == b, but I don't think you wanted that)

df_melt = df1.melt(id_vars = ['Time'])
df_melt.columns = ['Time', 'Item', 'Count']
df2 = df2.loc[df2['Class'] == 'A']
df_merge = pd.merge(df2, df_melt)
df_merge['Total'] = df_merge['Factor'] * df_merge['Count']
df_merge

CodePudding user response:

You could use df1 as a lookup table:

df2['Total'] = df2[df2['Class']=='A'].apply( 
      lambda x, **kwd: df1.set_index('Time').loc[kwd['tgt_time'],x['Item']]*x['Factor'],
          axis=1, tgt_time='10:00')

  Class     Item  Factor  Total
0     A   Apples       3    9.0
1     A  Peaches       2    4.0
2     A    Pears       5   25.0
3     B      NaN       4    NaN

(To make this work I had to change the column name from Peachs to Peaches in df1)

  • Related