Home > Net >  Pandas: Merge two dataframe with number of times the value in a column
Pandas: Merge two dataframe with number of times the value in a column

Time:02-02

have got dataframe df

Aisle   Table_no    Table_Bit   
11          2           1
11          2           2
11          2           3
11          3           1
11          3           2
11          3           3
14          2           1
14          2           2
14          2           3

and another spc_df

Aisle   Table_no    Item    Item_time   Space
11      2           Mango       2       0.25
11      2           Lemon       1       0.125
11      3           Apple       3       0.75
14      2           Orange      1       0.125
14      2           Melon       2       0.25

need to add columns spc_df['Item'] and spc_df['Space'] to dataframe df with number of times the value in spc_df['Item_time'] as given in expected output. Additional note(may/maynot used for logic): Sum of Item_time for every Aisle-Table_no will be max of Table_Bit for that Aisle-Table_no combination.

Expected Output:

Aisle   Table_no    Table_Bit   Item    Space
11          2           1       Mango   0.25        
11          2           2       Mango   0.25                
11          2           3       Lemon   0.125       
11          3           1       Apple   0.75
11          3           2       Apple   0.75
11          3           3       Apple   0.75
14          2           1       Orange  0.125
14          2           2       Melon   0.25
14          2           3       Melon   0.25

CodePudding user response:

First repeat values in spc_df by Item_time with add counter column by GroupBy.cumcount, so possible left join by original df if Table_Bit is counter column starting by 1 per groups:

df2 = (spc_df.loc[spc_df.index.repeat(spc_df['Item_time'])]
           .assign(Table_Bit = lambda x: x.groupby(['Aisle','Table_no']).cumcount().add(1)))

df = df.merge(df2, how='left')
print (df)
   Aisle  Table_no  Table_Bit    Item  Item_time  Space
0     11         2          1   Mango          2  0.250
1     11         2          2   Mango          2  0.250
2     11         2          3   Lemon          1  0.125
3     11         3          1   Apple          3  0.750
4     11         3          2   Apple          3  0.750
5     11         3          3   Apple          3  0.750
6     14         2          1  Orange          1  0.125
7     14         2          2   Melon          2  0.250
8     14         2          3   Melon          2  0.250

If not Table_ID is counter column create helper column new:

df2 = (spc_df.loc[spc_df.index.repeat(spc_df['Item_time'])]
           .assign(new= lambda x: x.groupby(['Aisle','Table_no']).cumcount()))

df = (df.assign(new = df.groupby(['Aisle','Table_no']).cumcount())
        .merge(df2, how='left')
        .drop('new', axis=1))
print (df)
   Aisle  Table_no  Table_Bit    Item  Item_time  Space
0     11         2          1   Mango          2  0.250
1     11         2          2   Mango          2  0.250
2     11         2          3   Lemon          1  0.125
3     11         3          1   Apple          3  0.750
4     11         3          2   Apple          3  0.750
5     11         3          3   Apple          3  0.750
6     14         2          1  Orange          1  0.125
7     14         2          2   Melon          2  0.250
8     14         2          3   Melon          2  0.250
  • Related