Home > Software design >  How can I merge aggregate two dataframes in Pandas while subtracting column values?
How can I merge aggregate two dataframes in Pandas while subtracting column values?

Time:06-28

I'm working on a rudimentary inventory system and am having trouble finding a solution to this obstacle. I've got two Pandas dataframes, both sharing two columns: PLU and QTY. PLU acts as an item identifier, and QTY is the quantity of the item in one dataframe, while being the quantity sold in another. Here are two very simple examples of what the data looks like:

final_purch:

PLU               QTY
12345678          12
90123456          7
78901234          2

pmix_diff:

PLU               QTY
12345678          9
90123456          3
78901234          1

In this case, I'd want to find any matching PLUs and subtract the pmix_df QTY from the final_purch QTY.

In an earlier part of the project, I used aggregate functions to get rid of duplicates while summing the QTY column. It worked great, but I can't find a way to do something similar here with subtraction. I'm fairly new to Python/Pandas, so any help is greatly appreciated. :)

CodePudding user response:

here is one way to do that Using assign and merge

df.assign(QTY = df['QTY'] - df.merge(df2, on='PLU', suffixes=('','_y'), how='left')['QTY_y'].fillna(0))  
    PLU     QTY
0   12345678    3
1   90123456    4
2   78901234    1

CodePudding user response:

You may do:

df = final_purch.set_index('PLU').join(pmix_df.set_index('PLU'), lsuffix='final', rsuffix='pmix')

df['QTYdiff'] = df['QTYfinal']-df['QTYpmix']

output:

          QTYfinal  QTYpmix  QTYdiff
PLU                                 
12345678        12        9        3
90123456         7        3        4
78901234         2        1        1
          
  • Related