Home > OS >  Substracting values from two different pandas dataframes based on same row values
Substracting values from two different pandas dataframes based on same row values

Time:09-24

I have data of two dataframes First dataframe:

Name    price  date
Apple    25   23-9-2021
Orange   35   23-9-2021
Banana   15   23-9-2021
Gauva    10   23-9-2021
Grapes   5    23-9-2021
Sapota   15   22-9-2021
Papaya   33   21-9-2021

Second dataframe:

Name   price   date
Apple   30     23-9-2021
Orange  45     23-9-2021
Banana  30     23-9-2021
Gauva   20     23-9-2021
Grapes  25     23-9-2021

I want to substract price value of two dataframes based on row values of same Name and Date and store it into first dataframe

Output: first dataframe

Name   price   date
Apple   25     23-9-2021
Orange  35     23-9-2021
Banana  15     23-9-2021
Gauva   10     23-9-2021
Grapes  5      23-9-2021
Sapota  15     22-9-2021
Papaya  33     21-9-2021
Apple   5      23-9-2021
Orange  10     23-9-2021
Banana  15     23-9-2021
Gauva   10     23-9-2021
Grapes  20     23-9-2021

I have more than hundred of such fruitnames. I don't know how to do this

CodePudding user response:

  1. merge both dataframes
  2. Find the difference in price
  3. concat
df3 = df2.merge(df1, how="left", on=["Name", "date"])
df3["price"] = df3["price_x"]-df3["price_y"]

output = pd.concat([df1, df3.drop(["price_x", "price_y"], axis=1)])

>>> output
     Name  price       date
0   Apple     25  23-9-2021
1  Orange     35  23-9-2021
2  Banana     15  23-9-2021
3   Gauva     10  23-9-2021
4  Grapes      5  23-9-2021
5  Sapota     15  22-9-2021
6  Papaya     33  21-9-2021
0   Apple      5  23-9-2021
1  Orange     10  23-9-2021
2  Banana     15  23-9-2021
3   Gauva     10  23-9-2021
4  Grapes     20  23-9-2021

CodePudding user response:

You can use set_index instead of merge:

out = pd.concat([df1, df2.set_index(['Name', 'date'])['price']
                         .sub(df1.set_index(['Name', 'date'])['price'])
                         .dropna().reset_index()], ignore_index=True)

Output:

>>> out

      Name  price       date
0    Apple   25.0  23-9-2021
1   Orange   35.0  23-9-2021
2   Banana   15.0  23-9-2021
3    Gauva   10.0  23-9-2021
4   Grapes    5.0  23-9-2021
5   Sapota   15.0  22-9-2021
6   Papaya   33.0  21-9-2021
7    Apple    5.0  23-9-2021
8   Banana   15.0  23-9-2021
9    Gauva   10.0  23-9-2021
10  Grapes   20.0  23-9-2021
11  Orange   10.0  23-9-2021
  • Related