Subtracting two pandas dataframes with different shapes

Time:09-22

I have two DataFrames. df1 looks like this (AA is the index):

AA 
a  1 2 3 4 5
b  2 2 3 4 5 

and df2 looks like this (AA is the index):

AA 
a  10
b  20 

The output should be every value in each row of df1 minus the single df2 value for the row with the matching AA index:

AA 
a  -9 -8 -7 -6 -5
b  -18 -18 -17 -16 -15

I have tried several approaches without success. Could someone please help me with this? Thank you.

CodePudding user response:

You can use the apply function to solve it. Here is some example code.

import pandas as pd
df1 = pd.DataFrame(
    [
        [1, 2, 3, 4, 5],
        [2, 2, 3, 4, 5]
    ], index=['a', 'b']
)
df2 = pd.DataFrame(
    [
        [10],
        [20]
    ], index=['a', 'b']
)
# 0 is the label of the first (and only) column in df2
# row.name is the index label of the current row of df1
df3 = df1.apply(lambda row: row - df2.loc[row.name, 0], axis=1)
df3

If you run this in Jupyter, df3 is displayed as the output of the cell. Otherwise, use print(df3) to see the result.
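The row-wise apply above calls a Python function once per row. As a possible alternative, here is a minimal sketch using DataFrame.sub with axis=0, which subtracts a Series from every column while matching on the row index. It assumes, as in the question, that df2 has a single column labelled 0; the index name AA is added here only to mirror the question.

```python
import pandas as pd

index = pd.Index(["a", "b"], name="AA")
df1 = pd.DataFrame([[1, 2, 3, 4, 5],
                    [2, 2, 3, 4, 5]], index=index)
df2 = pd.DataFrame([[10],
                    [20]], index=index)

# df2[0] selects the single column of df2 as a Series indexed by AA.
# axis=0 makes sub() match that Series against df1's row index,
# so each row of df1 has its own df2 value subtracted.
result = df1.sub(df2[0], axis=0)
print(result)
```

Because the subtraction is aligned on index labels, it does not rely on the rows of df1 and df2 being in the same order.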

CodePudding user response:

It's easy in NumPy, because of its broadcasting rules. So you could do the calculation on the value arrays (assuming the index values are the same for both input DataFrames, as in your example) and create a new DataFrame from the result:

import pandas as pd

index = pd.Index(['a', 'b'], name='AA')
df1 = pd.DataFrame([[1, 2, 3, 4, 5],
                    [2, 2, 3, 4, 5]],
                   index=index)
df2 = pd.DataFrame([[10],
                    [20]],
                   index=index)

df = pd.DataFrame(df1.values - df2.values,
                  index=index)
df
      0   1   2   3   4
AA                  
a    -9  -8  -7  -6  -5
b   -18 -18 -17 -16 -15
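The subtraction on raw arrays works by position, not by label, so it depends on both DataFrames listing their rows in the same order. The sketch below shows one way to guard against that: reorder df2 with reindex before converting to arrays. The shuffled df2 (named df2_shuffled here) is a hypothetical input made up for illustration, not part of the question.

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2, 3, 4, 5],
                    [2, 2, 3, 4, 5]],
                   index=pd.Index(["a", "b"], name="AA"))

# Hypothetical variant of df2 whose rows are in a different order than df1.
df2_shuffled = pd.DataFrame([[20],
                             [10]],
                            index=pd.Index(["b", "a"], name="AA"))

# Put df2's rows in the same order as df1 before dropping to NumPy.
aligned = df2_shuffled.reindex(df1.index)

# Shapes (2, 5) and (2, 1): NumPy broadcasts the single column of
# `aligned` across all five columns of df1.
values = df1.to_numpy() - aligned.to_numpy()
result = pd.DataFrame(values, index=df1.index, columns=df1.columns)
print(result)
```

Note that reindex fills any label missing from df2 with NaN, so this sketch assumes every index label in df1 also appears in df2.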