How to create a new dataframe that contains the value changes from multiple columns between two exis-CodePudding

I am looking at football player development over a five year period.

I have two dataframes (DFs), one that contains all 20 year-old strikers from FIFA 17 and another that contains all 25 year-old strikers from FIFA 22. I want to create a third DF that contains the attribute changes for each player. There are about 30 columns denoting each attribute, e.g. tackling, shooting, passing etc. So I want the new DF to contain 3 for tackling, 2 for shooting, 6 for passing etc.

The best way of solving this that I can think of is by merging the two DFs and then applying a function to every column that gives the difference between the x and y values, which represent the FIFA 17 and FIFA 22 data respectively.

Any tips much appreciated. Thank you.

CodePudding user response：

You might subtract pandas.DataFrames consider following simple example

import pandas as pd
df1 = pd.DataFrame({'X':[1,2],'Y':[3,4]})
df2 = pd.DataFrame({'X':[10,20],'Y':[30,40]})
dfdiff = df2 - df1
print(dfdiff)

gives output

    X   Y
0   9  27
1  18  36

CodePudding user response：

I have found a solution but it is very tedious as it requires a line of code for each and every attribute.

I'm simply assigning a new column for each attribute change. So for Passing, for instance, the code is:

mergedDF = mergedDF.assign(PassingChange = mergedDF.Passing_x - mergedDF.Passing_y)

If someone can help me find a more elegant solution I'd be very grateful!