Home > OS >  How to create a new dataframe that contains the value changes from multiple columns between two exis
How to create a new dataframe that contains the value changes from multiple columns between two exis

Time:06-30

I am looking at football player development over a five year period.

I have two dataframes (DFs), one that contains all 20 year-old strikers from FIFA 17 and another that contains all 25 year-old strikers from FIFA 22. I want to create a third DF that contains the attribute changes for each player. There are about 30 columns denoting each attribute, e.g. tackling, shooting, passing etc. So I want the new DF to contain 3 for tackling, 2 for shooting, 6 for passing etc.

The best way of solving this that I can think of is by merging the two DFs and then applying a function to every column that gives the difference between the x and y values, which represent the FIFA 17 and FIFA 22 data respectively.

Any tips much appreciated. Thank you.

CodePudding user response:

You might subtract pandas.DataFrames consider following simple example

import pandas as pd
df1 = pd.DataFrame({'X':[1,2],'Y':[3,4]})
df2 = pd.DataFrame({'X':[10,20],'Y':[30,40]})
dfdiff = df2 - df1
print(dfdiff)

gives output

    X   Y
0   9  27
1  18  36

CodePudding user response:

I have found a solution but it is very tedious as it requires a line of code for each and every attribute.

I'm simply assigning a new column for each attribute change. So for Passing, for instance, the code is:

mergedDF = mergedDF.assign(PassingChange = mergedDF.Passing_x - mergedDF.Passing_y)

If someone can help me find a more elegant solution I'd be very grateful!

  • Related