Home > other >  Pandas: df.diff() into new column in a multi column dataframe
Pandas: df.diff() into new column in a multi column dataframe

Time:09-06

I have a DataFrame which is like that:

date Col.A Col.B Col.C ...
2022 1 1 1 ...
2023 2 2 2 ...
2024 3 3 3 ...

And I want to add Columns where the Diff is calculated and looks like that afterwards:

date Col.A Diff Col.A Col.B Diff Col.B Col.C Diff Col.C ...
2022 1 nan 1 nan 1 nan ...
2023 2 1 2 1 2 1 ...
2024 3 1 3 1 3 1 ...

I tried it like that, but it doesnt work

df = pd.read_excel('example.xlsx', header=0).set_index(['date])

df.diff()

How can I do that in Pandas

CodePudding user response:

I rename new columns names by DataFrame.add_suffix for correct sorting without specify all columns names:

df = pd.read_excel('example.xlsx', header=0, index_col=['date'])

df = pd.concat([df, df.diff().add_suffix(' diff')], axis=1).sort_index(axis=1)

print (df)
      Col.A  Col.A diff  Col.B  Col.B diff  Col.C  Col.C diff
date                                                         
2022      1         NaN      1         NaN      1         NaN
2023      2         1.0      2         1.0      2         1.0
2024      3         1.0      3         1.0      3         1.0

CodePudding user response:

df = pd.concat([df,df.diff().add_prefix('Diff ')], axis=1)
df = df.reindex(columns=sorted(df.columns, key=lambda x: (x.split(' ')[-1], x.split(' ')[0])))
df
###
   date  Col.A  Diff Col.A  Col.B  Diff Col.B  Col.C  Diff Col.C
0  2022      1         NaN      1         NaN      1         NaN
1  2023      2         1.0      2         1.0      2         1.0
2  2024      3         1.0      3         1.0      3         1.0
  • Related