Pandas: df.diff() into new column in a multi column dataframe-CodePudding

I have a DataFrame which is like that:

date	Col.A	Col.B	Col.C	...
2022	1	1	1	...
2023	2	2	2	...
2024	3	3	3	...

And I want to add Columns where the Diff is calculated and looks like that afterwards:

date	Col.A	Diff Col.A	Col.B	Diff Col.B	Col.C	Diff Col.C	...
2022	1	nan	1	nan	1	nan	...
2023	2	1	2	1	2	1	...
2024	3	1	3	1	3	1	...

I tried it like that, but it doesnt work

df = pd.read_excel('example.xlsx', header=0).set_index(['date])

df.diff()

How can I do that in Pandas

CodePudding user response：

I rename new columns names by DataFrame.add_suffix for correct sorting without specify all columns names:

df = pd.read_excel('example.xlsx', header=0, index_col=['date'])

df = pd.concat([df, df.diff().add_suffix(' diff')], axis=1).sort_index(axis=1)

print (df)
      Col.A  Col.A diff  Col.B  Col.B diff  Col.C  Col.C diff
date                                                         
2022      1         NaN      1         NaN      1         NaN
2023      2         1.0      2         1.0      2         1.0
2024      3         1.0      3         1.0      3         1.0

CodePudding user response：

df = pd.concat([df,df.diff().add_prefix('Diff ')], axis=1)
df = df.reindex(columns=sorted(df.columns, key=lambda x: (x.split(' ')[-1], x.split(' ')[0])))
df
###
   date  Col.A  Diff Col.A  Col.B  Diff Col.B  Col.C  Diff Col.C
0  2022      1         NaN      1         NaN      1         NaN
1  2023      2         1.0      2         1.0      2         1.0
2  2024      3         1.0      3         1.0      3         1.0