Combine two seperate dataframes without changing headers-CodePudding

I just wanted to put the max&avg above each column header. If thats a easier way, please let me know.

pusedocode> for each column , get the max, get the avg, , put the max and avg above each column header as seperate df

Idk how to do the above^, if u do lmk lol.

I have a df w/ 2 columns called df. I have used the .describe() to get a transposed version of its stats. I assigned that to a new df called describe_df.

I now want to display both dataframes together(not append, not combine etc.) so i can export it to excel. my code and both dataframes

CodePudding user response：

I assume you are exporting this immediately after for a report style function, as you normally would not want to do this in pandas.

result = pd.concat([describe_df,df],axis=0)

CodePudding user response：

This will create a dataframe of the kind I believe your question is asking for:

import pandas as pd
import datetime
df = pd.DataFrame({
'date and time':[datetime.datetime.now()   datetime.timedelta(minutes=i) for i in range(10)],
'PowerAC':[10.0   i / 5 for i in range(10)]
})
print(df)

df1 = df.set_index('date and time')
df2 = pd.concat([pd.DataFrame(index=pd.Index(['max', 'avg'], name=df1.index.names[0]), columns=df1.columns, data=[df1.max(), df1.mean()]), df1], axis=0).reset_index()
print(df2)

Sample input:

               date and time  PowerAC
0 2022-05-31 13:21:01.518330     10.0
1 2022-05-31 13:22:01.518330     10.2
2 2022-05-31 13:23:01.518330     10.4
3 2022-05-31 13:24:01.518330     10.6
4 2022-05-31 13:25:01.518330     10.8
5 2022-05-31 13:26:01.518330     11.0
6 2022-05-31 13:27:01.518330     11.2
7 2022-05-31 13:28:01.518330     11.4
8 2022-05-31 13:29:01.518330     11.6
9 2022-05-31 13:30:01.518330     11.8

Sample output:

                 date and time  PowerAC
0                          max     11.8
1                          avg     10.9
2   2022-05-31 13:21:01.518330     10.0
3   2022-05-31 13:22:01.518330     10.2
4   2022-05-31 13:23:01.518330     10.4
5   2022-05-31 13:24:01.518330     10.6
6   2022-05-31 13:25:01.518330     10.8
7   2022-05-31 13:26:01.518330     11.0
8   2022-05-31 13:27:01.518330     11.2
9   2022-05-31 13:28:01.518330     11.4
10  2022-05-31 13:29:01.518330     11.6
11  2022-05-31 13:30:01.518330     11.8

If you want to use the entire df.describe() output in this way (not just max and mean), you can do this:

df1 = df.set_index('date and time')
desc = df.describe()
desc.index.names = df1.index.names
df2 = pd.concat([desc, df1], axis=0).reset_index()
print(df2)

Sample output:

                 date and time   PowerAC
0                        count  10.00000
1                         mean  10.90000
2                          std   0.60553
3                          min  10.00000
4                          25%  10.45000
5                          50%  10.90000
6                          75%  11.35000
7                          max  11.80000
8   2022-05-31 13:26:52.881477  10.00000
9   2022-05-31 13:27:52.881477  10.20000
10  2022-05-31 13:28:52.881477  10.40000
11  2022-05-31 13:29:52.881477  10.60000
12  2022-05-31 13:30:52.881477  10.80000
13  2022-05-31 13:31:52.881477  11.00000
14  2022-05-31 13:32:52.881477  11.20000
15  2022-05-31 13:33:52.881477  11.40000
16  2022-05-31 13:34:52.881477  11.60000
17  2022-05-31 13:35:52.881477  11.80000

As others have indicated, there are also alternative ways to store multiple dataframes in Excel that don't require combining your tabular data into a single dataframe which may be preferable.