I just wanted to put the max&avg above each column header. If thats a easier way, please let me know.
pusedocode> for each column , get the max, get the avg, , put the max and avg above each column header as seperate df
Idk how to do the above^, if u do lmk lol.
I have a df w/ 2 columns called df. I have used the .describe() to get a transposed version of its stats. I assigned that to a new df called describe_df.
I now want to display both dataframes together(not append, not combine etc.) so i can export it to excel. my code and both dataframes
CodePudding user response:
I assume you are exporting this immediately after for a report style function, as you normally would not want to do this in pandas.
result = pd.concat([describe_df,df],axis=0)
CodePudding user response:
This will create a dataframe of the kind I believe your question is asking for:
import pandas as pd
import datetime
df = pd.DataFrame({
'date and time':[datetime.datetime.now() datetime.timedelta(minutes=i) for i in range(10)],
'PowerAC':[10.0 i / 5 for i in range(10)]
})
print(df)
df1 = df.set_index('date and time')
df2 = pd.concat([pd.DataFrame(index=pd.Index(['max', 'avg'], name=df1.index.names[0]), columns=df1.columns, data=[df1.max(), df1.mean()]), df1], axis=0).reset_index()
print(df2)
Sample input:
date and time PowerAC
0 2022-05-31 13:21:01.518330 10.0
1 2022-05-31 13:22:01.518330 10.2
2 2022-05-31 13:23:01.518330 10.4
3 2022-05-31 13:24:01.518330 10.6
4 2022-05-31 13:25:01.518330 10.8
5 2022-05-31 13:26:01.518330 11.0
6 2022-05-31 13:27:01.518330 11.2
7 2022-05-31 13:28:01.518330 11.4
8 2022-05-31 13:29:01.518330 11.6
9 2022-05-31 13:30:01.518330 11.8
Sample output:
date and time PowerAC
0 max 11.8
1 avg 10.9
2 2022-05-31 13:21:01.518330 10.0
3 2022-05-31 13:22:01.518330 10.2
4 2022-05-31 13:23:01.518330 10.4
5 2022-05-31 13:24:01.518330 10.6
6 2022-05-31 13:25:01.518330 10.8
7 2022-05-31 13:26:01.518330 11.0
8 2022-05-31 13:27:01.518330 11.2
9 2022-05-31 13:28:01.518330 11.4
10 2022-05-31 13:29:01.518330 11.6
11 2022-05-31 13:30:01.518330 11.8
If you want to use the entire df.describe()
output in this way (not just max and mean), you can do this:
df1 = df.set_index('date and time')
desc = df.describe()
desc.index.names = df1.index.names
df2 = pd.concat([desc, df1], axis=0).reset_index()
print(df2)
Sample output:
date and time PowerAC
0 count 10.00000
1 mean 10.90000
2 std 0.60553
3 min 10.00000
4 25% 10.45000
5 50% 10.90000
6 75% 11.35000
7 max 11.80000
8 2022-05-31 13:26:52.881477 10.00000
9 2022-05-31 13:27:52.881477 10.20000
10 2022-05-31 13:28:52.881477 10.40000
11 2022-05-31 13:29:52.881477 10.60000
12 2022-05-31 13:30:52.881477 10.80000
13 2022-05-31 13:31:52.881477 11.00000
14 2022-05-31 13:32:52.881477 11.20000
15 2022-05-31 13:33:52.881477 11.40000
16 2022-05-31 13:34:52.881477 11.60000
17 2022-05-31 13:35:52.881477 11.80000
As others have indicated, there are also alternative ways to store multiple dataframes in Excel that don't require combining your tabular data into a single dataframe which may be preferable.