Using Python with pandas to export data from a database to csv.Data looks like this when exported. Got like 100 logs/day so this is pure for visualising purpose:
time | Buf1 | Buf2 |
---|---|---|
12/12/2022 19:15:56 | 12 | 3 |
12/12/2022 18:00:30 | 5 | 18 |
11/12/2022 15:15:08 | 12 | 3 |
11/12/2022 15:15:08 | 10 | 9 |
Now i only show the "raw" data into a csv but i am in need to generate for each day a min. max. and avg value. Whats the best way to create that ? i've been trying to do some min() max() functions but the problem here is that i've multiple days in these csv files. Also trying to manupilate the data in python it self but kinda worried about that i'll be missing some and the data will be not correct any more.
I would like to end up with something like this:
time | buf1_max | buf_min |
---|---|---|
12/12/2022 | 12 | 3 |
12/12/2022 | 12 | 10 |
CodePudding user response:
Here you go, step by step.
In [27]: df['time'] = df['time'].astype("datetime64").dt.date
In [28]: df
Out[28]:
time Buf1 Buf2
0 2022-12-12 12 3
1 2022-12-12 5 18
2 2022-11-12 12 3
3 2022-11-12 10 9
In [29]: df = df.set_index("time")
In [30]: df
Out[30]:
Buf1 Buf2
time
2022-12-12 12 3
2022-12-12 5 18
2022-11-12 12 3
2022-11-12 10 9
In [31]: df.groupby(df.index).agg(['min', 'max', 'mean'])
Out[31]:
Buf1 Buf2
min max mean min max mean
time
2022-11-12 10 12 11.0 3 9 6.0
2022-12-12 5 12 8.5 3 18 10.5
CodePudding user response:
Another approach is to use pivot_table
for simplification of grouping data (keep in mind to convert 'time' column to datetime64
format as suggested:
import pandas as pd
import numpy as np
df.pivot_table(
index='time',
values=['Buf1', 'Buf2'],
aggfunc={'Buf1':[min, max, np.mean], 'Buf2':[min, max, np.mean]}
)
You can add any aggfunc as you wish.