Python pandas Get daily: MIN MAX AVG results of datasets-CodePudding

Using Python with pandas to export data from a database to csv.Data looks like this when exported. Got like 100 logs/day so this is pure for visualising purpose:

time	Buf1	Buf2
12/12/2022 19:15:56	12	3
12/12/2022 18:00:30	5	18
11/12/2022 15:15:08	12	3
11/12/2022 15:15:08	10	9

Now i only show the "raw" data into a csv but i am in need to generate for each day a min. max. and avg value. Whats the best way to create that ? i've been trying to do some min() max() functions but the problem here is that i've multiple days in these csv files. Also trying to manupilate the data in python it self but kinda worried about that i'll be missing some and the data will be not correct any more.

I would like to end up with something like this:

time	buf1_max	buf_min
12/12/2022	12	3
12/12/2022	12	10

CodePudding user response：

Here you go, step by step.



In [27]: df['time'] = df['time'].astype("datetime64").dt.date

In [28]: df
Out[28]:
         time  Buf1  Buf2
0  2022-12-12    12     3
1  2022-12-12     5    18
2  2022-11-12    12     3
3  2022-11-12    10     9

In [29]: df = df.set_index("time")

In [30]: df
Out[30]:
            Buf1  Buf2
time
2022-12-12    12     3
2022-12-12     5    18
2022-11-12    12     3
2022-11-12    10     9

In [31]: df.groupby(df.index).agg(['min', 'max', 'mean'])
Out[31]:
           Buf1           Buf2
            min max  mean  min max  mean
time
2022-11-12   10  12  11.0    3   9   6.0
2022-12-12    5  12   8.5    3  18  10.5

CodePudding user response：

Another approach is to use pivot_table for simplification of grouping data (keep in mind to convert 'time' column to datetime64 format as suggested:

import pandas as pd
import numpy as np

df.pivot_table(
    index='time', 
    values=['Buf1', 'Buf2'], 
    aggfunc={'Buf1':[min, max, np.mean], 'Buf2':[min, max, np.mean]}
)

You can add any aggfunc as you wish.