I have the following df:
                   timestamp           min           max  count
0  2022-10-23 22:00:00+00:00  1.000000e+09  9.000000e+99      0
1  2022-10-23 22:00:00+00:00  1.000000e+08  1.000000e+09      2
2  2022-10-23 22:00:00+00:00  1.000000e+07  1.000000e+08     39
3  2022-10-23 22:00:00+00:00  1.000000e+06  1.000000e+07    162
4  2022-10-23 22:00:00+00:00  1.000000e+05  1.000000e+06    491
5  2022-10-23 22:00:00+00:00  1.000000e+04  1.000000e+05    960
6  2022-10-23 22:00:00+00:00  1.000000e+03  1.000000e+04    287
7  2022-10-23 22:00:00+00:00  1.000000e+02  1.000000e+03    244
8  2022-10-23 22:00:00+00:00  1.000000e+01  1.000000e+02    416
9  2022-10-23 22:00:00+00:00  0.000000e+00  1.000000e+01      1
And I'm trying to group it by timestamp to create the following structure:
items = {
    '2022-10-23 22:00:00+00:00':
    [
        {'min': 1.000000e+09, 'max': 9.000000e+99, 'count': 0},
        {},
        ...
    ]
}
I think there's a way to do it with something similar to:
df.groupby('timestamp')[['max', 'min','count']].apply(lambda g: g.values.tolist()).to_dict()
But I don't know what to change in the lambda function in order to get the result I need.
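For reference, the df can be rebuilt with something like the following (the datetime dtype of the timestamp column is an assumption on my part; it may just as well be a plain string column):

import pandas as pd

# Reconstruction of the sample df shown above
df = pd.DataFrame({
    'timestamp': pd.Timestamp('2022-10-23 22:00:00+00:00'),
    'min': [1e9, 1e8, 1e7, 1e6, 1e5, 1e4, 1e3, 1e2, 1e1, 0.0],
    'max': [9e99, 1e9, 1e8, 1e7, 1e6, 1e5, 1e4, 1e3, 1e2, 1e1],
    'count': [0, 2, 39, 162, 491, 960, 287, 244, 416, 1],
})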
CodePudding user response:
You can use a dictionary comprehension with to_dict('records'):
items = {k: g.to_dict('records')
         for k, g in df.set_index('timestamp').groupby('timestamp')}
output:
{'2022-10-23 22:00:00+00:00': [{'min': 1000000000.0, 'max': 9e+99, 'count': 0},
  {'min': 100000000.0, 'max': 1000000000.0, 'count': 2},
  {'min': 10000000.0, 'max': 100000000.0, 'count': 39},
  {'min': 1000000.0, 'max': 10000000.0, 'count': 162},
  {'min': 100000.0, 'max': 1000000.0, 'count': 491},
  {'min': 10000.0, 'max': 100000.0, 'count': 960},
  {'min': 1000.0, 'max': 10000.0, 'count': 287},
  {'min': 100.0, 'max': 1000.0, 'count': 244},
  {'min': 10.0, 'max': 100.0, 'count': 416},
  {'min': 0.0, 'max': 10.0, 'count': 1}]
}
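If you'd rather stay close to your original attempt, the only change needed in the lambda is to return records instead of a plain list of values, something along these lines:

items = (
    df.groupby('timestamp')[['min', 'max', 'count']]
      .apply(lambda g: g.to_dict('records'))
      .to_dict()
)

Also note that if the timestamp column is a real datetime dtype rather than strings, the dictionary keys in both versions will be pd.Timestamp objects; if you need string keys you can convert them afterwards, e.g. items = {str(k): v for k, v in items.items()}.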