I have a pandas DataFrame that looks like this:
category  sub_cat  vitals     value
HR        EKG      HR_EKG     136
SPO2      SPO2     SpO2_1     86
HR        PPG      HR_PPG_1   135
SPO2      PI       PI_1       4.25
HR        PPG      HR_PULSED  135
NIBP      SBP      NIBPS      73
NIBP      DBP      NIBPD      25
NIBP      MBP      NIBPM      53
I'd like to group by the category and sub_cat columns and convert the result into a list of nested dictionaries, something like this:
[
    {
        "HR": {
            "EKG": {
                "HR_EKG": 136
            },
            "PPG": {
                "HR_PPG_1": 135,
                "HR_PULSED": 135
            }
        }
    },
    {
        "NIBP": {
            "SBP": {
                "NIBPS": 73
            },
            "DBP": {
                "NIBPD": 25
            },
            "MBP": {
                "NIBPM": 53
            }
        }
    },
    {
        "SPO2": {
            "SPO2": {
                "SpO2_1": 86
            },
            "PI": {
                "PI_1": 4.25
            }
        }
    }
]
I was able to group by either (category, vitals, value) or (sub_cat, vitals, value), but not by all four columns. This is what I tried; it works for three columns:
df = df.groupby(['sub_cat']).apply(lambda x: dict(zip(x['vitals'], x['value'])))
CodePudding user response:
A series of nested groupby, apply, and to_dict calls will do it:
dct = df.groupby('category').apply(
    lambda category: category.groupby('sub_cat').apply(
        lambda sub_cat: sub_cat.set_index('vitals')['value'].to_dict()
    ).to_dict()
).to_dict()
Output:
>>> import json
>>> print(json.dumps(dct, indent=4))
{
    "HR": {
        "EKG": {
            "HR_EKG": 136.0
        },
        "PPG": {
            "HR_PPG_1": 135.0,
            "HR_PULSED": 135.0
        }
    },
    "NIBP": {
        "DBP": {
            "NIBPD": 25.0
        },
        "MBP": {
            "NIBPM": 53.0
        },
        "SBP": {
            "NIBPS": 73.0
        }
    },
    "SPO2": {
        "PI": {
            "PI_1": 4.25
        },
        "SPO2": {
            "SpO2_1": 86.0
        }
    }
}
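Note that the result above is a single dict keyed by category, while the question asked for a list with one dict per category, and the values come out as floats because the value column holds 4.25 alongside the whole numbers. If the list form is needed, here is a minimal sketch of one alternative. It is not the groupby approach above but a plain loop with dict.setdefault. It assumes the DataFrame is rebuilt from the sample rows in the question; the names nested and result are made up for illustration, and turning whole floats back into ints is an optional step I added to match the requested output.

```python
import json

import pandas as pd

# Sample data rebuilt from the rows shown in the question.
df = pd.DataFrame({
    "category": ["HR", "SPO2", "HR", "SPO2", "HR", "NIBP", "NIBP", "NIBP"],
    "sub_cat": ["EKG", "SPO2", "PPG", "PI", "PPG", "SBP", "DBP", "MBP"],
    "vitals": ["HR_EKG", "SpO2_1", "HR_PPG_1", "PI_1", "HR_PULSED",
               "NIBPS", "NIBPD", "NIBPM"],
    "value": [136, 86, 135, 4.25, 135, 73, 25, 53],
})

# Build category -> sub_cat -> vitals -> value in one pass over the rows.
nested = {}
for category, sub_cat, vital, value in df.itertuples(index=False, name=None):
    # Optional (my assumption about the desired output): show 136.0 as 136.
    if float(value).is_integer():
        value = int(value)
    nested.setdefault(category, {}).setdefault(sub_cat, {})[vital] = value

# One single-key dict per category, with categories in sorted order.
result = [{category: subs} for category, subs in sorted(nested.items())]

print(json.dumps(result, indent=4))
```

Within each category the sub_cat keys keep the order in which they first appear in the DataFrame, since Python dicts preserve insertion order.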