I have a dataframe
df1 = pd.DataFrame(data={'col1': [21, 44, 28, 32, 20, 39, 42],
'col2': ['<1', '>2', '>2', '>3', '<1', '>2', '>4'],
'col3': ['yes', 'yes', 'no', 'no', 'yes', 'no', 'yes'],
'col4': [1, 1, 0, 0, 1, 0, 1],
'Group': [0, 2, 1, 1, 0, 1, 2] })
col1 col2 col3 col4 Group
0 21 <1 yes 1 0
1 44 >2 yes 1 2
2 28 >2 no 0 1
3 32 >3 no 0 1
4 20 <1 yes 1 0
5 39 >2 no 0 1
6 42 >4 yes 1 2
Based on 'Group' column values 0,1,2 I need to create a nested dictionary like
{0: {'col1': {'mean': 20.5, 'sd': 0.5},
'col2': ['<1'],
'col3': ['yes'],
'col4': {'mean': 1, 'sd': 0}},
1: {'col1': {'mean': 33, 'sd': 4.54},
'col2': ['>2', '>3'],
'col3': ['no'],
'col4': {'mean': 0, 'sd': 0}},
2: {'col1': {'mean': 43, 'sd': 1},
'col2': ['>2', '>4'],
'col3': ['yes'],
'col4': {'mean': 1, 'sd': 0}}}
col1 and col4 are numeric, col2 and col3 are string
For Group 0:
'col1' : mean and stddev of values 20,21
'col2' : which has only '<1'
'col3' : has only 'yes'
'col4' : mean and stddev of values 1,1
For Group 1:
'col1' : mean and stddev of values 28,32,39
'col2' : which has '>2' and '>3'
'col3' : has only 'no'
'col4' : mean and stddev of values 0,0,0
For Group 2:
'col1' : mean and stddev of values 42,44
'col2' : which has '>2' and '>4'
'col3' : has only 'yes'
'col4' : mean and stddev of values 1,1
CodePudding user response:
df1.groupby('Group').agg(lambda x: x.unique().tolist() if x.dtype =='object'
else {'mean':x.mean(), 'std':x.std(ddof=0)}).T.to_dict('index')
Out[396]:
{0: {'col1': {'mean': 20.5, 'std': 0.5},
'col2': ['<1'],
'col3': ['yes'],
'col4': {'mean': 1.0, 'std': 0.0}},
1: {'col1': {'mean': 33.0, 'std': 4.546060565661952},
'col2': ['>2', '>3'],
'col3': ['no'],
'col4': {'mean': 0.0, 'std': 0.0}},
2: {'col1': {'mean': 43.0, 'std': 1.0},
'col2': ['>2', '>4'],
'col3': ['yes'],
'col4': {'mean': 1.0, 'std': 0.0}}}