Home > Net >  How to aggregate rows into a json using pandas
How to aggregate rows into a json using pandas

Time:06-06

I have a dataframe as follows:

lat long city nameDisease numberCases
0 2 rio Dengue 1
0 2 rio Chicungunha 2
1 3 sp Dengue 3
1 3 sp COVID 4

I want to aggregate the rows with same (lat,long,city) and generate a json as follows:

[{lat:0,long:2,city:"rio",diseases:[{nameDisease:"Dengue",numberCases:1},{nameDisease:"Chicungunha",numberCases:2}],{lat:1,long:3,city:"sp",diseases:[{nameDisease:"Dengue",numberCases:3},{nameDisease:"COVID",numberCases:4}]]

How can I do this kind of transformation with pandas?

CodePudding user response:

A few to_dict groupby calls:

json = df.groupby(cols).apply(lambda g: g.drop(cols, axis=1).to_dict('records')).reset_index().rename({0:'diseases'}, axis=1).to_dict('records')

Output:

>>> json
[{'lat': 0,
  'long': 2,
  'city': 'rio',
  'diseases': [{'nameDisease': 'Dengue', 'numberCases': 1},
   {'nameDisease': 'Chicungunha', 'numberCases': 2}]},
 {'lat': 1,
  'long': 3,
  'city': 'sp',
  'diseases': [{'nameDisease': 'Dengue', 'numberCases': 3},
   {'nameDisease': 'COVID', 'numberCases': 4}]}]
   
>>> json == expected_output
True
  • Related