Home > Mobile >  How to make a dataframe from a nested JSON and pandas?
How to make a dataframe from a nested JSON and pandas?

Time:04-18

Here is a sample of a json file that I have of data from my Apple Health app. How can I take this json and turn it into a pandas dataframe? The dataframe structure I am hoping to make is shown below.

{
  "data": {
    "workouts": [],
    "metrics": [
      {
        "name": "sleep_analysis",
        "units": "hr",
        "data": [
          {
            "sleepSource": "Phone",
            "sleepStart": "2021-02-24 00:00:00 -0700",
            "asleep": 0,
            "inBed": 7.555225084722042,
            "sleepEnd": "2021-02-24 07:33:18 -0700",
            "date": "2021-02-24 00:00:00 -0700",
            "inBedStart": "2021-02-24 00:00:00 -0700",
            "inBedSource": "Phone",
            "inBedEnd": "2021-02-24 07:33:18 -0700"
          },
          {
            "sleepStart": "2021-02-25 01:17:34 -0700",
            "date": "2021-02-25 07:37:19 -0700",
            "sleepEnd": "2021-02-25 07:38:25 -0700",
            "inBedSource": "Phone",
            "inBedStart": "2021-02-25 01:17:34 -0700",
            "sleepSource": "Phone",
            "asleep": 0,
            "inBed": 6.1780015191766955,
            "inBedEnd": "2021-02-25 07:38:25 -0700"
          },
          {
            "inBedStart": "2021-02-26 00:00:00 -0700",
            "sleepEnd": "2021-02-26 08:35:05 -0700",
            "sleepSource": "Oura Ring",
            "inBedSource": "Oura Ring",
            "inBedEnd": "2021-02-26 08:35:05 -0700",
            "date": "2021-02-26 00:00:00 -0700",
            "inBed": 8.58497479862637,
            "sleepStart": "2021-02-26 00:00:00 -0700",
            "asleep": 8.1
          }
        ]
      },
      {
        "units": "kcal",
        "name": "active_energy",
        "data": [
          {
            "qty": 960.4964841592565,
            "date": "2021-01-01 00:00:00 -0700"
          },
          {
            "qty": 738.1875158407441,
            "date": "2021-01-02 00:00:00 -0700"
          }
        ]
      }
    ]
  }
}
name units sleepSource sleepStart asleep inBed sleepEnd date inBedStart inBedSource inBedEnd qty
sleep_analysis hr Phone 2021-02-24 00:00:00 -0700 0 7.555225085 2021-02-24 07:33:18 -0700 2021-02-24 00:00:00 -0700 2021-02-24 00:00:00 -0700 Phone 2021-02-24 07:33:18 -0700
sleep_analysis hr Phone 2021-02-25 01:17:34 -0700 0 6.178001519 2021-02-25 07:38:25 -0700 2021-02-25 07:37:19 -0700 2021-02-25 01:17:34 -0700 Phone 2021-02-25 07:38:25 -0700
sleep_analysis hr Oura Ring 2021-02-26 00:00:00 -0700 8.1 8.584974799 2021-02-26 08:35:05 -0700 2021-02-26 00:00:00 -0700 2021-02-26 00:00:00 -0700 Oura Ring 2021-02-26 08:35:05 -0700
active_energy kcal 2021-01-01 00:00:00 -0700 960.4964842
active_energy kcal 2021-01-02 00:00:00 -0700 738.1875158

CodePudding user response:

Panda has a helpful function method for this

metrics = data['data']['metrics']
dataframe = pd.json_normalize(
  metrics,
  record_path=['data'],
  meta=['name','units']
)

You can read more about converting json to dataframes here https://towardsdatascience.com/how-to-convert-json-into-a-pandas-dataframe-100b2ae1e0d8

  • Related