I'm trying to create a pandas DataFrame from a JSON file containing my Apple Health data.
My JSON file looks like this:
{
"data": {
"workouts": [],
"metrics": [
{
"name": "active_energy",
"units": "kcal",
"data": [
{
"qty": 213.881,
"date": "2022-04-12 00:00:00 -0600"
}
]
},
{
"name": "apple_exercise_time",
"units": "min",
"data": [
{
"date": "2022-04-12 00:00:00 -0600",
"qty": 6
}
]
},
{
"name": "sleep_analysis",
"units": "min",
"data": []
}
]
}
}
In this data, there is an empty array called workouts and another called metrics. I want to take the metrics array from this file and turn it into a pandas DataFrame like this:
date | name | qty | units |
---|---|---|---|
2022-04-12 | active_energy | 213.881 | kcal |
2022-04-12 | apple_exercise_time | 6 | min |
CodePudding user response:
Here's one way using a DataFrame constructor, explode and join:
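import pandas as pd
# my_data is the parsed JSON dict; ignore_index=True keeps the index unique so the join below aligns row by row
tmp = pd.DataFrame(my_data['data']['metrics']).explode('data', ignore_index=True)
s = tmp['data'].dropna()  # skip metrics whose data list was empty
out = tmp.drop(columns='data').join(pd.DataFrame(s.tolist(), index=s.index))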
Output:
name units qty date
0 active_energy kcal 213.881 2022-04-12 00:00:00 -0600
1 apple_exercise_time min 6.000 2022-04-12 00:00:00 -0600
2 sleep_analysis min NaN NaN
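If you want exactly the four-column table from the question, with no row for metrics that have no data, something along these lines with pd.json_normalize might also work. This is only a sketch: my_data is assumed to be the dict you get from json.load on your file (inlined here so the snippet runs on its own), and the date is shortened by slicing the string rather than parsing it, which sidesteps mixed UTC offsets.

```python
import pandas as pd

# Assumed stand-in for the parsed file, e.g. my_data = json.load(open(path))
my_data = {
    "data": {
        "workouts": [],
        "metrics": [
            {"name": "active_energy", "units": "kcal",
             "data": [{"qty": 213.881, "date": "2022-04-12 00:00:00 -0600"}]},
            {"name": "apple_exercise_time", "units": "min",
             "data": [{"date": "2022-04-12 00:00:00 -0600", "qty": 6}]},
            {"name": "sleep_analysis", "units": "min", "data": []},
        ],
    }
}

# One row per entry in each metric's "data" list; name and units are
# carried along as metadata. Metrics with an empty list produce no rows.
df = pd.json_normalize(
    my_data["data"]["metrics"],
    record_path="data",
    meta=["name", "units"],
)

# Keep only the calendar date (first 10 characters of the timestamp string)
df["date"] = df["date"].str.slice(0, 10)

# Reorder the columns to match the table in the question
df = df[["date", "name", "qty", "units"]]
print(df)
```

If you need real datetimes instead of strings, pd.to_datetime on the original date column is the usual route, though timestamps with different offsets may need utc=True.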
CodePudding user response:
Someone on SO shared this with me around a month ago. Sharing with you now.
import pandas as pd
df = pd.read_json("https://www.chsli.org/sites/default/files/transparency/111888924_GoodSamaritanHospitalMedicalCenter_standardcharges.json", lines=True)
print(df.head())
df.to_csv(r'C:\your_path_here\chsli.csv')
Result: