Json file to pandas data frame

Time:07-15

I have a JSON file that looks like the one below.

myjson= {'data': [{'ID': 'da45e00ca',
   'name': 'June_2016',
   'objCode': 'ased',
   'percentComplete': 4.17,
   'plannedCompletionDate': '2021-04-29T10:00:00:000-0500',
   'plannedStartDate': '2020-04-16T23:00:00:000-0500',
   'priority': 4,
   'asedectedCompletionDate': '2022-02-09T10:00:00:000-0600',
   'status': 'weds'},
  {'ID': '10041ce23c',
   'name': '2017_Always',
   'objCode': 'ased',
   'percentComplete': 4.17,
   'plannedCompletionDate': '2021-10-22T10:00:00:000-0600',
   'plannedStartDate': '2021-08-09T23:00:00:000-0600',
   'priority': 3,
   'asedectedCompletionDate': '2023-12-30T11:05:00:000-0600',
   'status': 'weds'},
   {'ID': '10041ce23ca',
   'name': '2017_Always',
   'objCode': 'ased',
   'percentComplete': 4.17,
   'plannedCompletionDate': '2021-10-22T10:00:00:000-0600',
   'plannedStartDate': '2021-08-09T23:00:00:000-0600',
   'priority': 3,
   'asedectedCompletionDate': '2023-12-30T11:05:00:000-0600',
   'status': 'weds'}]}

I was trying to normalize it and convert it to a pandas DataFrame using the code below, but the result doesn't come out right:

from pandas.io.json import json_normalize 
reff = json_normalize(myjson)
df = pd.DataFrame(data=reff)
df

Does anyone have any idea what I'm doing wrong? Thanks in advance!

CodePudding user response:

Try:

import pandas as pd
# json_normalize() already returns a DataFrame, so no extra pd.DataFrame() call is needed
df = pd.json_normalize(myjson['data'])
df

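You forgot to pull the list of records out of myjson. When json_normalize() is given the whole dict, it treats it as a single record, so you get one row with a single 'data' column holding the entire list. Passing myjson['data'] gives it the list itself, and each dict in that list becomes one row.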
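To make the difference concrete, here is a minimal sketch comparing the two calls. The dict is a shortened, hypothetical stand-in for the question's myjson (only three of the original keys are kept), and the record_path variant is an alternative not mentioned above that should give the same rows.

```python
import pandas as pd

# Shortened, hypothetical stand-in for the question's myjson
myjson = {"data": [
    {"ID": "da45e00ca", "name": "June_2016", "priority": 4},
    {"ID": "10041ce23c", "name": "2017_Always", "priority": 3},
]}

# Whole dict: treated as one record, the list stays in a single "data" cell
whole = pd.json_normalize(myjson)
print(whole.shape)

# The list under "data": one row per record, one column per key
df = pd.json_normalize(myjson["data"])
print(df.shape)

# Alternative: let json_normalize pull the list out via record_path
df_alt = pd.json_normalize(myjson, record_path="data")
print(df_alt)
```

record_path can be handy when the records sit deeper in the structure, since it avoids indexing into the dict by hand.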

CodePudding user response:

pd.json_normalize() flattens parsed JSON data and returns it as a pandas DataFrame. It is part of pandas itself (available as pd.json_normalize since pandas 1.0), so importing pandas is all you need.

Step 1 - Load the json data

json.loads(json_string)

Step 2 - Pass the loaded data to pd.json_normalize()

pd.json_normalize(json.loads(json_string))

Example:

import pandas as pd
import json
# Create json string
# with student details
json_string = '''
[
    { "id": "1", "name": "sravan","age":22 },
    { "id": "2", "name": "harsha","age":22 },
    { "id": "3", "name": "deepika","age":21 },
    { "id": "4", "name": "jyothika","age":23 }
]
'''
# Load json data and convert to Dataframe 
df = pd.json_normalize(json.loads(json_string)) 
# Display the Dataframe
print(df)

Output:

  id      name  age
0  1    sravan   22
1  2    harsha   22
2  3   deepika   21
3  4  jyothika   23
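If the JSON lives in a file rather than in a string, as the question title suggests, json.load() (which reads from a file object) can take the place of json.loads(). The sketch below is only an illustration: the file name myjson.json and its contents are hypothetical, and the file is written to a temporary directory first so the example is self-contained.

```python
import json
import os
import tempfile

import pandas as pd

# Write a small sample file so the sketch runs on its own (hypothetical data)
sample = {"data": [{"ID": "a1", "priority": 4}, {"ID": "b2", "priority": 3}]}
path = os.path.join(tempfile.mkdtemp(), "myjson.json")
with open(path, "w") as f:
    json.dump(sample, f)

# json.load() parses from a file object; json.loads() parses from a string
with open(path) as f:
    loaded = json.load(f)

# Normalize the list of records stored under the "data" key
df = pd.json_normalize(loaded["data"])
print(df)
```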