Home > Mobile >  Unable to covert Json to Dataframe
Unable to covert Json to Dataframe

Time:11-10

Unable to covert Json to Dataframe, the following TypeError shows :

The following data is created

Test_data =

{'archived': False,
 'archived_at': None,
 'associations': None,
 'created_at': datetime.datetime(2020, 10, 30, 8, 3, 54, 190000, tzinfo=tzlocal()),
 'id': '12345',
 'properties': {'createdate': '[![2020-10-30T08:03:54.190Z][1]][1]',
                'email': '[email protected]',
                'firstname': 'TestFirst',
                'lastname': 'TestLast'},
 'properties_with_history': None,
 'updated_at': datetime.datetime(2022, 11, 10, 6, 44, 14, 5000, tzinfo=tzlocal())}

data = json.loads(test_data)

TypeError: the JSON object must be str, bytes or bytearray, not SimplePublicObjectWithAssociations

The following has been tried:

s1 = json.dumps(test_data)
d2 = json.loads(s1)

TypeError: Object of type SimplePublicObjectWithAssociations is not JSON serializable

Prefered Output :

Text

CodePudding user response:

can you try this:

df=pd.json_normalize(Test_data)
print(df)
'''
    archived    archived_at associations    created_at                  id   properties_with_history    updated_at  properties.createdate   properties.email    properties.firstname
0   False       None        None            2020-10-30T08:03:54.190Z    12345       2022-11-10T06:44:14.500Z    [![2020-10-30T08:03:54.190Z][1]][1] [email protected]  TestFirst

'''

if you want to specific columns:

df = df[['id','properties.createdate','properties.email','properties.firstname','properties.lastname']]
df.columns = df.columns.str.replace('properties.', '')
df

    id      createdate                              email              firstname    lastname
0   12345   [![2020-10-30T08:03:54.190Z][1]][1] [email protected]  TestFirst   TestLast

if you want convert createdate column to datetime:

import datefinder
df['createdate']=df['createdate'].apply(lambda x: list(datefinder.find_dates(x))[0])
df
    id      createdate                          email               firstname   lastname
0   12345   2020-10-30 08:03:54.190000 00:00    [email protected]  TestFirst   TestLast

CodePudding user response:

There is a partial solution.....Maybe selecting or doing an unpivot dataframe this approach could be useful...

import pandas as pd 
import datetime
import json
import jsonpickle

test_data ={'archived': False,
'archived_at': None,
'associations': None,
'created_at': datetime.datetime(2020, 10, 30, 8, 3, 54, 190000),
'id': '12345',
'properties': {'createdate': '[![2020-10-30T08:03:54.190Z][1]][1]',
            'email': '[email protected]',
            'firstname': 'TestFirst',
            'lastname': 'TestLast'},
'properties_with_history': None,
'updated_at': datetime.datetime(2022, 11, 10, 6, 44, 14, 5000)}


data = jsonpickle.encode(test_data, unpicklable=False)
pd.read_json(data)

enter image description here

I have tried with melt and unstack but I didn't reach your prefered ouput...

Hope It helps...

  • Related