Unable to covert Json to Dataframe, the following TypeError shows :
The following data is created
Test_data =
{'archived': False,
'archived_at': None,
'associations': None,
'created_at': datetime.datetime(2020, 10, 30, 8, 3, 54, 190000, tzinfo=tzlocal()),
'id': '12345',
'properties': {'createdate': '[![2020-10-30T08:03:54.190Z][1]][1]',
'email': '[email protected]',
'firstname': 'TestFirst',
'lastname': 'TestLast'},
'properties_with_history': None,
'updated_at': datetime.datetime(2022, 11, 10, 6, 44, 14, 5000, tzinfo=tzlocal())}
data = json.loads(test_data)
TypeError: the JSON object must be str, bytes or bytearray, not SimplePublicObjectWithAssociations
The following has been tried:
s1 = json.dumps(test_data)
d2 = json.loads(s1)
TypeError: Object of type SimplePublicObjectWithAssociations is not JSON serializable
Prefered Output :
CodePudding user response:
can you try this:
df=pd.json_normalize(Test_data)
print(df)
'''
archived archived_at associations created_at id properties_with_history updated_at properties.createdate properties.email properties.firstname
0 False None None 2020-10-30T08:03:54.190Z 12345 2022-11-10T06:44:14.500Z [![2020-10-30T08:03:54.190Z][1]][1] [email protected] TestFirst
'''
if you want to specific columns:
df = df[['id','properties.createdate','properties.email','properties.firstname','properties.lastname']]
df.columns = df.columns.str.replace('properties.', '')
df
id createdate email firstname lastname
0 12345 [![2020-10-30T08:03:54.190Z][1]][1] [email protected] TestFirst TestLast
if you want convert createdate column to datetime:
import datefinder
df['createdate']=df['createdate'].apply(lambda x: list(datefinder.find_dates(x))[0])
df
id createdate email firstname lastname
0 12345 2020-10-30 08:03:54.190000 00:00 [email protected] TestFirst TestLast
CodePudding user response:
There is a partial solution.....Maybe selecting or doing an unpivot dataframe this approach could be useful...
import pandas as pd
import datetime
import json
import jsonpickle
test_data ={'archived': False,
'archived_at': None,
'associations': None,
'created_at': datetime.datetime(2020, 10, 30, 8, 3, 54, 190000),
'id': '12345',
'properties': {'createdate': '[![2020-10-30T08:03:54.190Z][1]][1]',
'email': '[email protected]',
'firstname': 'TestFirst',
'lastname': 'TestLast'},
'properties_with_history': None,
'updated_at': datetime.datetime(2022, 11, 10, 6, 44, 14, 5000)}
data = jsonpickle.encode(test_data, unpicklable=False)
pd.read_json(data)
I have tried with melt and unstack but I didn't reach your prefered ouput...
Hope It helps...