I use the requests
module to hit an endpoint and get back a JSON response:
response = requests.get(endpoint, auth=(user_name, api_key)).json()
The response looks like this:
{'SalesInfoStatus': {'next_offset': 15,
'sales': [{'saSalesID': '761S-1666031549-adhoc-0:0',
'identifier': '761',
'origin': 'New York',
'destination': 'London',
'filed_time': {'epoch': 1666033349,
'time': '15:02',
'date': '10/19/2022',
'localtime': 1666018949},
'actual_time': {'epoch': 1666033349,
'time': '15:02',
'date': '10/19/2022',
'localtime': 1666018949},
'inbound_saSalesID': '761S-1666029229-adhoc-0:0'},
{'saSalesID': '762S-1666031549-adhoc-0:0',
'identifier': '762',
'origin': 'New York',
'destination': 'London',
'filed_time': {'epoch': 1666033349,
'time': '15:02',
'date': '10/17/2022',
'localtime': 1666018949},
'actual_time': {'epoch': 1666033349,
'time': '15:02',
'date': '10/17/2022',
'localtime': 1666018949},
'inbound_saSalesID': '762S-1666029229-adhoc-0:0'},
{'saSalesID': '765S-1666031549-adhoc-0:0',
'identifier': '765',
'origin': 'Paris',
'destination': 'Tokyo',
'filed_time': {'epoch': 1666033349,
'time': '15:02',
'date': '10/15/2022',
'localtime': 1666018949},
'actual_time': {'epoch': 1666033349,
'time': '15:02',
'date': '10/15/2022',
'localtime': 1666018949},
'inbound_saSalesID': '765S-1666029229-adhoc-0:0'},
{'saSalesID': '767S-1666031549-adhoc-0:0',
'identifier': '767',
'origin': 'Los Angeles',
'destination': 'Sydney',
'filed_time': {'epoch': 1666033349,
'time': '15:02',
'date': '10/13/2022',
'localtime': 1666018949},
'actual_time': {'epoch': 1666033349,
'time': '15:02',
'date': '10/13/2022',
'localtime': 1666018949},
'inbound_saSalesID': '767S-1666029229-adhoc-0:0'}
]}}
When I call type(response)
, I get dict
.
I would like to build a Pandas dataframe that contains the date
, origin
, destination
, and identifier
fields from this JSON.
The resulting dataframe would look like this:
date origin destination identifier
10/19/22 New York London 761
10/17/22 New York London 762
10/15/22 Paris Tokyo 765
10/13/22 Los Angeles Sydney 767
I'm using the following:
df = pd.json_normalize(response)
But, this returns a dataframe with a column containing what appears to be nested JSON:
SalesInfoStatus
sales [{'saSalesID': '761S-1666031549-adhoc-0:0',...
next_offset 15
How would I split the date
, origin
, destination
and identifier
fields out into their own columns in this dataframe?
Thanks!
CodePudding user response:
You can use pd.json_normalize
on the list value of sales
key
df = pd.json_normalize(response['SalesInfoStatus']['sales'])[['filed_time.date', 'origin', 'destination', 'identifier']]
print(df)
filed_time.date origin destination identifier
0 10/19/2022 New York London 761
1 10/17/2022 New York London 762
2 10/15/2022 Paris Tokyo 765
3 10/13/2022 Los Angeles Sydney 767