I appear to have exhausted the internet searching for what feels like a common occurrence, and I need some help, please.
I'm making an API call using the requests library, which returns one JSON response per call - I'm going to loop through and make multiple calls.
I want to combine all of the responses from the many API calls into one python data structure and then export the results to CSV.
One API response looks like this:
{
"status": "1",
"msg": "Success",
"data": {
"id": "12345",
"PriceDetail": [
{
"item": "Apple",
"amount": "10",
"weight": "225",
"price": "92",
"bestbeforeendeate": "30/09/2023"
}
]
}
}
My final output should be a CSV file with the following headers and data in the subsequent rows:
id | item | amount | weight | price | bestbeforeendeate |
---|---|---|---|---|---|
12345 | apple | 10 | 225 | 92 | 30/09/2023 |
..... | ..... | .. | ... | .. | .......... |
I've looked at combining the responses in a dictionary, named tuple, dataframe and tried the various options to export to from said structures like dictwriter, csvwriter, normalize etc. Still, I'm struggling to make any of it work.
The closest I got was (I saved the results to a JSON file to stop hitting the API):
with open('item.json') as json_file:
data_set = json.load(json_file)
for data in data_set:
if data['msg'] == 'Success':
id = data['data']['id']
return_data[id] = data['data']['PriceDetail']
df = pd.json_normalize(data['data']['PriceDetail'])
print(df)
I couldn't get the id added to the dataframe
Any suggestions would be appreciated.
Thanks,
CodePudding user response:
Pandas has a function called json_normalize
, which can directly convert a dict into a dataframe. In order to convert a JSON string into a dict
you can simply use the json
library.
Good source I found would be this`.
import json
import pandas as pd
# Test string, assuming it is from API
test_string = """{
"status": "1",
"msg": "Success",
"data": {
"id": "12345",
"PriceDetail": [
{
"item": "Apple",
"amount": "10",
"weight": "225",
"price": "92",
"bestbeforeendeate": "30/09/2023"
}
]
}
}"""
# Function converts the api result to the dataframe and appends it to df
def add_new_entry_to_dataframe(df, api_result_string):
input_parsed = json.loads(api_result_string)
df_with_new_data = pd.json_normalize(input_parsed['data']['PriceDetail'])
df = df.append(df_with_new_data)
return df
# The dataframe you want to store everything
df = pd.DataFrame()
## Loop where you fetch new data
for i in range(10):
newly_fetched_result = test_string
df = add_new_entry_to_dataframe(df, newly_fetched_result)
df = df.reset_index()
# Save as .csv
df.to_csv('output.csv')
print(df)
The output of above code:
item amount weight price bestbeforeendeate
0 Apple 10 225 92 30/09/2023
0 Apple 10 225 92 30/09/2023
0 Apple 10 225 92 30/09/2023
0 Apple 10 225 92 30/09/2023
0 Apple 10 225 92 30/09/2023
0 Apple 10 225 92 30/09/2023
0 Apple 10 225 92 30/09/2023
0 Apple 10 225 92 30/09/2023
0 Apple 10 225 92 30/09/2023
0 Apple 10 225 92 30/09/2023