Flatten JSON response and output to csv

Time:11-17

I appear to have exhausted the internet searching for what feels like a common occurrence, and I need some help, please.

I'm making an API call using the requests library, which returns one JSON response per call - I'm going to loop through and make multiple calls.

I want to combine all of the responses from the many API calls into one python data structure and then export the results to CSV.

One API response looks like this:

{
    "status": "1",
    "msg": "Success",
    "data": {
      "id": "12345",
      "PriceDetail": [
        {
          "item": "Apple",
          "amount": "10",
          "weight": "225",
          "price": "92",
          "bestbeforeendeate": "30/09/2023"
        }
        ]
    }
}

My final output should be a CSV file with the following headers and data in the subsequent rows:

id     item   amount  weight  price  bestbeforeendeate
12345  Apple  10      225     92     30/09/2023
.....  .....  ..      ...     ..     ..........

I've looked at combining the responses in a dictionary, a named tuple, and a dataframe, and I've tried the various options for exporting from those structures, such as DictWriter, csv.writer, and json_normalize. Still, I'm struggling to make any of it work.

The closest I got was (I saved the results to a JSON file to stop hitting the API):

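import json
import pandas as pd

return_data = {}  # maps each id to its list of PriceDetail records

with open('item.json') as json_file:
    data_set = json.load(json_file)  # list of saved API responses
    for data in data_set:
        if data['msg'] == 'Success':
            item_id = data['data']['id']
            return_data[item_id] = data['data']['PriceDetail']

# Only the last response is normalized here, and the id is not included
df = pd.json_normalize(data['data']['PriceDetail'])
print(df)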

I couldn't get the id added to the dataframe.

Any suggestions would be appreciated.

Thanks,

CodePudding user response:

Pandas has a function called json_normalize, which can convert a dict (or a list of dicts) directly into a dataframe. To convert a JSON string into a dict, you can use the json library. A good source I found would be this.

import json
import pandas as pd

# Test string, assuming it is from API
test_string = """{
    "status": "1",
    "msg": "Success",
    "data": {
      "id": "12345",
      "PriceDetail": [
        {
          "item": "Apple",
          "amount": "10",
          "weight": "225",
          "price": "92",
          "bestbeforeendeate": "30/09/2023"
        }
        ]
    }
}"""

# Converts one API result string to a dataframe and appends it to df.
# DataFrame.append was removed in pandas 2.0, so pd.concat is used instead.
def add_new_entry_to_dataframe(df, api_result_string):
    input_parsed = json.loads(api_result_string)
    df_with_new_data = pd.json_normalize(input_parsed['data']['PriceDetail'])
    df = pd.concat([df, df_with_new_data])
    return df
    

# The dataframe you want to store everything
df = pd.DataFrame()

## Loop where you fetch new data
for i in range(10):
    newly_fetched_result = test_string
    df = add_new_entry_to_dataframe(df, newly_fetched_result)


df = df.reset_index()

# Save as .csv
df.to_csv('output.csv')

print(df)

The output of the above code:

    item amount weight price bestbeforeendeate
0  Apple     10    225    92        30/09/2023
0  Apple     10    225    92        30/09/2023
0  Apple     10    225    92        30/09/2023
0  Apple     10    225    92        30/09/2023
0  Apple     10    225    92        30/09/2023
0  Apple     10    225    92        30/09/2023
0  Apple     10    225    92        30/09/2023
0  Apple     10    225    92        30/09/2023
0  Apple     10    225    92        30/09/2023
0  Apple     10    225    92        30/09/2023
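
Note that this output has no id column, which is what the question asked for. One way to get it, as a minimal sketch rather than the only approach, is to pass the whole 'data' object to json_normalize and use its record_path and meta parameters, so the id is copied onto every PriceDetail row. The helper name price_details_to_frame and the sample dict are made up for illustration; the responses are assumed to be already parsed into dicts (for example via response.json() or json.loads).

```python
import pandas as pd


def price_details_to_frame(api_results):
    """Flatten parsed API responses into one DataFrame with an id column.

    Hypothetical helper: api_results is assumed to be an iterable of dicts
    shaped like the sample response in the question.
    """
    frames = []
    for result in api_results:
        # Skip failed calls, mirroring the 'Success' check in the question
        if result.get("msg") != "Success":
            continue
        frames.append(
            pd.json_normalize(
                result["data"],
                record_path="PriceDetail",  # one row per PriceDetail entry
                meta=["id"],                # copy data.id onto every row
            )
        )
    if not frames:
        return pd.DataFrame()
    # Concatenate once at the end instead of growing a frame inside the loop
    combined = pd.concat(frames, ignore_index=True)
    # Move id to the front to match the header order requested in the question
    return combined[["id"] + [c for c in combined.columns if c != "id"]]


# Sample response copied from the question, already parsed into a dict
sample = {
    "status": "1",
    "msg": "Success",
    "data": {
        "id": "12345",
        "PriceDetail": [
            {
                "item": "Apple",
                "amount": "10",
                "weight": "225",
                "price": "92",
                "bestbeforeendeate": "30/09/2023",
            }
        ],
    },
}

df = price_details_to_frame([sample, sample])
# index=False keeps the row index out of the CSV file
df.to_csv("output.csv", index=False)
print(df)
```

Collecting the per-response frames in a list and calling pd.concat once avoids copying the growing dataframe on every iteration, and index=False leaves only the six requested columns in the file.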