I am trying here to use json_normalize to somehow format the output of an API, but I keep getting a faulty and empty csv file. I tried to change df2 = pd.json_normalize(response, record_path=['LIST'])
, but keep getting this error message:
TypeError: byte indices must be integers or slices, not str
Could you please guide me on what am I doing wrong ?
Thanks a lot !
import requests
import json
import pandas as pd
url = "https://*hidden*Results/"
payload = json.dumps({
"id": 12345
})
headers = {
'Authorization': 'Basic *hidden*',
'Content-Type': 'application/json'
}
response = requests.request("POST", url, headers=headers, data=payload)
df1 = pd.DataFrame(response).iloc[:,:-2]
df2 = pd.json_normalize(response, record_path=None)
df = pd.concat([df1, df2], axis=1)
df.to_csv("test.csv", index=False)
CodePudding user response:
You are passing the variable response
in the call:
df2 = pd.json_normalize(response, record_path=None)
Which is an a requests.models.Response
Object and you need to pass a dict
, so you need to do something like pd.json_normalize(response.json(), record_path=['LIST'])
I tried it with this example and works:
>>> import pandas as pd
>>> data = [
... {
... "state": "Florida",
... "shortname": "FL",
... "info": {"governor": "Rick Scott"},
... "counties": [
... {"name": "Dade", "population": 12345},
... {"name": "Broward", "population": 40000},
... {"name": "Palm Beach", "population": 60000},
... ],
... },
... {
... "state": "Ohio",
... "shortname": "OH",
... "info": {"governor": "John Kasich"},
... "counties": [
... {"name": "Summit", "population": 1234},
... {"name": "Cuyahoga", "population": 1337},
... ],
... },
... ]
>>> result = pd.json_normalize(data, ["counties"])
>>> result
name population
0 Dade 12345
1 Broward 40000
2 Palm Beach 60000
3 Summit 1234
4 Cuyahoga 1337
EDIT I will try to do this:
import requests
import json
import pandas as pd
url = "https://*hidden*Results/"
payload = json.dumps({
"id": 12345
})
headers = {
'Authorization': 'Basic *hidden*',
'Content-Type': 'application/json'
}
response = requests.request("POST", url, headers=headers, data=payload)
json_response = response.json()
df1 = pd.DataFrame(json_response).iloc[:,:-2]
df2 = pd.json_normalize(json_response, record_path=['LIST'])
df = pd.concat([df1, df2], axis=1)
df.to_csv("test.csv", index=False)