I am getting a nested JSON off of an API. When I try to convert the JSON into CSV, I get attribute errors for keys. How do I only write the records from the sample data to csv?
{"result":{"total_records":3471,"offset":0.0,"size":100.0,"records":[{"hr_case_number":"HRC0177303","requested_by":"Test Emp 1","employee_name":"Test Emp2","employee_id":"99991","artifact_type":"Competency Assessment","artifact_subtype":"Transfer Competency Assessment","artifact_date":"2022-10-30","status":"uploaded"},{"hr_case_number":"HRC0177302","requested_by":"test emp 3","employee_name":"Test Emp 4","employee_id":"192499","artifact_type":"Orientation","artifact_subtype":"Acknowledgement of Mandated Reporter Status","artifact_date":"2022-10-28","status":"uploaded"}]}}
This is what I have so far:
import requests, json
import csv
url = "https://some.com/api?offset=0&size=10000"
headers = {
'Authorization': 'Basic c3ZjX2RhdGF',
'Cookie': 'BIGipServerpool_sometest=a728;
JSESSIONID=26D6FA5703B691409AA3E44E6825C816; glide_user_route=glide.a50e06d87c4640335db5b2b40400f955; glide_session_store=D9F021AC1BE6D1103FB41F87B04BCB49',
'Content-Type':'application/json','Accept':'application/json'
}
payload={}
response = requests.request("GET", url, headers=headers, data=payload)
with open('outputfile.json', 'wb') as outf:
outf.write(response.content)
outf.close()
#Open JSON load the data into the variable data
with open('C:\Python\outputfile.json', 'r') as json_file:
data = json.load(json_file)
result = data['result']
# open a file for writing
data_file = open('C:\Python\outputfile.csv', 'w', newline='')
# create the csv writer
csv_writer = csv.writer(data_file)
# header
count = 0
for records in result:
if count == 0:
# Writing headers of CSV file
header = records.keys()
csv_writer.writerow(header)
count = 1
# Writing data of CSV file
csv_writer.writerow(records.values())
data_file.close()
CodePudding user response:
result
in your code is a dict
, you shuold loop a list
(result['records']) rather than dict
d = {"result":{"total_records":3471,"offset":0.0,"size":100.0,"records":[{"hr_case_number":"HRC0177303","requested_by":"Test Emp 1","employee_name":"Test Emp2","employee_id":"99991","artifact_type":"Competency Assessment","artifact_subtype":"Transfer Competency Assessment","artifact_date":"2022-10-30","status":"uploaded"},{"hr_case_number":"HRC0177302","requested_by":"test emp 3","employee_name":"Test Emp 4","employee_id":"192499","artifact_type":"Orientation","artifact_subtype":"Acknowledgement of Mandated Reporter Status","artifact_date":"2022-10-28","status":"uploaded"}]}}
with open('C:\Python\outputfile.csv', 'w') as f:
csv_writer = csv.writer(f)
for idx, record in enumerate(d['result']['records']):
if idx == 0:
csv_writer.writerow(record.keys())
csv_writer.writerow(record.values())
CodePudding user response:
records
is a dict
- the data you want is in data['result']['records']
. Lists don't have a keys()
method either. You could potentially use the first dict in that list, but then you'd also have to check whether there was any data at all.
Usually, when writing a CSV file, you know the columns in advance. It would be common to hard code the header which has the benefit that the writer will verify records for you (better to raise an error than write garbage data) and it will still work even if you have no records at all.
Use DictWriter
because your records are dictionaries and supply the predefined header. The relevant part is
my_header = ['artifact_date', 'artifact_subtype', 'artifact_type',
'employee_id', 'employee_name', 'hr_case_number', 'requested_by',
'status']
# open a file for writing
with open('C:\Python\outputfile.csv', 'w', newline='') as data_file:
writer = csv.DictWriter(data_file, my_header)
writer.writeheader()
writer.writerows(data['result']['records'])