I am working on a script that reads JSON from a file and extracts certain content into a table. I can read the file and parse out the list of books (in the example below). The issue is that, since I am writing to a table, not every list entry has the same fields, so I want to supply default values for the missing ones. This is the JSON I start with:
{
"books": [
{
"field1": "value",
"field2": "value",
"field3": "value"
},
{
"field1": "value",
"field2": [
"value"
],
"field3": "value",
"field4": "value"
},
{
"field1": "value",
"field2": "value",
"field3": "value",
"field4": "value",
"field5": "value"
}
]
}
I am trying to write these values to a CSV. I thought a good intermediate step might be a flattened structure like the following, with empty strings for the missing fields, which I could then write out as CSV. I am not clear how to do this; any suggestions are appreciated. Thanks!
[
{
"field1": "value",
"field2": "value",
"field3": "value",
"field4": "",
"field5": ""
},
{
"field1": "value",
"field2": [
"value"
],
"field3": "value",
"field4": "value",
"field5": ""
},
{
"field1": "value",
"field2": "value",
"field3": "value",
"field4": "value",
"field5": "value"
}
]
CodePudding user response:
This is easier with the pandas library. Sample code below:
import pandas as pd
d = {
"books": [
{
"field1": "value",
"field2": "value",
"field3": "value"
},
{
"field1": "value",
"field2": [
"value"
],
"field3": "value",
"field4": "value"
},
{
"field1": "value",
"field2": "value",
"field3": "value",
"field4": "value",
"field5": "value"
}
]
}
df = pd.DataFrame.from_dict(d['books'])
df.fillna("", inplace = True)
csv = df.to_csv(index = False)
print(csv)
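If the JSON lives in a file rather than an inline dict (as in the question), the same approach works with `json.load`. A self-contained sketch below; it writes a sample `books.json` first purely so it runs as-is, and the file names are assumptions:

```python
import json

import pandas as pd

# Sample input written to disk so the snippet is self-contained;
# in practice books.json would already exist.
sample = {
    "books": [
        {"field1": "value", "field2": "value", "field3": "value"},
        {"field1": "value", "field2": ["value"], "field3": "value",
         "field4": "value"},
        {"field1": "value", "field2": "value", "field3": "value",
         "field4": "value", "field5": "value"},
    ]
}
with open("books.json", "w") as f:
    json.dump(sample, f)

# Read the JSON back from the file
with open("books.json") as f:
    d = json.load(f)

# Missing keys become NaN; fillna replaces them with empty strings
df = pd.DataFrame(d["books"]).fillna("")
df.to_csv("books.csv", index=False)
```

Note that the list-valued `field2` is written using its Python repr (`['value']`); if that is not acceptable, the list values need to be joined into strings first.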
CodePudding user response:
You can merge a dictionary of default values into each data dictionary to fill in the missing keys. See an example approach below:
books = {
"books": [
{
"field1": "value",
"field2": "value",
"field3": "value"
},
{
"field1": "value",
"field2": [
"value"
],
"field3": "value",
"field4": "value"
},
{
"field1": "value",
"field2": "value",
"field3": "value",
"field4": "value",
"field5": "value"
}
]
}["books"]
# Collect every key that appears in any book, preserving first-seen order
# (a set would give a nondeterministic column order)
all_keys = []
for book in books:
    for key in book:
        if key not in all_keys:
            all_keys.append(key)
# Build a dict of default (empty-string) values
default = {key: "" for key in all_keys}
# Merge the defaults into each book; the book's own values win
flattened_list = [{**default, **book} for book in books]
# Write a CSV file (newline="" prevents blank lines on Windows)
import csv
with open("books.csv", "w", newline="") as csv_file:
writer = csv.DictWriter(csv_file, fieldnames=all_keys)
writer.writeheader()
writer.writerows(flattened_list)
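One wrinkle in the sample data: `field2` is sometimes a list, which `DictWriter` writes using its Python repr (`['value']`). If a plain string is wanted in the cell, list values can be joined before writing. A sketch below, assuming comma-joined lists and hypothetical file and field names:

```python
import csv

def stringify(value):
    # Join list values into one string; leave everything else untouched
    if isinstance(value, list):
        return ", ".join(str(v) for v in value)
    return value

books = [
    {"field1": "value", "field2": ["value", "other"], "field3": "value"},
    {"field1": "value", "field2": "value", "field3": "value"},
]
cleaned = [{k: stringify(v) for k, v in book.items()} for book in books]

with open("books_clean.csv", "w", newline="") as csv_file:
    writer = csv.DictWriter(csv_file, fieldnames=["field1", "field2", "field3"])
    writer.writeheader()
    writer.writerows(cleaned)
```

The csv module quotes any joined value that contains a comma, so the output stays valid CSV.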