I need to add data to the documents key in a JSON file with the structure below:
{
    "inputDocuments": {
        "gcsDocuments": {
            "documents": [
                {
                    "gcsUri": "gs://test/.PDF",
                    "mimeType": "application/pdf"
                }
            ]
        }
    },
    "documentOutputConfig": {
        "gcsOutputConfig": {
            "gcsUri": "gs://test"
        }
    },
    "skipHumanReview": false
}
The final output should look something like this:
{
    "inputDocuments": {
        "gcsDocuments": {
            "documents": [
                {
                    "gcsUri": "gs://test/FFL.PDF",
                    "mimeType": "application/pdf"
                },
                {
                    "gcsUri": "gs://test/BGF.PDF",
                    "mimeType": "application/pdf"
                }
            ]
        }
    },
    "documentOutputConfig": {
        "gcsOutputConfig": {
            "gcsUri": "gs://test"
        }
    },
    "skipHumanReview": false
}
I tried to write a script using the code below, but I get a KeyError when trying to add data, and it does not append the data in the right format.
# Python program to update a JSON file
import json

# Function to append to the JSON file
def write_json(new_data, filename='keyvalue.json'):
    with open(filename, 'r+') as file:
        # First we load existing data into a dict.
        file_data = json.load(file)
        # Append new_data to the documents list.
        file_data["documents"].append(new_data)
        # Set the file's current position back to the start.
        file.seek(0)
        # Convert back to JSON.
        json.dump(file_data, file, indent=4)

# Python object to be appended
y = {
    "gcsUri": "gs://test/.PDF",
    "mimeType": "application/pdf"
}

write_json(y)
CodePudding user response:
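Your documents list is inside the gcsDocuments dict, which is itself inside the inputDocuments dict (try print(file_data.keys()) to see the top-level keys). "documents" is not a top-level key, which is why you get the KeyError, so change the append line to: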
file_data["inputDocuments"]["gcsDocuments"]["documents"].append(new_data)
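Putting that together, a minimal sketch of the whole function might look like the following. The name append_document is hypothetical, and the file.truncate() call is an extra precaution not in the original script: it trims any leftover bytes if the rewritten JSON ever turns out shorter than the old contents.

```python
import json


def append_document(new_doc, filename="keyvalue.json"):
    """Append new_doc to inputDocuments -> gcsDocuments -> documents."""
    # "r+" opens the existing file for both reading and writing.
    with open(filename, "r+") as file:
        # Load the existing JSON into a dict.
        file_data = json.load(file)
        # Walk down the nested keys to reach the documents list.
        file_data["inputDocuments"]["gcsDocuments"]["documents"].append(new_doc)
        # Rewind, write the updated structure, and drop any leftover bytes.
        file.seek(0)
        json.dump(file_data, file, indent=4)
        file.truncate()
    return file_data
```

Calling append_document({"gcsUri": "gs://test/FFL.PDF", "mimeType": "application/pdf"}) and then doing the same for BGF.PDF adds both entries to the list. Note that the placeholder "gs://test/.PDF" entry already in the file stays there unless you remove it separately.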