Adding a Python dict to another dict, in a certain location-CodePudding

I have a .json file that contains objects that look like this:

[
  {
    "a": "0.038",
    "b": "1",
    "c": "0"
  },
  {
    "a": "0.040",
    "b": "1",
    "c": "0"
  },
  ...
]

And I have a .csv file that looks like this:

d   e   f
0.00    0.00    0.00
0.02    -0.08   -0.08
0.04    -0.32   -0.32
...

I would like to modify the .json file to add the new key/value pairs from the .csv file, but I'm not looking to just tack the new items at the end of the file, but rather a row from the .csv is added to the end of each element. So the new .json would look like this:

[
  {
    "a": "0.038",
    "b": "1",
    "c": "0",
    "d": "0.00",
    "e": "0.00",
    "f": "0.00"
  },
  {
    "a": "0.040",
    "b": "1",
    "c": "0",
    "d": "0.02",
    "e": "-0.08",
    "f": "-0.08""
  },
  ...
]

I have tried different ways (like using append() or update()), but they either add the .csv data to the complete end of the .json file or try to add the entire .csv data to the end of the first element in the .json. In my mind, what I need to do is, create a dictionary from the json object, load the csv data into a dictionary as well, loop through each json element and then add a row from the csv data into that, creating a new json object. But I'm not getting the output I want, or I run into an issue where I can't append to the dict. Here's my existing code:

import csv
import json

def csv_to_json(csvFilePath, jsonFilePath):

    with open("existing.json") as json_file:
        data = json.load(json_file)
    json_file.close() # Close the JSON file
    
    #read csv file
    with open(csvFilePath, encoding='utf-8') as csvf:
        #load csv file data using csv library's dictionary reader
        csvReader = csv.DictReader(csvf)
        
        for item in data:
            #convert each csv row into python dict
            for row in csvReader:
                #add this python dict to json array
                item.update(row)
                     
    #convert python jsonArray to JSON string and write to file
    with open(jsonFilePath, 'w', encoding='utf-8') as jsonf:
        jsonString = json.dumps(data, indent = 4)
        jsonf.write(jsonString)
        
#decide the 2 file paths according to your file system
csvFilePath = r'C:\Users\\Downloads\filename.csv'
jsonFilePath = r'C:\Users\\Desktop\filename.json'

csv_to_json(csvFilePath, jsonFilePath)

CodePudding user response：

Your issue is that you have nested loops. This creates a couple of problems:

for item in data:
    #convert each csv row into python dict
    for row in csvReader:
        #add this python dict to json array
        item.update(row)

csvReader = csv.DictReader(csvf) makes csvReader an iterator. Once you read through all lines of csvReader at the end of the first iteration of for item in data, that iterator is exhausted and won't yield any more rows. So for subsequent items in data, you'll never have any rows in csvReader. You could fix this by doing csvReader = list(csv.DictReader(...)), which will read all rows into a list over which you can iterate any number of times.
Once you fix this. you update every item in data with all rows of the csvReader. That's not what you want, since you only want the ith item to be updated with the ith row.

To fix this, you need to zip() the two iterators so that you iterate over data and csvReader concurrently:

for item, row in zip(data, csvReader):
    item.update(row)

Note that since you aren't iterating over csvReader multiple times in this case, it doesn't need to be a list and your original definition csvReader = csv.DictReader(csvf) is sufficient (you probably need csvReader = csv.DictReader(csvf, delimiter="\t") to specify the delimiter, it's assumed to be a comma by default)

And now you have data =

[{'a': '0.038', 'b': '1', 'c': '0', 'd': '0.00', 'e': '0.00', 'f': '0.00'},
 {'a': '0.040', 'b': '1', 'c': '0', 'd': '0.02', 'e': '-0.08', 'f': '-0.08'}]

CodePudding user response：

You need to iterate over the rows of the json array and the csv array together. The easiest way to do this is to zip() them together and loop over. This will give you corresponding rows so you can update the json row with the dict from the csv:

def csv_to_json(csvFilePath, jsonFilePath):

    with open("existing.json") as json_file:
        data = json.load(json_file)
    
    with open(csvFilePath, encoding='utf-8') as csvf:
        csvReader = csv.DictReader(csvf)
        
        for row_json, row_csv in zip(data, csvReader):
            row_json.update(row_csv)      
                     
    with open(jsonFilePath, 'w', encoding='utf-8') as jsonf:
        jsonString = json.dumps(data, indent = 4)
        jsonf.write(jsonString)

With your data, this will write a file that looks like:

[{
    "a": "0.038",
    "b": "1",
    "c": "0",
    "d": "0.00",
    "e": "0.00",
    "f": "0.00"
},
{
    "a": "0.040",
    "b": "1",
    "c": "0",
    "d": "0.02",
    "e": "-0.08",
    "f": "-0.08"
}]

CodePudding user response：

Use zip to zip the two lists together and combine the dictionaries:

>>> json_data = [
...   {
...     "a": "0.038",
...     "b": "1",
...     "c": "0"
...   },
...   {
...     "a": "0.040",
...     "b": "1",
...     "c": "0"
...   },
... ]
>>>
>>> csv_data = [
...     {
...         "d": "0.00",
...         "e": "0.00",
...         "f": "0.00"
...     },
...     {
...         "d": "0.02",
...         "e": "-0.08",
...         "f": "-0.08"
...     }
... ]
>>> from pprint import pprint
>>> pprint([j | c for j, c in zip(json_data, csv_data)])
[{'a': '0.038', 'b': '1', 'c': '0', 'd': '0.00', 'e': '0.00', 'f': '0.00'},
 {'a': '0.040', 'b': '1', 'c': '0', 'd': '0.02', 'e': '-0.08', 'f': '-0.08'}]

CodePudding user response：

Merging 2 dicts in a single expression here. Provided both lists of dicts are of same length:

list1 = [
  {
    "a": "0.038",
    "b": "1",
    "c": "0"
  },
  {
    "a": "0.040",
    "b": "1",
    "c": "0"
  }
]
list2 = [
  {
    "d": "0.00",
    "e": "0.00",
    "f": "0.00"
  },
  {
    "d": "0.02",
    "e": "-0.08",
    "f": "-0.08"
  }]

then iterating over the length of any of the lists:

>>> result = [{**list1[i], **list2[i]} for i in range(len(list1))]
>>> result
[{'a': '0.038', 'b': '1', 'c': '0', 'd': '0.00', 'e': '0.00', 'f': '0.00'}, {'a': '0.040', 'b': '1', 'c': '0', 'd': '0.02', 'e': '-0.08', 'f': '-0.08'}]