Update a json file using csv file contents

Time:01-19

I have two files: changes.csv and sample.json. My CSV file looks like this:

header "a", "b"
      "a11","b1"
      "a22","b2"
      "a33","b3"

and the JSON file looks like this:

[
 {"a":"a1","b":"b1"},
 {"a":"a2","b":"b2"},
 {"a":"a3","b":"b3"}
]

I need to write a jq command that updates the JSON file using the CSV file, i.e., the final content of the JSON file should be the following:

[
 {"a":"a11","b":"b1"},
 {"a":"a22","b":"b2"},
 {"a":"a33","b":"b3"}
]

I wrote the following command:

while IFS=",", read f1 f2
do
  jq --argjson k1 $f1 --argjson k2 $f2 '(.[] | select(.b == $k2) | .a) |= $k1' sample.json| sponge sample.json
done < changes.csv

In each iteration the filter is able to select and update the value of key "a", but when I try to sponge the results back into the JSON file, the file does not end up updated. I don't know what exactly I am missing.

CodePudding user response:

Assuming the CSV is reasonably well-behaved, you could write:

# Skip the CSV header row by NOT specifying the -n option
< changes.csv jq -Rcr --argfile json sample.json '
  def trim: sub("^[ \t]*\""; "") | sub("\"[ \t]*$";"");
  INDEX(inputs | split(",") | map(trim) | select(length>0); .[1]) as $dict
  | $json
  | map( .a = $dict[.b][0] )
'
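
To see what the lookup table built by INDEX looks like, here is a minimal sketch that runs only that part of the filter on a header row plus two inline data rows. The sample rows are made up for illustration, and INDEX is assumed to be available (jq 1.6 or later):

```shell
# Feed a header row plus two data rows to jq. Because -n is NOT given,
# the header row is consumed as "." and `inputs` yields only the data rows.
dict=$(printf '%s\n' '"a","b"' '"a11","b1"' '"a22","b2"' |
  jq -Rc '
    def trim: sub("^[ \t]*\""; "") | sub("\"[ \t]*$"; "");
    INDEX(inputs | split(",") | map(trim) | select(length > 0); .[1])
  ')
printf '%s\n' "$dict"
```

Each key is a value from column b and each value is the trimmed row, which is why the main filter reads the replacement for .a from $dict[.b][0].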

For messier CSV, you will probably want to use a CSV-to-JSON or CSV-to-TSV tool (both of which can quite easily be written in jq; see e.g. https://rosettacode.org/wiki/Convert_CSV_records_to_TSV#jq).


If you prefer not to use the --argfile option, then by all means use some other method of reading the two files, e.g. you could use --rawfile for the CSV, leaving STDIN for the JSON.
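
As a rough sketch of that alternative, the following reads the CSV with --rawfile and passes the JSON as the ordinary input file. It assumes jq 1.6 or later, a CSV whose first line is a plain header row, and the file names from the question. The `// .a` fallback is an addition of this sketch, not part of the command above: it keeps the old value when the CSV has no row for a given b.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample files modelled on the question.
printf '%s\n' '"a","b"' '"a11","b1"' '"a22","b2"' '"a33","b3"' > changes.csv
printf '%s\n' '[{"a":"a1","b":"b1"},{"a":"a2","b":"b2"},{"a":"a3","b":"b3"}]' > sample.json

# --rawfile binds the whole CSV to $csv as one string; the JSON is read as normal input.
result=$(jq -c --rawfile csv changes.csv '
  def trim: sub("^[ \t]*\""; "") | sub("\"[ \t]*$"; "");
  # .[1:] drops the header row; the empty trailing line is filtered out by select.
  INDEX($csv | split("\n") | .[1:][] | split(",") | map(trim) | select(length > 0); .[1]) as $dict
  # Assumption: keep the old value of .a when the CSV has no row for this .b.
  | map(.a = ($dict[.b][0] // .a))
' sample.json)
printf '%s\n' "$result"
```

To update sample.json itself, one option is to redirect the output to a temporary file and move it over the original only after jq has succeeded.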

CodePudding user response:

For a pure jq solution, you had better make sure that your CSV doesn't contain any , or " or \n inside a field.

For now I'll propose a solution with Miller (available for several OSs), which can do the task robustly:

mlr --icsv --ojson --no-jvstack join --ijson -f file.json -j 'b' --ul file.csv
[
{"b": "b1", "a": "a11"},
{"b": "b2", "a": "a22"},
{"b": "b3", "a": "a33"}
]

Let's decompose the command:

  • mlr join -f file1 -j 'b' file2 joins file1 and file2 on the field b. When a field other than b exists in both files (for example, a), the value from file2 is the one that is output. So, to update the values of the JSON with those of the CSV, file1 must be the JSON and file2 the CSV.

  • --ul means to also output the unjoinable records of file1.

  • With the join verb, Miller lets you specify a different file format for file1 (which is the JSON), so you need to set the default input format with --icsv and override it with --ijson after the join verb.

  • The output format is set to JSON with --ojson. --no-jvstack means to output each JSON record on a single line.
