How to check if element(s) exist in JSON array using jq, and put the corresponding object into a new

Time:04-19

I am running curl commands against ~50 URLs, and each one returns JSON like the examples below. The value of 'country' is different for each curl command, but the values in 'names' may repeat across responses or be unique.

e.g. one curl command can give JSON that looks like this:

{"names":["Mary","Tom","Sue","Rob"],"country":"USA"}

while the next curl command will give this:

{"names":["Sue"],"country":"Russia"}

and the next curl command will give this:

{"names":["Tom","Jenny"],"country":"Nigeria"}

and so on and so forth.

I have a separate list of names (e.g. Tom, Sarah, Jenny, Trinh, Nancy) and I want to find out whether each one is associated with a country in any of the JSON responses I get back from the curl commands. If a name exists in "names", I want to put the name of the person and the country into a new text file (or JSON file, it doesn't matter; I just want it formatted properly), so that at the end I have an output file that associates each person's name with the country they belong to. If a country has multiple people, there shouldn't be a duplicate entry for the country in the output file; the names of the people should be listed under that one country.

I've tried multiple ways to solve this, but I'm not able to figure it out as it's my first time trying to write a script.

Last command that I tried:

curl "https://..." | jq -r 'select(.names[] as $a | ["Tom","Sarah","Jenny","Trinh","Nancy"] | index($a) | while read output; do tee -a listOfCountries; done; done

^This gave duplicates, and I wasn't sure how to format the output so that there were spaces between each entry and each country had only the matching names listed under it.

The output file (given above example) should be something like:

USA: Tom

Nigeria: Tom, Jenny

Please let me know if you have any suggestions; it would be greatly appreciated. Thank you!

Side question: If the list of names to search is extremely long (100 names), what is the best way to script this?

CodePudding user response:

With all your JSON objects in a file, say output.jsons:

jq -c -n --argjson list '["Tom", "Sarah", "Jenny", "Trinh", "Nancy"]' '
  # $dict maps each name to the list of countries it appears under
  (reduce inputs as $in ({}; reduce $in.names[] as $name (.; .[$name] += [$in.country]))) as $dict
  # invert it for the names we care about: country -> [names]
  | reduce $list[] as $name ({};
      if $dict[$name]
      then reduce $dict[$name][] as $country (.; .[$country] += [$name])
      else . end)
' output.jsons

produces:

{"USA":["Tom"],"Nigeria":["Tom","Jenny"]}

You can easily transform this into the desired output.
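
As a minimal sketch of that transformation, assuming the object above has been saved to a file (the name by_country.json is made up for this example; listOfCountries is the output file name from the question):

```shell
# Recreate the result shown above in a file (by_country.json is a hypothetical name).
printf '%s\n' '{"USA":["Tom"],"Nigeria":["Tom","Jenny"]}' > by_country.json

# to_entries turns the object into {key, value} pairs;
# join(", ") flattens each array of names into one comma-separated string.
jq -r 'to_entries[] | "\(.key): \(.value | join(", "))"' by_country.json > listOfCountries

cat listOfCountries
```

This should write one line per country, "USA: Tom" and "Nigeria: Tom, Jenny". The same to_entries expression could also be appended to the main filter after a pipe, with -r in place of -c, to skip the intermediate file.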

One way to ensure uniqueness of the elements of each array would be to append the following to the filter: map_values(unique).
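
A sketch of what that could look like, using the sample data from the question plus one deliberately repeated line to provoke duplicates (result.json is a hypothetical file name). Note that unique also sorts each array, so the names come out in alphabetical order rather than in list order:

```shell
# Sample input; the Nigeria object is repeated on purpose to create duplicates.
cat > output.jsons <<'EOF'
{"names":["Mary","Tom","Sue","Rob"],"country":"USA"}
{"names":["Sue"],"country":"Russia"}
{"names":["Tom","Jenny"],"country":"Nigeria"}
{"names":["Tom","Jenny"],"country":"Nigeria"}
EOF

jq -c -n --argjson list '["Tom","Sarah","Jenny","Trinh","Nancy"]' '
  (reduce inputs as $in ({}; reduce $in.names[] as $name (.; .[$name] += [$in.country]))) as $dict
  | reduce $list[] as $name ({};
      if $dict[$name]
      then reduce $dict[$name][] as $country (.; .[$country] += [$name])
      else . end)
  | map_values(unique)   # sort and de-duplicate the names under each country
' output.jsons > result.json

cat result.json
```

Without the final map_values(unique), the repeated input line would leave "Tom" and "Jenny" listed twice under Nigeria.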


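Re the side question: instead of passing the list inline with --argjson, you could keep it in a file and load it with --slurpfile (or --argfile, which newer jq documentation advises against in favor of --slurpfile). Note that --slurpfile binds the variable to an array of all the JSON values found in the file, so a file holding a single JSON array would be available as $list[0].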
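
A minimal sketch of the file-based variant, assuming the names are kept one per line in a plain-text file (names.txt, names.json and result.json are hypothetical file names). Converting each line to a JSON string first means --slurpfile collects them directly into the array the filter expects:

```shell
# names.txt: one name per line, as plain text (could just as well hold 100 names).
printf '%s\n' Tom Sarah Jenny Trinh Nancy > names.txt

# -R reads each raw line as a string; the output is one JSON string per line.
jq -R . names.txt > names.json

# Sample input from the question.
cat > output.jsons <<'EOF'
{"names":["Mary","Tom","Sue","Rob"],"country":"USA"}
{"names":["Sue"],"country":"Russia"}
{"names":["Tom","Jenny"],"country":"Nigeria"}
EOF

# --slurpfile binds $list to an array of every JSON value in names.json.
jq -c -n --slurpfile list names.json '
  (reduce inputs as $in ({}; reduce $in.names[] as $name (.; .[$name] += [$in.country]))) as $dict
  | reduce $list[] as $name ({};
      if $dict[$name]
      then reduce $dict[$name][] as $country (.; .[$country] += [$name])
      else . end)
' output.jsons > result.json

cat result.json
```

The filter itself is unchanged; only the way $list is supplied differs, so the list of names can grow without touching the command line.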
