Get all nested urls based on nested tag in JSON

Time:04-09

I have the following JSON input in a text file, json.txt:

{
   "files":[
      {
         "id":49894894,
         "list":[
            {
               "name":"one",
               "animal_potato_carrot":{
                  "options":[
                     {
                        "id":4989,
                        "url":"https://example.com/text.txt"
                     },
                     {
                        "id":3994,
                        "url":"https://example.com/randomfile.json"
                     }
                  ]
               }
            },
            {
               "name":"two",
               "cat_dog_rabbit":{
                  "options":[
                     {
                        "id":4989,
                        "url":"https://example.com/text2.txt"
                     },
                     {
                        "id":3994,
                        "url":"https://example.com/randomfile.json"
                     }
                  ]
               }
            },
            {
               "name":"three",
               "animal_potato_carrot":{
                  "options":[
                     {
                        "id":4989,
                        "url":"https://example.com/text3.txt"
                     },
                     {
                        "id":3994,
                        "url":"https://example.com/randomfile.json"
                     }
                  ]
               }
            }
         ]
      }
   ]
}

I want to get only the first url in the options list of each animal_potato_carrot nested tag, ignoring other tags such as cat_dog_rabbit.

So my output should be two urls (the first one in each of the animal_potato_carrot blocks):

["https://example.com/text.txt", "https://example.com/text3.txt"]

I tried jq -c '.. |."animal_potato_carrot"? | select(. != null)' json.txt but that returns the entire contents of each animal_potato_carrot object, not just the first url.

CodePudding user response:

One way:

$ jq --arg id animal_potato_carrot '[ .files[] | .list[] | select(has($id)) | .[$id].options[0].url ]' input.json
[
  "https://example.com/text.txt",
  "https://example.com/text3.txt"
]

Iterate over the objects in each list, filter to just the ones that have the key you want, and then extract the first url from each object's options list.
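
To see the stages separately, here is a small sketch that first prints the names of the entries that survive the has($id) filter and then the first url of each. The sample data is a trimmed copy of the question's input, and the temporary file and the variable names (input, names, urls) are only illustrative:

```shell
# Trimmed-down copy of the question's input, written to a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
{"files":[{"id":49894894,"list":[
  {"name":"one","animal_potato_carrot":{"options":[{"id":4989,"url":"https://example.com/text.txt"},{"id":3994,"url":"https://example.com/randomfile.json"}]}},
  {"name":"two","cat_dog_rabbit":{"options":[{"id":4989,"url":"https://example.com/text2.txt"}]}},
  {"name":"three","animal_potato_carrot":{"options":[{"id":4989,"url":"https://example.com/text3.txt"}]}}
]}]}
EOF

# Stages 1 and 2: walk files[] and list[], keep only entries that have the key.
names=$(jq -c --arg id animal_potato_carrot \
  '[ .files[] | .list[] | select(has($id)) | .name ]' "$input")
echo "$names"

# Stage 3: from each kept entry, take the url of the first option.
urls=$(jq -c --arg id animal_potato_carrot \
  '[ .files[] | .list[] | select(has($id)) | .[$id].options[0].url ]' "$input")
echo "$urls"

rm -f "$input"
```

The entry named "two" is dropped by select(has($id)) because it only carries cat_dog_rabbit, so it never reaches the .options[0].url step.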

CodePudding user response:

Continue from what you already have and add the missing filters (take .options[0].url and wrap the whole thing in an array):

jq -c '[..|.animal_potato_carrot?|select(. != null)|.options[0].url]' json.txt
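
One difference from the explicit-path approach is worth knowing: .. visits every value in the document, so this filter picks up animal_potato_carrot at any depth, not only inside files[].list[]. A rough sketch of that, using a made-up document in which a hypothetical top-level "archive" key (not in the question's data) also contains the tag:

```shell
# Made-up input: the tag appears under files[].list[] and under a
# hypothetical "archive" key at the top level.
doc='{"files":[{"list":[{"name":"one","animal_potato_carrot":{"options":[{"url":"https://example.com/a.txt"}]}}]}],"archive":{"animal_potato_carrot":{"options":[{"url":"https://example.com/b.txt"}]}}}'

# Recursive descent: matches the key wherever it occurs.
recursive=$(printf '%s' "$doc" |
  jq -c '[..|.animal_potato_carrot?|select(. != null)|.options[0].url]')
echo "$recursive"

# Explicit path: only looks inside files[].list[].
explicit=$(printf '%s' "$doc" |
  jq -c '[.files[].list[]|select(has("animal_potato_carrot"))|.animal_potato_carrot.options[0].url]')
echo "$explicit"
```

For the input in the question both forms give the same result, so the choice mostly comes down to whether matching the key at other depths is wanted.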