Reshape JSON File with JQ stripping commas from objects

Time:11-24

I'm building a ROKU app and am reformatting a JSON file pulled from an API so that it matches the format of the ROKU Direct Publisher feed.

This is what ROKU is expecting...

{
    "providerName": "Acme Productions",
    "lastUpdated": "2015-11-11T22:21:37+00:00",
    "language": "en",
    "categories": [
        ...
    ],
    "playlists": [
        ...
    ],
    "movies": [
        ...
    ],
    "liveFeeds": [
        ...
    ],
    "series": [
        ...
    ],
    "shortFormVideos":  [
        ...
    ],
    "tvSpecials": [
        ...
    ]
}

I'm using jq to reshape it and am having an issue.

My current JSON file looks basically like this and goes on and on (I've stripped most of it out, as the keys don't really matter for what I'm asking):

{
"page_info":{
    "total_results":1000,
    "results_per_page":50
},
"results":[
    {
        "category":"B-Roll",
        "aspect_ratio":"16:9",
        "duration":1851,
        "hd":true,
        "title":"Title",
        "id":"video:822667",
        "type":"video",
        "keywords":"removed",
        "credit":"Removed",
        "country":"United States",
        "city":"",
        "hls_url":"file"
    },
    {
        "category":"B-Roll",
        "aspect_ratio":"16:9",
        "duration":1851,
        "hd":true,
        "title":"Title",
        "id":"video:822667",
        "type":"video",
        "keywords":"removed",
        "credit":"Removed",
        "country":"United States",
        "city":"",
        "hls_url":"file"
    },
    {
         "category":"B-Roll",
        "aspect_ratio":"16:9",
        "duration":1851,
        "hd":true,
        "title":"Title",
        "id":"video:822667",
        "type":"video",
        "keywords":"removed",
        "credit":"Removed",
        "country":"United States",
        "city":"",
        "hls_url":"file"
    }
]}

This is my jq filter - .results[] | {"providerName":"CrozTest" } + {"language": "en-us"} + {"lastUpdated": .timestamp} + {"shortFormVideos": [{"title": .title, "thumbnail": .thumbnail, "longDescription": .short_description, "shortDescription": .short_description, "id": .id, "releaseDate": .timestamp, "genres": ["technology"], "tags": [.branch], "content": {"duration": .duration, "dateAdded": .timestamp, "videos": [{url: .hls_url, quality: "HD", videoType: "HLS", dateAdded: .publishdate,}]}}]}

When I run this, it drills down into .results[] and displays everything fine, but it removes the commas between the objects and adds my providerName, language, lastUpdated and shortFormVideos keys to every object. I need to keep the commas between the objects and show the provider/language/date fields only once, at the top of the file, as I continue to manipulate the objects into the format that ROKU wants.

This is what is displayed when I run my code...

{
  "providerName": "CrozTest",
  "language": "en-us",
  "lastUpdated": null,
  "shortFormVideos": [
    {
      "title": "Title",
      "thumbnail": null,
      "longDescription": null,
      "shortDescription": null,
      "id": "video:822667",
      "releaseDate": null,
      "genres": [
        "technology"
      ],
      "tags": [
        null
      ],
      "content": {
        "duration": 1851,
        "dateAdded": null,
        "videos": [
          {
            "url": "file",
            "quality": "HD",
            "videoType": "HLS",
            "dateAdded": null
          }
        ]
      }
    }
  ]
}
{
  "providerName": "CrozTest",
  "language": "en-us",
  "lastUpdated": null,
  "shortFormVideos": [
    {
      "title": "Title",
      "thumbnail": null,
      "longDescription": null,
      "shortDescription": null,
      "id": "video:822667",
      "releaseDate": null,
      "genres": [
        "technology"
      ],
      "tags": [
        null
      ],
      "content": {
        "duration": 1851,
        "dateAdded": null,
        "videos": [
          {
            "url": "file",
            "quality": "HD",
            "videoType": "HLS",
            "dateAdded": null
          }
        ]
      }
    }
  ]
}
{
  "providerName": "CrozTest",
  "language": "en-us",
  "lastUpdated": null,
  "shortFormVideos": [
    {
      "title": "Title",
      "thumbnail": null,
      "longDescription": null,
      "shortDescription": null,
      "id": "video:822667",
      "releaseDate": null,
      "genres": [
        "technology"
      ],
      "tags": [
        null
      ],
      "content": {
        "duration": 1851,
        "dateAdded": null,
        "videos": [
          {
            "url": "file",
            "quality": "HD",
            "videoType": "HLS",
            "dateAdded": null
          }
        ]
      }
    }
  ]
}

I'm only just starting to noodle around with jq, and this is what I'm trying to get...

      {
          "providerName": "CrozTest",
          "language": "en-us",
          "lastUpdated": "2021-11-21T19:24:03.750Z",
          "shortFormVideos": [
        {
                "category":"B-Roll",
                "aspect_ratio":"16:9",
                "duration":1851,
                "hd":true,
                "title":"Title",
                "id":"video:822667",
                "type":"video",
                "keywords":"removed",
                "credit":"Removed",
                "country":"United States",
                "city":"",
                "hls_url":"file",
          "id": "video:822412",
          "releaseDate": "2021-11-21T18:21:04.353Z",
          "genres": [
            "technology"
          ],
          "tags": [
            "tag"
          ],
          "content": {
            "duration": 160,
            "dateAdded": "2021-11-21T18:21:04.353Z",
            "videos": [
              {
                "url": "hls_url",
                "quality": "HD",
                "videoType": "HLS",
                "dateAdded": "2021-11-21T18:19:31Z"
              }
            ]
          }
        },
        {
                "category":"B-Roll",
                "aspect_ratio":"16:9",
                "duration":1851,
                "hd":true,
                "title":"Title",
                "id":"video:822667",
                "type":"video",
                "keywords":"removed",
                "credit":"Removed",
                "country":"United States",
                "city":"",
                "hls_url":"file",
          "id": "video:822412",
          "releaseDate": "2021-11-21T18:21:04.353Z",
          "genres": [
            "technology"
          ],
          "tags": [
            "tag"
          ],
          "content": {
            "duration": 160,
            "dateAdded": "2021-11-21T18:21:04.353Z",
            "videos": [
              {
                "url": "hls_url",
                "quality": "HD",
                "videoType": "HLS",
                "dateAdded": "2021-11-21T18:19:31Z"
              }
            ]
          }
        },
{
                "category":"B-Roll",
                "aspect_ratio":"16:9",
                "duration":1851,
                "hd":true,
                "title":"Title",
                "id":"video:822667",
                "type":"video",
                "keywords":"removed",
                "credit":"Removed",
                "country":"United States",
                "city":"",
                "hls_url":"file",
          "id": "video:822412",
          "releaseDate": "2021-11-21T18:21:04.353Z",
          "genres": [
            "technology"
          ],
          "tags": [
            "tag"
          ],
          "content": {
            "duration": 160,
            "dateAdded": "2021-11-21T18:21:04.353Z",
            "videos": [
              {
                "url": "hls_url",
                "quality": "HD",
                "videoType": "HLS",
                "dateAdded": "2021-11-21T18:19:31Z"
              }
            ]
          }
        } 
          ]
        }

CodePudding user response:

It is hard to test this since your input is missing almost all of its fields, but:

# header
{providerName: "CrozTest", language: "en-us", lastUpdated: ([.results[].timestamp] | max)} +
# the bit after the | is repeated for every element of results
{shortFormVideos: [.results[] |
  {title, thumbnail, id,
   longDescription: .short_description,
   shortDescription: .short_description,
   releaseDate: .timestamp,
   genres: ["technology"], 
   tags: [.branch],
   content:
    {duration,
     dateAdded: .timestamp, 
     videos: [{url: .hls_url,
               quality: "HD",
               videoType: "HLS",
               dateAdded: .publishdate}]}
}]}

To clarify what is wrong with your approach: .results[] | ... executes the filter once per element of results. The result of each filter is output as a separate JSON object – or as you call it – "stripping commas".
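To make the difference concrete, here is a small sketch with a made-up two-element input (the data and the shell variable names are only for illustration, not taken from the real feed). The first filter emits one object per element; wrapping the same expression in [...] collects those outputs into a single array, which is where the commas come from.

```shell
# Two-element stand-in for the API response (made-up data).
input='{"results":[{"title":"A"},{"title":"B"}]}'

# .results[] | {...} runs the object constructor once per element,
# so jq prints two separate top-level objects, one per line with -c.
streamed=$(printf '%s' "$input" | jq -c '.results[] | {title}')

# Wrapping the same expression in [...] collects the outputs into one array.
collected=$(printf '%s' "$input" | jq -c '[.results[] | {title}]')

printf '%s\n' "$streamed"
printf '%s\n' "$collected"
```

The second command prints a single JSON array with the two objects separated by a comma, while the first prints two independent JSON documents.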

My approach embeds the .results[] inside the resulting object. If you find it more readable, you can also do .results | map({...}) | {providerName: ......., shortFormVideos: .}
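A minimal sketch of that map variant, cut down to a few fields. The sample input is invented: in particular it assumes every result carries a timestamp string in one uniform ISO 8601 format, so that max on the strings picks the latest one. In the final object constructor, . is the already-mapped array.

```shell
# Made-up input; "timestamp" is assumed to exist on every result.
input='{"results":[
  {"title":"A","timestamp":"2021-11-21T18:21:04Z","hls_url":"a.m3u8"},
  {"title":"B","timestamp":"2021-11-21T19:24:03Z","hls_url":"b.m3u8"}]}'

feed=$(printf '%s' "$input" | jq -c '
  # Build the per-video objects first ...
  .results
  | map({title,
         releaseDate: .timestamp,
         content: {videos: [{url: .hls_url, quality: "HD", videoType: "HLS"}]}})
  # ... then wrap the resulting array in the single header object.
  | {providerName: "CrozTest",
     language: "en-us",
     lastUpdated: (map(.releaseDate) | max),
     shortFormVideos: .}
')

printf '%s\n' "$feed"
```

This produces one top-level object whose shortFormVideos array holds both videos, with the header fields appearing only once.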
