Remove duplicate JSON blocks from file using JQ

Time:07-07

I have a JSON file that contains thousands of entries, and I need to remove the duplicate blocks.

Here is an example of the file:

{ "signatures": [
   {
     "signatureId": 0050,
     "mode": 0
   },
   {
     "signatureId": 0012,
     "mode": 0
   },
   {
     "signatureId": 0012,
     "mode": 1
   }
]}

Here is the target result to achieve:

{ "signatures": [
   {
     "signatureId": 0050,
     "mode": 0
   },
   {
     "signatureId": 0012,
     "mode": 0
   }
]}

As you can see, the "mode" value doesn't matter. What matters is that no "signatureId" is duplicated, so when a whole block is removed, it doesn't matter which "mode" remains.

I can only use Shell and/or JQ.

CodePudding user response:

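Use unique_by with the field to be checked for duplicates as its argument. It always keeps the first item of each kind (here, the one with "mode": 0). Note that the result is sorted by that field and the IDs are printed without their leading zeros, so 12 comes before 50 in the output.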

jq '.signatures |= unique_by(.signatureId)'
{
  "signatures": [
    {
      "signatureId": 12,
      "mode": 0
    },
    {
      "signatureId": 50,
      "mode": 0
    }
  ]
}
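
If the original order of the entries should be kept, as in the question's target result where 50 comes before 12, a reduce-based filter is one possible approach. The following is only a sketch under some assumptions: the helper name dedupe_keep_order and the inline sample input are made up for illustration, and the IDs are written without leading zeros because strictly valid JSON does not allow them.

```shell
# Sample input mirroring the question's data (IDs written without leading
# zeros so that the input is strictly valid JSON).
input='{"signatures":[{"signatureId":50,"mode":0},{"signatureId":12,"mode":0},{"signatureId":12,"mode":1}]}'

# Hypothetical helper: keeps the first entry for each signatureId while
# preserving the original order. Seen IDs are tracked in an object keyed
# by the stringified ID.
dedupe_keep_order() {
  jq '.signatures |= (
        reduce .[] as $s ({seen: {}, out: []};
          ($s | .signatureId | tostring) as $k
          | if .seen[$k] then .
            else .seen[$k] = true | .out += [$s]
            end)
        | .out)'
}

# Run the filter on the sample and print the result (50 first, then 12).
result=$(printf '%s\n' "$input" | dedupe_keep_order)
printf '%s\n' "$result"
```

To apply this to a file, the helper can read from the file and write to a temporary file that is then moved over the original, since jq writes its result to standard output rather than editing files in place.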
