jq - remove non-matching fields in "object-array-with-objects"

Time:12-14

Given the following JSON-object:

{
  "meta": {
    "data1": {
      "keep": { "key": "value" }
    }
  },
  "detail": {
    "data2": [
      {
        "keep1": "keep1value",
        "keep2": "keep2value",
        "nokeep1": "abc"
      }
    ],
    "data3": [
      {
        "keep1": "keep1value",
        "keep2": "keep2value",
        "nokeep2": { "abc": "def" }
      }
    ]
  },
  "drop" : "this"
}

I'm trying to clean it by removing unwanted fields such as "drop", "nokeep1" and "nokeep2". The objects in the "data2" and "data3" arrays may contain other unwanted fields besides the "nokeepX" examples, but they will always contain "keep1" and "keep2", which I want to keep.

My desired output is the following JSON:

{
    "meta": { "data1": { "keep": { "key": "value" } } },
    "detail": {
        "data2": [
            {
                "keep1": "keep1value",
                "keep2": "keep2value"
            }
        ],
        "data3": [
            {
                "keep1": "keep1value",
                "keep2": "keep2value"
            }
        ]
    }
}

I've managed to remove the "drop" field with this query:
jq 'def pick($paths): . as $root | reduce ($paths[] | [.] | flatten(1)) as $path ({}; . | setpath($path; $root | getpath($path))); pick([["meta"], ["detail", "data2"], ["detail", "data3"]])'

However I've been struggling to figure out how to remove the "nokeepX" fields - is it possible to accomplish this?

CodePudding user response:

Just provide all the concrete paths to del:

del(
  .detail.data2[0].nokeep1,
  .detail.data3[0].nokeep2,
  .drop
)

Or generalize by e.g. traversing all array items (not just the first) using [] without indices:

del(
  .detail.data2[].nokeep1,
  .detail.data3[].nokeep2,
  .drop
)

Or go arbitrarily deep using .., and just provide the deepest field names:

del(.. | objects | .nokeep1, .nokeep2, .drop)

Output:

{
  "meta": {
    "data1": {
      "keep": {
        "key": "value"
      }
    }
  },
  "detail": {
    "data2": [
      {
        "keep1": "keep1value",
        "keep2": "keep2value"
      }
    ],
    "data3": [
      {
        "keep1": "keep1value",
        "keep2": "keep2value"
      }
    ]
  }
}
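
As a quick check, the recursive variant can be run from a shell against the sample input from the question. This is a minimal sketch that assumes jq is installed; the variable names input and result are only illustrative and not part of the original answer.

```shell
# Sample input from the question, kept in a shell variable for this sketch.
input='{"meta":{"data1":{"keep":{"key":"value"}}},"detail":{"data2":[{"keep1":"keep1value","keep2":"keep2value","nokeep1":"abc"}],"data3":[{"keep1":"keep1value","keep2":"keep2value","nokeep2":{"abc":"def"}}]},"drop":"this"}'

# Visit every object in the document and delete the named keys wherever they occur.
result=$(printf '%s' "$input" | jq 'del(.. | objects | .nokeep1, .nokeep2, .drop)')

printf '%s\n' "$result"
```

Because the filter matches on key names only, it would also remove a field called "drop" nested anywhere else in the document, which may or may not be what you want.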

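For the other way round, you could list all the leaf paths using paths(scalars), select those whose last key .[-1] is not one of the names you want to keep, and pass them to delpaths for removal. Since only the key directly above each scalar is tested, this works best on flat objects whose values are scalars: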

delpaths([paths(scalars) | select(
  .[-1] | IN("keep", "keep1", "keep2") | not
)])
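
Running this allow-list filter on the sample input from the question shows how it treats nested values. The sketch below assumes jq 1.6 or later (for IN) and uses illustrative variable names; it only checks what the filter does and is not a fix.

```shell
# Sample input from the question, kept in a shell variable for this sketch.
input='{"meta":{"data1":{"keep":{"key":"value"}}},"detail":{"data2":[{"keep1":"keep1value","keep2":"keep2value","nokeep1":"abc"}],"data3":[{"keep1":"keep1value","keep2":"keep2value","nokeep2":{"abc":"def"}}]},"drop":"this"}'

# Delete every scalar leaf whose own key is not in the allow-list.
result=$(printf '%s' "$input" | jq '
  delpaths([paths(scalars) | select(
    .[-1] | IN("keep", "keep1", "keep2") | not
  )])
')

printf '%s\n' "$result"
```

Here "key" under "keep" is removed because its own name is not in the list, and "nokeep2" survives as an empty object because only its scalar child "abc" is deleted. For input shaped like this, the del variants are likely the safer choice.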

CodePudding user response:

If you have only a limited set of properties, it could be easier not to remove unwanted fields, but create the output from the required fields only:

{
    meta,
    detail: .detail | {
        data2: .data2 | map({ keep1, keep2 }),
        data3: .data3 | map({ keep1, keep2 })
    }
}

Output:

{
  "meta": {
    "data1": {
      "keep": {
        "key": "value"
      }
    }
  },
  "detail": {
    "data2": [
      {
        "keep1": "keep1value",
        "keep2": "keep2value"
      }
    ],
    "data3": [
      {
        "keep1": "keep1value",
        "keep2": "keep2value"
      }
    ]
  }
}

The approach can be combined with dropping certain fields:

{
    meta,
    detail: .detail | {
        data2: .data2 | map(del(.nokeep1)),
        data3: .data3 | map(del(.nokeep2))
    }
}

producing the same output as above.
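
If every value under .detail is an array of objects shaped like data2 and data3 (an assumption, not something the question guarantees), the two near-identical lines can be folded into a single map_values call. A minimal sketch, with illustrative variable names:

```shell
# Sample input from the question, kept in a shell variable for this sketch.
input='{"meta":{"data1":{"keep":{"key":"value"}}},"detail":{"data2":[{"keep1":"keep1value","keep2":"keep2value","nokeep1":"abc"}],"data3":[{"keep1":"keep1value","keep2":"keep2value","nokeep2":{"abc":"def"}}]},"drop":"this"}'

# For each array under .detail, rebuild every element from keep1 and keep2 only,
# then drop the unwanted top-level field.
result=$(printf '%s' "$input" | jq '
  .detail |= map_values(map({ keep1, keep2 }))
  | del(.drop)
')

printf '%s\n' "$result"
```

Note that { keep1, keep2 } emits null for a key that is missing from an element, so this form relies on both keys always being present, as stated in the question.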
