JQ: Remove json objects from array

Time:09-30

I have this JSON file with hundreds of entries, and I need to strip out the data I do not need. Snippet:

{
  "entries": [
    {
      "metadata": {
        "tags": [
        ]
      },
      "sys": {
        "space": {
          "sys": {
            "type": "Link",
            "linkType": "Space",
            "id": "9kn72w8zc6fh"
          }
        },
        "id": "vcLKKhJ3mZNfGMvVZZi07",
        "type": "Entry",
        "createdAt": "2021-05-20T15:14:01.358Z",
        "updatedAt": "2021-09-20T15:28:30.799Z",
        "environment": {
          "sys": {
            "id": "production",
            "type": "Link",
            "linkType": "Environment"
          }
        },
        "publishedVersion": 47,
        "publishedAt": "2021-09-20T15:28:30.799Z",
        "firstPublishedAt": "2021-05-25T10:26:56.722Z",
        "createdBy": {
          "sys": {
            "type": "Link",
            "linkType": "User",
            "id": "6F84RwUIY9cXNNXBoQemqX"
          }
        },
        "updatedBy": {
          "sys": {
            "type": "Link",
            "linkType": "User",
            "id": "6F84RwUIY9cXNNXBoQemqX"
          }
        },
        "publishedCounter": 4,
        "version": 48,
        "publishedBy": {
          "sys": {
            "type": "Link",
            "linkType": "User",
            "id": "6F84RwUIY9cXNNXBoQemqX"
          }
        },
        "contentType": {
          "sys": {
            "type": "Link",
            "linkType": "ContentType",
            "id": "page"
          }
        }
      },
      "fields": {
        "title": {
          "de-DE": "Startseite",
          "en-US": "Home"
        },
        "description": {
          "en-US": "foo"
        },
        "keywords": {
          "en-US": "bar"
        },
        "stageModules": {
          "en-US": [
            {
              "sys": {
                "type": "Link",
                "linkType": "Entry",
                "id": "11AfBBuNK8bx3EygAS3WTY"
              }
            }
          ]
        },
        "contentModules": {
          "en-US": [
            {
              "sys": {
                "type": "Link",
                "linkType": "Entry",
                "id": "7uyuyIBsXWApHqpR7Pgkac"
              }
            },
            {
              "sys": {
                "type": "Link",
                "linkType": "Entry",
                "id": "4HILHPLjqQkP2H1hA2FeBG"
              }
            },
            {
              "sys": {
                "type": "Link",
                "linkType": "Entry",
                "id": "QuwRHL3XMSkguqrL1hUzC"
              }
            },
            {
              "sys": {
                "type": "Link",
                "linkType": "Entry",
                "id": "4ZyVef5oWhQWXK9V1lr3vz"
              }
            }
          ]
        },
        "layout": {
          "en-US": "Wide"
        }
      }
    }
  ]
}

From the entries array, I actually only need:

  • entries.sys.id
  • entries.sys.contentType.sys.id
  • entries.fields

I came up with:

jq \
  '.entries | .[] .sys, .[] .fields | del(.createdAt, .createdBy, .environment, .firstPublishedAt, .metadata, .publishedAt, .publishedBy, .publishedCounter, .publishedVersion, .space, .type, .updatedAt, .updatedBy, .version)' \
  $infile >| $outfile

However, this changes the structure of the document. The entries node is missing (due to the .entries filter):

{
  "id": "vcLKKhJ3mZNfGMvVZZi07",
  "contentType": {
    "sys": {
      "type": "Link",
      "linkType": "ContentType",
      "id": "page"
    }
  }
}
{
  "id": "1UgOmHIvsWrFEf1VCa84kz",
  "contentType": {
    "sys": {
      "type": "Link",
      "linkType": "ContentType",
      "id": "moduleText"
    }
  }
}
{
  "title": {
    "de-DE": "Startseite",
    "en-US": "Home"
  },
  "description": {
    "en-US": "Foo"
  },
  "keywords": {
    "en-US": "Bar"
  },
  "stageModules": {
    "en-US": [
      {
        "sys": {
          "type": "Link",
          "linkType": "Entry",
          "id": "11AfBBuNK8bx3EygAS3WTY"
        }
      }
    ]
  },
  "contentModules": {
    "en-US": [
      {
        "sys": {
          "type": "Link",
          "linkType": "Entry",
          "id": "7uyuyIBsXWApHqpR7Pgkac"
        }
      },
      {
        "sys": {
          "type": "Link",
          "linkType": "Entry",
          "id": "4HILHPLjqQkP2H1hA2FeBG"
        }
      },
      {
        "sys": {
          "type": "Link",
          "linkType": "Entry",
          "id": "QuwRHL3XMSkguqrL1hUzC"
        }
      },
      {
        "sys": {
          "type": "Link",
          "linkType": "Entry",
          "id": "4ZyVef5oWhQWXK9V1lr3vz"
        }
      }
    ]
  },
  "layout": {
    "en-US": "Wide"
  }
}

I have 2 questions:

  1. How can I delete deeper objects, e.g. .entries.sys.space.sys.linkType?
  2. How can I keep the .entries node in the outfile?

Thank you for your help.

CodePudding user response:

If you want full control over the output, I'd just re-create the desired format.

It sounds like you're trying to achieve the following format:

{
  "entries": [
    {
      "sys": {
        "id": ...
      },
      "contentType": {
        "sys": {
          "id": ...
        }
      },
      "fields": ...
    }
  ]
}

We can achieve this with the following jq filter:

.entries |= map({ "sys": { "id": .sys.id }, "contentType": { "sys": { "id": .sys.contentType.sys.id } }, fields })
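As a rough sketch of how this might be run, the script below applies the filter to a trimmed-down copy of the question's document. The sample data and the variable names (sample, rebuilt, pruned) are made up for illustration. It also touches on the first question: assuming you would rather delete keys than rebuild the object, del() takes a full path, so a nested key such as .entries[].sys.space.sys.linkType can be removed directly while the top-level entries key stays in place.

```shell
#!/bin/sh
# Hypothetical sample input: a shortened version of the document in the question.
sample=$(cat <<'EOF'
{
  "entries": [
    {
      "metadata": { "tags": [] },
      "sys": {
        "space": { "sys": { "type": "Link", "linkType": "Space", "id": "9kn72w8zc6fh" } },
        "id": "vcLKKhJ3mZNfGMvVZZi07",
        "type": "Entry",
        "contentType": { "sys": { "type": "Link", "linkType": "ContentType", "id": "page" } }
      },
      "fields": { "layout": { "en-US": "Wide" } }
    }
  ]
}
EOF
)

# Question 2: rebuild each entry while keeping the top-level "entries" key.
# `|=` updates .entries in place instead of replacing the whole document,
# and a bare `fields` inside an object is shorthand for "fields": .fields.
rebuilt=$(printf '%s' "$sample" | jq -c '.entries |= map({ "sys": { "id": .sys.id }, "contentType": { "sys": { "id": .sys.contentType.sys.id } }, fields })')
echo "$rebuilt"

# Question 1: del() accepts a path expression, so a deeply nested key can be
# removed from every entry without touching the rest of the document.
pruned=$(printf '%s' "$sample" | jq -c 'del(.entries[].sys.space.sys.linkType)')
echo "$pruned"
```

The rebuild approach is an allow-list (only the keys you name survive), while del() is a deny-list (everything survives except the paths you name), so which one fits better probably depends on how stable the input schema is.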