JQ write each object to subdirectory file

Time:08-20

I'm new to jq (around 24 hours). I'm getting the filtering/selection already, but I'm wondering about advanced I/O features. Let's say I have an existing jq query that works fine, producing a stream (not a list) of objects. That is, if I pipe them to a file, it produces:

{
  "id": "foo",
  "value": "123"
}
{
  "id": "bar",
  "value": "456"
}

Is there some fancy expression I can add to my jq query to output each object individually in a subdirectory, keyed by the id, in the form id/id.json? For example current-directory/foo/foo.json and current-directory/bar/bar.json?

CodePudding user response:

As @pmf has pointed out, an "only-jq" solution is not possible. A solution using jq and awk is as follows, though it is far from robust:

<input.json jq -rc '.id, .' | awk '
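  # Lines alternate: first the raw id, then the compact object.
  id=="" {id=$0; next;}
  { path=id; gsub(/[/]/, "_", path);   # a slash in the id would escape the target directory
    system("mkdir -p " path);          # path is unquoted: ids with spaces or shell metacharacters will break
    file=path "/" path ".json";        # use the sanitized name for the file as well
    print >> file;
    close(file);                       # avoid running out of open file handles
    id="";
  }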
'
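
A possible variation, for anyone who would rather avoid passing the id through system(): the same pairing of lines can be consumed by a plain shell loop. This is only a sketch. The function name write_objects is made up for illustration, and the choice to replace every character outside A-Z, a-z, 0-9, dot, underscore and hyphen with an underscore is an assumption, not something the question asks for.

```shell
# Hypothetical helper: reads alternating lines from stdin (raw id, then the
# compact object) and writes each object to NAME/NAME.json, where NAME is
# the id with unsafe characters replaced by underscores.
write_objects() {
  while IFS= read -r id && IFS= read -r obj; do
    # Assumption: keep only a conservative set of file name characters.
    name=$(printf '%s' "$id" | tr -c 'A-Za-z0-9._-' '_')
    # Guard against names that are empty or refer to a directory itself.
    case $name in ''|.|..) name=_ ;; esac
    mkdir -p -- "$name"
    # Overwrites an existing file; use >> instead to append on duplicate ids.
    printf '%s\n' "$obj" > "$name/$name.json"
  done
}
```

It would be fed the same way as the awk program, e.g. jq -rc '.id, .' input.json | write_objects. Like the awk version, it assumes the ids are strings without embedded newlines.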

CodePudding user response:

As you will need help from outside jq anyway (see @peak's answer using awk), you also might want to consider using another JSON processor instead which offers more I/O features. One that comes to my mind is mikefarah/yq, a jq-inspired processor for YAML, JSON, and other formats. It can split documents into multiple files, and since its v4.27.2 release it also supports reading multiple JSON documents from a single input source.

$ yq -p=json -o=json input.json -s '.id'

$ cat foo.json
{
  "id": "foo",
  "value": "123"
}

$ cat bar.json
{
  "id": "bar",
  "value": "456"
}

The argument following -s is the expression evaluated to produce each output file's name, .id in this case (the .json suffix is added automatically). It can be adapted as needed, e.g. -s '"file_with_id_" + .id'. However, adding slashes will not create subdirectories, so that (comparatively easy) step is left for post-processing in the shell.
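
That post-processing could look roughly like the following. It is a minimal sketch that assumes the split files sit directly in one directory and are named NAME.json; the function name split_into_subdirs is hypothetical.

```shell
# Hypothetical helper: move each NAME.json found directly in directory $1
# (default: current directory) to NAME/NAME.json inside that directory.
split_into_subdirs() {
  dir=${1:-.}
  for f in "$dir"/*.json; do
    [ -f "$f" ] || continue            # skip if the glob matched nothing
    name=$(basename "$f" .json)
    mkdir -p -- "$dir/$name"
    mv -- "$f" "$dir/$name/$name.json"
  done
}
```

Note that it moves every .json file in the directory, so input.json would be moved too if it lives there. Writing the split files into a separate directory first and calling, say, split_into_subdirs out (with out being whatever directory holds them) avoids that.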
