How to select multiple items with duplicate items using jq?

I have a JSON file which contains a list like this:

[{
    "host": "cat",
    "ip": "192.168.1.1",
    "id": "cherry"
}, {
    "host": "dog",
    "ip": "192.168.1.1",
    "id": "apple"
}, {
    "host": "cat",
    "ip": "192.168.1.2",
    "id": "banana"
}]

I want to collect the IPs and print each one with its host and id next to it. If an IP is used multiple times, the host and id of every item with that IP should be appended to the same line instead of starting a new line. ip and host can be the same for multiple items, but id is unique.

So the final output should look like this:

$ echo <something>
192.168.1.1 cat cherry dog apple
192.168.1.2 cat banana

How can I do this using bash and jq?

CodePudding user response:

Make sure you have a valid JSON file: remove the trailing comma in each object so that your input.json looks like this:

[{
    "host": "cat",
    "ip": "192.168.1.1",
    "id": "cherry"
}, {
    "host": "dog",
    "ip": "192.168.1.1",
    "id": "apple"
}, {
    "host": "cat",
    "ip": "192.168.1.2",
    "id": "banana"
}]

Then, you only need one jq call:

jq --raw-output 'group_by(.ip)[] | [first.ip, (.[] | .host, .id)] | join(" ")' input.json

CodePudding user response:

Once you fix your example so it's valid JSON, the group_by function is the key:

$ jq -r 'group_by(.ip)[] | [.[0].ip, map(.host, .id)[]] | @tsv' input.json
192.168.1.1     cat     cherry  dog     apple
192.168.1.2     cat     banana

group_by(.ip) combines all objects with the same ip field into one array of objects per IP. The rest is just turning those arrays of objects into arrays of only the values you want, and finally outputting each new array as a line of tab-separated values. If you want single spaces as in the question rather than tabs, use join(" ") in place of @tsv.
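
To make the grouping step easier to follow, here is a minimal sketch that runs the same filter in three stages and prints each intermediate result. It assumes jq is installed and uses the sample data from the question; the temporary file and the variable names (input, grouped, flattened, lines) are just illustrative choices, not part of either answer.

```shell
# Sample data from the question, written to a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
[
  {"host": "cat", "ip": "192.168.1.1", "id": "cherry"},
  {"host": "dog", "ip": "192.168.1.1", "id": "apple"},
  {"host": "cat", "ip": "192.168.1.2", "id": "banana"}
]
EOF

# Stage 1: group_by(.ip) returns an array of arrays,
# one inner array per distinct ip value.
grouped=$(jq -c 'group_by(.ip)' "$input")
printf '%s\n' "$grouped"

# Stage 2: for each group, emit the ip once, followed by the
# host and id of every object in that group.
flattened=$(jq -c 'group_by(.ip)[] | [.[0].ip, map(.host, .id)[]]' "$input")
printf '%s\n' "$flattened"

# Stage 3: join each array into one space-separated line.
lines=$(jq -r 'group_by(.ip)[] | [.[0].ip, map(.host, .id)[]] | join(" ")' "$input")
printf '%s\n' "$lines"

rm -f "$input"
```

The last stage prints the two lines asked for in the question, "192.168.1.1 cat cherry dog apple" and "192.168.1.2 cat banana". Swapping join(" ") for @tsv gives the tab-separated variant instead.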