JQ get unique objects and arrays based on key value

Time:11-19

Is there a way to return a unique object/array when duplicates exist? Here's what I'm trying to do.

I have a payload like this:

{
  "data": [
    {
      "account": "12xUoMKwf12ABjNx4VCvYcNkX79gW1kzz2JnBLxkFbjswRczRvM",
      "amount": 7885016,
      "block": 470788,
      "gateway": "113kQU96zqePySTahB7PEde9ZpoWK76DYK1f57wyhjhXCBoAu88",
      "hash": "DTU1GGfR0eU15hv6KiV_bg6FOJXfUWz4TjIq1H7TGy4",
      "timestamp": "2020-08-28T01:29:46.000000Z"
    }
  ]
}
{
  "data": [
    {
      "account": "12xUoMKwf12ABjNx4VCvYcNkX79gW1kzz2JnBLxkFbjswRczRvM",
      "amount": 7885016,
      "block": 470788,
      "gateway": "113kQU96zqePySTahB7PEde9ZpoWK76DYK1f57wyhjhXCBoAu88",
      "hash": "DTU1GGfR0eU15hv6KiV_bg6FOJXfUWz4TjIq1H7TGy4",
      "timestamp": "2020-08-28T01:29:46.000000Z"
    }
  ]
}
{
  "data": [
    {
      "account": "12xUoMKwf12ABjNx4VCvYcNkX79gW1kzz2JnBLxkFbjswRczRvM",
      "amount": 8623955,
      "block": 470509,
      "gateway": "113kQU96zqePySTahB7PEde9ZpoWK76DYK1f57wyhjhXCBoAu88",
      "hash": "5fQJY9MprH9b3IstVU1SdfBteUWoF_sdsVuiARPBtTY",
      "timestamp": "2020-08-27T19:01:48.000000Z"
    }
  ]
}

As you can see, the first two payloads are identical and the last one is unique. I need to get the unique objects and then sum their .amount values when their timestamps fall within a certain time period. Here's what I have so far:

jq --arg this "$(date +%Y-%m-%dT%H:%M:%S)" '.data[] | select(.timestamp >= $this) | .amount'

This gives me the amounts so I can sum them, but the output also contains the duplicates. What I would like to do is get the objects that are unique by their .hash. The idea is to sum the amounts that fall within the given date range.

Thanks in advance

CodePudding user response:

"What I would like to do is get the objects that are unique by their .hash"

One way to remove the duplicates would be to use unique_by/1 in conjunction with the -s command-line option.

Assuming you want all the items in all the .data arrays you could start your pipeline with:

jq -s 'map(.data[]) | unique_by(.hash) ...' 
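
To make the effect of -s plus unique_by concrete, here is a minimal sketch that feeds three documents to jq on stdin, trimmed to the hash and amount fields from the question. The variable name unique_hashes is only illustrative. Note that unique_by also sorts its result by the key, so the output order follows the hash values rather than the input order.

```shell
# Three input documents; the first two share the same .hash (values taken from the question).
unique_hashes=$(printf '%s\n' \
  '{"data":[{"hash":"DTU1GGfR0eU15hv6KiV_bg6FOJXfUWz4TjIq1H7TGy4","amount":7885016}]}' \
  '{"data":[{"hash":"DTU1GGfR0eU15hv6KiV_bg6FOJXfUWz4TjIq1H7TGy4","amount":7885016}]}' \
  '{"data":[{"hash":"5fQJY9MprH9b3IstVU1SdfBteUWoF_sdsVuiARPBtTY","amount":8623955}]}' |
  # -s slurps all documents into one array; map(.data[]) flattens the .data arrays;
  # unique_by(.hash) keeps one object per distinct hash.
  jq -s -c 'map(.data[]) | unique_by(.hash) | map(.hash)')

echo "$unique_hashes"
# ["5fQJY9MprH9b3IstVU1SdfBteUWoF_sdsVuiARPBtTY","DTU1GGfR0eU15hv6KiV_bg6FOJXfUWz4TjIq1H7TGy4"]
```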

However, since you are really only interested in the .timestamp and .amount fields, it would be more efficient to proceed along the following lines:

jq -s --arg this "$(date +%Y-%m-%dT%H:%M:%S)" '
  map(.data[] | select(.timestamp >= $this) | {hash, amount})
  | unique_by(.hash)[]
  | .amount
' input.json
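
The pipeline above prints one amount per unique hash. If the goal is a single total, one possible extension is to keep the deduplicated array and reduce it with add. The sketch below assumes a fixed cutoff (the variable cutoff and its value are hypothetical, chosen so that both sample records qualify) and reads the question's three payloads from stdin instead of input.json.

```shell
# Hypothetical fixed cutoff; in practice it could come from: date +%Y-%m-%dT%H:%M:%S
cutoff='2020-08-27T00:00:00'

total=$(printf '%s\n' \
  '{"data":[{"hash":"DTU1GGfR0eU15hv6KiV_bg6FOJXfUWz4TjIq1H7TGy4","amount":7885016,"timestamp":"2020-08-28T01:29:46.000000Z"}]}' \
  '{"data":[{"hash":"DTU1GGfR0eU15hv6KiV_bg6FOJXfUWz4TjIq1H7TGy4","amount":7885016,"timestamp":"2020-08-28T01:29:46.000000Z"}]}' \
  '{"data":[{"hash":"5fQJY9MprH9b3IstVU1SdfBteUWoF_sdsVuiARPBtTY","amount":8623955,"timestamp":"2020-08-27T19:01:48.000000Z"}]}' |
  jq -s --arg this "$cutoff" '
    # Keep records at or after the cutoff, projected to the two fields needed.
    map(.data[] | select(.timestamp >= $this) | {hash, amount})
    | unique_by(.hash)   # one object per distinct hash
    | map(.amount)
    | add // 0           # add yields null for an empty array, so fall back to 0
  ')

echo "$total"   # 16508971 (7885016 + 8623955, the duplicate counted once)
```

The comparison .timestamp >= $this is a plain string comparison, which orders these ISO 8601 timestamps correctly only as long as both sides use the same layout and time zone. The sample timestamps end in Z (UTC), whereas date prints local time by default, so date -u may be the safer choice for the cutoff.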

