How to merge multiple JSON files and average values


I am trying to merge several JSON files that contain this kind of data:

#file1:
{
  "user1": {
    "server1": 7.2,
    "server2": 10.3
  },
  "user2": {
    "server1": 15
  }
}
#file2:
{
  "user1": {
    "server1": 8.5,
    "server3": 20.5
  },
  "user3": {
    "server1": 28
  }
}

And so on (12 files per hour).

The data is a list of users with their server usage in percent, refreshed every 5 minutes (a user may disappear and reappear depending on resource usage).

I am on Linux and generate these files with shell tools such as jq.

My goal is to obtain a merged file every hour containing the average load over all available data; the sum is divided by the number of files, so a value missing from a file counts as 0. Considering just the two files above, the result would look like this:

#result:
{
  "user1": {
    "server1": 7.85,
    "server2": 5.15,
    "server3": 10.25
  },
  "user2": {
    "server1": 7.5
  },
  "user3": {
    "server1": 14
  }
}

I tried several queries with reduce, but nothing works as expected.

Maybe the best approach would be to parse every file, extract the data, store each user in a separate file, and then merge all the data?

Thanks for your analysis and examples.

CodePudding user response:

Here's a solution for an arbitrary number of input files. It first divides each value by the number of input files, and then adds them all up retaining the keys:

jq -s '
  .[][][] /= length
  | reduce (.[] | to_entries[]) as $u ({};
      reduce ($u.value | to_entries[]) as $s (.;
        .[$u.key][$s.key] += $s.value   # += starts from null for missing keys; null + n is n in jq
      )
    )
' file*
{
  "user1": {
    "server1": 7.85,
    "server2": 5.15,
    "server3": 10.25
  },
  "user2": {
    "server1": 7.5
  },
  "user3": {
    "server1": 14
  }
}

Demo


The division can also be moved into the addition by capturing the file count in a variable at the start instead:

jq -s '
  length as $c
  | reduce (.[] | to_entries[]) as $u ({};
      reduce ($u.value | to_entries[]) as $s (.;
        .[$u.key][$s.key] += $s.value / $c
      )
    )
' file*

Demo
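
To produce the hourly merged file the question asks for, the same filter can simply be pointed at the twelve 5-minute snapshots and its output redirected. A minimal sketch, assuming the snapshots for the hour are named usage_*.json and the result should land in hourly_avg.json (both names hypothetical):

# Hypothetical file names; adjust the glob and output path to the real naming scheme.
jq -s '
  length as $c
  | reduce (.[] | to_entries[]) as $u ({};
      reduce ($u.value | to_entries[]) as $s (.;
        .[$u.key][$s.key] += $s.value / $c
      )
    )
' usage_*.json > hourly_avg.json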

CodePudding user response:

One approach:

jq -n \
   --argjson d1 "$(cat file1.json)" \
   --argjson d2 "$(cat file2.json)" \
'
   $d1 * $d2 |
   with_entries(
      .key as $user |
      .value |= (
         with_entries(
            .key as $server |
            .value |= ( $d1[ $user ][ $server ] + $d2[ $user ][ $server ] ) / 2
         )
      )
   )
'

Demo on jqplay

$d1 * $d2 creates the right structure, but with the wrong values. We then fix up the values.
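
For reference, with the two sample files the intermediate $d1 * $d2 merge should look like this ($d1 * $d2 merges the objects recursively, the right-hand operand winning on conflicting keys):

{
  "user1": {
    "server1": 8.5,
    "server2": 10.3,
    "server3": 20.5
  },
  "user2": {
    "server1": 15
  },
  "user3": {
    "server1": 28
  }
}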


Another approach:

jq -s '
   length as $n |
   reduce ( .[] | tostream | select( ( .[0] | length ) == 2 ) ) as $_ (
      { };
      .[ $_[0][0] ][ $_[0][1] ] += ( $_[1] // 0 ) / $n
   )
' file1.json file2.json

Demo on jqplay

This one works with more than two files!
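
To see why the filter keeps only paths of length 2: tostream turns each input into a stream of [path, value] events for the leaves, plus closing events that carry only a path. For file1 it emits roughly the following (the closing events have no value part, which is what the // 0 fallback guards against before dividing):

[["user1","server1"],7.2]
[["user1","server2"],10.3]
[["user1","server2"]]
[["user2","server1"],15]
[["user2","server1"]]
[["user2"]]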
