Merge every two rows into one and sum multiple entries

Time:01-24

I am struggling a bit with the output: I need to merge every second row with the row before it, sort the result, and add up the values of entries that appear more than once.

Sample input:

bittorrent_block(PCC)
127
default_384k(PCC)
28
default_384k(BWM)
28
bittorrent_block(PCC)
127
default_384k(PCC)
28
default_384k(BWM)
28

Convert 2nd row into Column (expected)

bittorrent_block(PCC): 127
default_384k(PCC): 28
default_384k(BWM): 28
bittorrent_block(PCC): 127
default_384k(PCC): 28
default_384k(BWM): 28

Sum all duplicate entries (expected)

bittorrent_block(PCC): 254
default_384k(PCC): 56
default_384k(BWM): 56

These are the pieces of code I tried, each followed by the output I actually get:

    zcat file.tar.gz | awk 'NR%2{v=$0;next;}{print $0,v}'
     bittorrent_block(PCC)
     default_384k(PCC)
     default_384k(BWM)
     default_mk1(PCC)
     default_mk1_10m(PCC)


zcat file.tar.gz | awk 'NR%2{ prev = $0; next }{ print prev, $0; }'
 127orrent_block(PCC)
 28ault_384k(PCC)
 28ault_384k(BWM)

Because of this, I am not able to sum up the duplicate values. Please help.

CodePudding user response:

I often find it easier to transform the input first and then process it. paste helps to convert consecutive lines into columns; then summing the numbers with awk becomes trivial:

$ <input paste -sd'\t\n' | awk '{sum[$1] += $2}END{for(s in sum) print s": "sum[s]}'
bittorrent_block(PCC): 254
default_384k(PCC): 56
default_384k(BWM): 56
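
To make the intermediate step visible, here is a minimal sketch of the paste stage on its own. The function name pairs_to_columns and the sample data are hypothetical, and the tr call assumes the input may carry CRLF line endings (it is harmless if it does not):

```shell
# Hypothetical helper: join each name line with the value line that follows it.
pairs_to_columns() {
    # tr removes carriage returns (assumption: the input may have CRLF endings).
    # paste -s joins all lines serially, alternating tab and newline as the
    # delimiter, so lines 1+2, 3+4, ... each end up on one row.
    tr -d '\r' | paste -sd'\t\n' -
}

# Made-up sample data: two name/value pairs with CRLF line endings.
printf 'a(PCC)\r\n127\r\nb(BWM)\r\n28\r\n' | pairs_to_columns
```

Each output row then holds a name and its value separated by a tab, which is the shape the awk summing step expects in $1 and $2. The explicit - operand makes paste read standard input on implementations that require a file argument.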

CodePudding user response:

It looks like your file has CRLF line endings, so you'll have to strip the carriage returns:

zcat file.tar.gz |
awk -F '\r' -v OFS=': ' '
    NR % 2 { id = $1; next }
    { sum[id] += $1 }
    END { for (id in sum) print id, sum[id] }
'
bittorrent_block(PCC): 254
default_384k(PCC): 56
default_384k(BWM): 56
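
The question also asks for sorted output, and awk does not guarantee any particular order for a for (id in sum) loop. The following is a sketch of one way to handle that, not part of the answer above: the function name sum_pairs is hypothetical, the carriage return is removed with sub() instead of -F '\r', and the result is piped through sort.

```shell
# Hypothetical helper: sum the value lines per name, print sorted by name.
sum_pairs() {
    awk '
        { sub(/\r$/, "") }            # drop a trailing CR, if present
        NR % 2 { id = $0; next }      # odd lines: remember the name
        { sum[id] += $0 }             # even lines: add the value to that name
        END { for (id in sum) print id ": " sum[id] }
    ' | LC_ALL=C sort                 # byte-wise sort for a stable order
}

# Usage with the file from the question:
#   zcat file.tar.gz | sum_pairs
```

Stripping the CR inside awk keeps the function usable on files with either LF or CRLF endings. LC_ALL=C is only there to make the ordering independent of the current locale.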