I have an awk array that aggregates bytes uploaded and downloaded. I can sort the output by either bytes down or bytes up and pipe that to head for the top talkers; is it possible to output two sorts using different keys?
zgrep ^1 20211014T00*.gz|awk '{print$3,$11,$6,$(NF-7)}'| awk 'NR>1{bytesDown[$1 " " $2] =$3;bytesUp[$1 " " $2] =$4} END {for(i in bytesDown) print bytesDown[i], bytesUp[i], i}'|sort -rn|head
Rather than parsing the source again to get the top uploads, I would like to be able to output the array again to "sort -rnk2|head".
I can see how I'd do it with a scratch file but is it possible/desirable to do it in memory? It's a bash shell on a 2 CPU Linux VM with 4GB of memory.
CodePudding user response:
Bash allows you to do that with process substitutions. It's not clear what you expect it to do with the data; printing both results to standard output is unlikely to be useful, so I send each to a separate file for later inspection.
zgrep ^1 20211014T00*.gz |
awk '{print$3,$11,$6,$(NF-7)}' |
awk 'NR>1{bytesDown[$1 " " $2] =$3;bytesUp[$1 " " $2] =$4}
END {for(i in bytesDown) print bytesDown[i], bytesUp[i], i}' |
tee >(sort -rn | head >first) |
sort -rnk2 | head >second
The two Awk invocations could easily be refactored into a single Awk script, something like this:
awk 'NR>1{bytesDown[$3 " " $11] =$6;bytesUp[$3 " " $11] =$(NF-7)}
END { for(i in bytesDown) print bytesDown[i], bytesUp[i], i }'
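One caveat with the process substitution above: Bash does not wait for it to finish, so `first` may still be being written when the pipeline returns. If that matters, or if you would rather not use files at all, here is a minimal sketch that keeps the summary in a shell variable and sorts it twice. The `summarize` function and its rows are made-up stand-ins for the real zgrep/awk pipeline, and I'm assuming the aggregated output (one line per key) is small enough to hold comfortably in memory.

```shell
# Stand-in for the zgrep | awk pipeline: rows of "bytesDown bytesUp client host".
# The function name and the values below are hypothetical sample data.
summarize() {
  printf '%s\n' \
    '500 20 10.0.0.1 a.example' \
    '30 900 10.0.0.2 b.example' \
    '200 100 10.0.0.3 c.example'
}

# Run the expensive part once and keep the result in memory.
summary=$(summarize)

# Sort the same data twice, once per key (numeric, descending).
top_down=$(printf '%s\n' "$summary" | sort -k1,1nr | head -n 2)
top_up=$(printf '%s\n' "$summary" | sort -k2,2nr | head -n 2)

printf 'Top by bytes down:\n%s\n' "$top_down"
printf 'Top by bytes up:\n%s\n' "$top_up"
```

Replace the body of `summarize` with the real pipeline and change `head -n 2` to plain `head` for the top 10.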
CodePudding user response:
Your question isn't clear and there's no sample input/output to test with but this MAY be what you're trying to do:
zgrep '^1' 20211014T00*.gz |
awk '
    NR > 1 {
        key = $3 " " $11
        bytesDown[key] = $6
        bytesUp[key] = $(NF-7)
    }
    END {
        # top 10 by bytes down (column 1)
        cmd = "sort -rn | head"
        for ( key in bytesDown ) {
            print bytesDown[key], bytesUp[key], key | cmd
        }
        close(cmd)
        # top 10 by bytes up (column 2)
        cmd = "sort -rnk2 | head"
        for ( key in bytesDown ) {
            print bytesDown[key], bytesUp[key], key | cmd
        }
        close(cmd)
    }
'
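If you only need the single top entry for each column rather than the top 10 that head gives you, you can skip sort entirely and track the maxima while reading the input, which is more efficient: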
zgrep '^1' 20211014T00*.gz |
awk '
    NR > 1 {
        key = $3 " " $11
        bytesdown[key] = $6
        bytesup[key] = $(NF-7)
        if ( NR == 2 ) {
            max_bytesdown_key = key
            max_bytesup_key = key
        }
        else {
            if ( bytesdown[key] > bytesdown[max_bytesdown_key] ) {
                max_bytesdown_key = key
            }
            if ( bytesup[key] > bytesup[max_bytesup_key] ) {
                max_bytesup_key = key
            }
        }
    }
    END {
        print bytesdown[max_bytesdown_key], bytesup[max_bytesdown_key], max_bytesdown_key
        print bytesdown[max_bytesup_key], bytesup[max_bytesup_key], max_bytesup_key
    }
'
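To try the `print ... | cmd` pattern from the first script without the original logs, here is a small self-contained sketch. The input layout (`client host bytesDown bytesUp`), the `top_both` name, the sample rows, and the sed labels are all my assumptions for illustration, not taken from your data. The key point is that `close(cmd)` makes awk wait for the first sort to finish before the second one starts, so the two lists come out in order.

```shell
# Hypothetical helper: reads "client host bytesDown bytesUp" rows on stdin
# and prints the top 2 rows by bytes down, then the top 2 by bytes up.
top_both() {
  awk '
    { key = $1 " " $2; bytesDown[key] = $3; bytesUp[key] = $4 }
    END {
      # First pass: numeric descending sort on column 1; sed labels each row.
      cmd = "sort -k1,1nr | head -n 2 | sed \"s/^/down: /\""
      for (key in bytesDown) print bytesDown[key], bytesUp[key], key | cmd
      close(cmd)   # wait for this pipeline to finish before the next one

      # Second pass: same rows, sorted on column 2 instead.
      cmd = "sort -k2,2nr | head -n 2 | sed \"s/^/up: /\""
      for (key in bytesDown) print bytesDown[key], bytesUp[key], key | cmd
      close(cmd)
    }
  '
}

# Made-up sample rows.
printf '%s\n' 'c1 h1 500 20' 'c2 h2 30 900' 'c3 h3 200 100' | top_both
# down: 500 20 c1 h1
# down: 200 100 c3 h3
# up: 30 900 c2 h2
# up: 200 100 c3 h3
```

Because the arrays stay in awk's memory, the source is parsed only once and no scratch file is needed.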