How to average a repeating interval of a row with Awk/Bash

Time:11-08

I have a text file that lists the average sunspot count for every month from 1749 to 2005.

 (* Month: 1749 01 *) 58
 (* Month: 1749 02 *) 63
 (* Month: 1749 03 *) 70
 (* Month: 1749 04 *) 56
 (* Month: 1749 05 *) 85
 (* Month: 1749 06 *) 84
 (* Month: 1749 07 *) 95
 (* Month: 1749 08 *) 66
 (* Month: 1749 09 *) 76
 (* Month: 1749 10 *) 76
 (* Month: 1749 11 *) 159
 (* Month: 1749 12 *) 85
 (* Month: 1750 01 *) 73
 (* Month: 1750 02 *) 76
 (* Month: 1750 03 *) 89
 (* Month: 1750 04 *) 88
 Etc.

I need to average the 12 months for each year, so 1749 should come out to about 81. Averaging the whole $6 column with awk seems simple enough:

awk ' {sum += $6} 
END { print sum / NR } ' sunspot.txt

However, I don't know where to start with Awk's control structures to average each group of 12 values separately, one group per year from 1749 to 2005.

CodePudding user response:

Here's one way:

awk '{a[$3] += $6; b[$3] += 1} END{for (i in a) print i, a[i]/b[i]}' years.txt | sort -n

For illustration, the runs below average first by month ($4), then by year ($3). Both rely on awk's built-in associative arrays: the "a" array accumulates the sum of $6 for each key, and the "b" array counts the lines seen for that key. In the END block, each sum is divided by its count to give the average. Because "for (i in a)" visits keys in no guaranteed order, the output is piped through sort -n.

$ awk '{a[$4] += $6; b[$4] += 1} END{for (i in a) print i, a[i]/b[i]}' years.txt | sort -n
01 65.5
02 69.5
03 79.5
04 72
05 85
06 84
07 95
08 66
09 76
10 76
11 159
12 85

$ awk '{a[$3] += $6; b[$3] += 1} END{for (i in a) print i, a[i]/b[i]}' years.txt | sort -n
1749 81.0833
1750 81.5
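
Since the question asks about control structures, here is a rough sketch of another way to do it, not part of the answer above. It assumes the file is already ordered by year, as the sample is, so it can print each average as soon as the year in $3 changes, with no arrays and no sort. The file name sample_sunspots.txt and the variable names (year, sum, n, result) are made up for the example, and the %.2f format is just one choice.

```shell
# Hypothetical sample file holding the 16 lines quoted in the question.
cat > sample_sunspots.txt <<'EOF'
 (* Month: 1749 01 *) 58
 (* Month: 1749 02 *) 63
 (* Month: 1749 03 *) 70
 (* Month: 1749 04 *) 56
 (* Month: 1749 05 *) 85
 (* Month: 1749 06 *) 84
 (* Month: 1749 07 *) 95
 (* Month: 1749 08 *) 66
 (* Month: 1749 09 *) 76
 (* Month: 1749 10 *) 76
 (* Month: 1749 11 *) 159
 (* Month: 1749 12 *) 85
 (* Month: 1750 01 *) 73
 (* Month: 1750 02 *) 76
 (* Month: 1750 03 *) 89
 (* Month: 1750 04 *) 88
EOF

# Assumes the input is grouped by year; otherwise use the array version.
result=$(awk '
    # The year changed: print the finished year, then reset the totals.
    $3 != year {
        if (n > 0) printf "%s %.2f\n", year, sum / n
        year = $3; sum = 0; n = 0
    }
    # Every line: add the value in $6 to the running total for this year.
    { sum += $6; n++ }
    # End of input: print the last year, which no later line triggers.
    END { if (n > 0) printf "%s %.2f\n", year, sum / n }
' sample_sunspots.txt)

echo "$result"
```

On the sample this prints 1749 81.08 and 1750 81.50 (1750 only has four months here). Changing the format to %.0f would round to whole numbers, which is closer to the 81 the question expects for 1749.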
