Awk average of column by moving difference of grouping column variable

Time:10-09

I have a file that looks like this:

1 snp1 0.0 4
1 snp2 0.2 6
1 snp3 0.3 4
1 snp4 0.4 3
1 snp5 0.5 5
1 snp6 0.6 6
1 snp7 1.3 5
1 snp8 1.3 3
1 snp9 1.9 4

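The file is sorted by column 3. I want the average of the 4th column, grouped by column 3 in steps of 0.5 units: each group starts at a row and takes in the following rows whose column 3 value is no more than 0.5 above that starting value. For example, the output should look like this: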

1 snp1 0.0 4.4
1 snp6 0.6 6.0
1 snp7 1.3 4.0
1 snp9 1.9 4.0

I can print all positions without the average like this:

awk 'NR==1 {pos=$3; print $0} $3>=pos+0.5{pos=$3; print $0}' input

But I am not able to figure out how to print the average of the 4th column. It would be great if someone could help me find a solution to this problem. Thanks!

CodePudding user response:

Something like this, maybe:

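awk '
  # First row: start the first group (n = row count, s = sum of column 4)
  NR==1 {c1=$1; c2=$2; v=$3; n=1; s=$4; next}
  # Row more than 0.5 past the group start: print the finished group, start a new one
  $3>v+0.5 {print c1, c2, v, s/n; c1=$1; c2=$2; v=$3; n=1; s=$4; next}
  # Otherwise the row belongs to the current group
  {n+=1; s+=$4}
  # Print the last group
  END {print c1, c2, v, s/n}
' input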
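One detail: plain print shows whole-number averages as 6 and 4, while the desired output in the question has 6.0 and 4.0. If that matters, a printf variant along these lines might help. This is only a sketch: the file name sample_input.txt, the result variable, and the empty-input guard in END are my own additions for illustration, not something from the question.

```shell
# Recreate the sample data from the question in a scratch file
# (sample_input.txt is a hypothetical name).
cat > sample_input.txt <<'EOF'
1 snp1 0.0 4
1 snp2 0.2 6
1 snp3 0.3 4
1 snp4 0.4 3
1 snp5 0.5 5
1 snp6 0.6 6
1 snp7 1.3 5
1 snp8 1.3 3
1 snp9 1.9 4
EOF

# Same grouping rule as before; printf with %.1f keeps one decimal place.
# The "if (n)" guard (an addition) avoids a division by zero on empty input.
result=$(awk '
  NR==1    {c1=$1; c2=$2; v=$3; n=1; s=$4; next}
  $3>v+0.5 {printf "%s %s %s %.1f\n", c1, c2, v, s/n; c1=$1; c2=$2; v=$3; n=1; s=$4; next}
           {n+=1; s+=$4}
  END      {if (n) printf "%s %s %s %.1f\n", c1, c2, v, s/n}
' sample_input.txt)

printf '%s\n' "$result"
```

With the sample data this should print the four lines shown in the question (4.4, 6.0, 4.0, 4.0 in the last column). Note that the test is a strict "greater than", so the row at 0.5 stays in the first group; that is what makes the first average 4.4.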