I have a text file that contains the timing of an experiment, as follows:
>
0.000
0.010
0.020
0.030
0.040
0.050
>
0.000
0.010
0.020
0.030
>
0.000
0.010
0.020
0.030
0.040
0.050
I want to equalize the blocks. The maximum value should be found across the whole input, the increment between consecutive values should be calculated automatically from the input, and each block should then be extended in steps of that increment until it reaches the maximum. The expected output is:
>
0.000
0.010
0.020
0.030
0.040
0.050
>
0.000
0.010
0.020
0.030
0.040
0.050
>
0.000
0.010
0.020
0.030
0.040
0.050
I tried the code given below, but it pads the second block by repeating its last line, which does not match the expected output:
>
0.000
0.010
0.020
0.030
0.040
0.050
>
0.000
0.010
0.020
0.030
0.030
0.030
>
0.000
0.010
0.020
0.030
0.040
0.050
My script
awk '$0==">" {
    if (c && c>max)
        max = c
    n++
    c = 0
    next
}
{
    r[n][++c] = $0
}
END {
    for (i=1; i<=n; i++) {
        print ">"
        for (j=1; j<=(max>c?max:c); j++){
            print (r[i][j] == "" ? prev : r[i][j])
            prev=r[i][j]==""?prev:r[i][j]
        }
    }
}' input
This code repeats the last line of the second block instead of continuing the sequence, so I am making a mistake somewhere. I hope someone can help me fix this. Thanks in advance.
CodePudding user response:
If this isn't all you need:
$ cat tst.awk
NR==FNR {
    if ( !/>/ && ((max == "") || ($1 > max)) ) {
        max = $1
    }
    if ( (prev == prev+0) && ($1 == $1+0) ) {
        step = $1 - prev
    }
    prev = $1
    next
}
/>/ {
    if ( FNR > 1 ) {
        prt()
    }
}
{
    print
    prev = $1
}
END { prt() }
function prt(   i) {
    for ( i=prev+step; i<=max; i+=step ) {
        printf "%.03f\n", i
    }
}
$ awk -f tst.awk file file
>
0.000
0.010
0.020
0.030
0.040
0.050
>
0.000
0.010
0.020
0.030
0.040
0.050
>
0.000
0.010
0.020
0.030
0.040
0.050
then edit your question to clarify your requirements and provide more truly representative sample input/output including cases the above doesn't work for.
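A loop of the form `i += step` adds decimal fractions repeatedly, and with some step sizes the accumulated rounding error could push the last value slightly past `max` and drop a row. The following is only a sketch of one way to guard against that, not a replacement for the script above: it counts the missing rows as an integer first. The names `pad_blocks` and `fill` are hypothetical, and the sketch assumes every block is non-empty, starts at the same value, and uses the same positive increment.

```shell
# Hypothetical helper: pad every ">" block up to the global maximum.
# The file is read twice, as in the two-pass script above.
pad_blocks() {
    awk '
    # Pass 1: record the global maximum and the first positive increment.
    NR == FNR {
        if ($0 == ">") { have_prev = 0; next }
        if (max == "" || $1 + 0 > max) { max = $1 + 0 }
        if (have_prev && step == "" && $1 - prev > 0) { step = $1 - prev }
        prev = $1 + 0
        have_prev = 1
        next
    }
    # Pass 2: before each later header, and at the end, fill the previous block.
    $0 == ">" { if (FNR > 1) { fill() }; print; next }
    { print; last = $1 + 0 }
    END { fill() }

    function fill(   k, missing) {
        if (step == "") { return }          # no increment could be derived
        # Round to an integer row count so float error cannot drop a row.
        missing = int((max - last) / step + 0.5)
        for (k = 1; k <= missing; k++) {
            printf "%.3f\n", last + k * step
        }
    }
    ' "$1" "$1"
}
```

Calling `pad_blocks file` should then print the padded blocks to standard output. The `%.3f` format is an assumption taken from the sample data and would need adjusting for values with a different precision.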
CodePudding user response:
Assumptions/Understandings:
- all blocks start with the same value (eg, 0.000)
- all values are incremented by the same amount (eg, 0.010)
- all blocks are to be expanded to include all values
- net objective is to display the same block N times (where N is the number of blocks)
One awk idea requiring a single pass of the input file:
awk '
/^>/          { blkcnt++; next }
!($1 in seen) { seen[$1]; arr[++n]=$1 }
END           { for (i=1;i<=blkcnt;i++) {
                    print ">"
                    for (j=1;j<=n;j++)
                        print arr[j]
                }
              }
' input
This generates:
>
0.000
0.010
0.020
0.030
0.040
0.050
>
0.000
0.010
0.020
0.030
0.040
0.050
>
0.000
0.010
0.020
0.030
0.040
0.050
NOTE:
- I'm guessing there's more to OP's real life input file (eg, each line contains additional textual data following the timing value) in which case this solution will not suffice
- if this is the case then OP should update the question with a more representative set of sample data
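Since the single-pass idea depends on the assumptions listed earlier in this answer, it may be worth checking them against the real file first. The following is a rough sketch only; `check_blocks` is a hypothetical name, the `1e-9` tolerance is an arbitrary choice, and the first field of each row is assumed to be the timing value.

```shell
# Hypothetical helper: report rows that break the assumptions that all
# blocks start at the same value and use one constant increment.
check_blocks() {
    awk '
    $0 == ">" { in_block = 0; next }        # header: the next row starts a block
    !in_block {
        in_block = 1
        prev = $1 + 0
        if (first == "") {
            first = $1 + 0                  # remember the first start value
        } else if ($1 + 0 != first) {
            print "line " NR ": block starts at " $1
            bad = 1
        }
        next
    }
    {
        d = $1 - prev
        if (step == "") {
            step = d                        # remember the first increment
        } else if (d - step > 1e-9 || step - d > 1e-9) {
            print "line " NR ": unexpected increment " d
            bad = 1
        }
        prev = $1 + 0
    }
    END { exit bad }                        # non-zero status if anything was reported
    ' "$1"
}
```

Running `check_blocks input` prints one message per offending line and returns a non-zero exit status if any assumption is violated, so it can be chained in front of the awk command above.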