Need to the calculate % change between consecutive values in second column of a data file using awk
.
The data file looks like -
-bash-4.2$ cat coun11.csv
16-SEP-21,4218596
17-SEP-21,3747178
Since the number of rows will remain fixed, i.e., 2, I don't want to use while
with read
to loop over the file, instead I want to use a single line statement awk
, if possible.
The more i read on it in relation with awk
, it seemed possible but didn't get the desired result.
Answer should 11.2%
(can be negative) i.e., rounded off by two decimal places for which we can use OFMT
i believe.
% diff should always be calculated with ((oldvalue-newvalue)/oldvalue)*100
, wherein oldvalue refers to first row and newvalue to second row, which is evident from dates in first column.
One of the used commands awk 'BEGIN { NF==1?a=$2:b=$2}{c=(((a-b)/a)*100)} END {OFMT="%f";print c}' coun11.csv
, which didn't work
CodePudding user response:
OP's current code:
awk 'BEGIN { NF==1?a=$2:b=$2}{c=(((a-b)/a)*100)} END {OFMT="%f";print c}' coun11.csv
A few problems:
BEGIN {...}
processing is performed before the file is opened/read so ...NF=0
within theBEGIN {...}
block so ...a
andb
are undefined coming out of theBEGIN {...}
block (NOTE: inawk
undefined variables have a default value of 0)NF
refers to the number of fields in the current line but OP is looking to process specific lines 1 & 2 so ...- I'm guessing OP wants to use the
FNR
(aka line number of current record) instead - for each line read from the file
awk
tries to applyc=(((a-b)/a)*100)
, but sincea==b==0
OP should be getting an error (eg,... fatal: division by zero attempted
) - the calculation should only be performed once and can either be performed as part of processing the 2nd line (
FNR==2
) or simply moved to theEND {...}
block OFMT="%f"
is not going to generate the desired output format of11.2%
; while OP has mentioned 2 decimal places the expected output only shows 1 decimal place; we'll look at formats to generate both; after that OP is free to experiment withOFMT
as needed
One idea for some edits (and assuming OPs calculation is correct):
awk -F',' '
FNR==1 { a=$2 }
FNR==2 { b=$2 }
END { c=(((a-b)/a)*100)
printf "%.1f%\n", c # 1 decimal place
}
' coun11.csv
# or as a single line
awk -F',' 'FNR==1{a=$2}FNR==2{b=$2} END{c=(((a-b)/a)*100); printf "%.1f%\n", c}' coun11.csv
# using ternary for the test/assignment:
awk -F',' '
{ (FNR==1) ? (a=$2) : (b=$2) } # if FNR > 2 b will be overwritten with latest field #2
END { c=(((a-b)/a)*100)
printf "%.1f%\n", c
}
' coun11.csv
# or as a single line
awk -F',' '{(FNR==1)?(a=$2):(b=$2)} END{c=(((a-b)/a)*100); printf "%.1f%\n", c}' coun11.csv
All of these generate:
11.2%
For 2 decimal places we want printf "%.2f%\n",c
which generates:
11.17%
CodePudding user response:
I think your calculation is a bit off. I would do:
printf "foo %d\n" 4218596 3747178 4218596 |
awk 'NR>1 {printf "%.1f\n", ($2/d) * 100 - 100} {d=$2}'