Home > other >  AWK script- Not showing data
AWK script- Not showing data

Time:05-22

I'm trying to create a variable to sum columns 26 to 30 and 32. SO far I have this code which prints me the hearder and the output format like I want but no data is being shown.

#! /usr/bin/awk -f

BEGIN { FS="," }
NR>1 {
    TotalPositiveStats= ($26 $27 $28 $29 $30 $32)
    }
        {printf "%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%.2f %,%s,%s,%.2f %,%s,%s,%.2f %,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s, %s\n",
EndYear,Rk,G,Date,Years,Days,Age,Tm,Home,Opp,Win,Diff,GS,MP,FG,FGA,FG_PCT,3P,3PA,3P_PCT,FT,FTA,FT_PCT,ORB,DRB,TRB,AST,STL,BLK,TOV,PF,PTS,GmSc,TotalPositiveStats
        }

NR==1   {
    print "EndYear,Rk,G,Date,Years,Days,Age,Tm,HOme,Opp,Win,Diff,GS,MP,FG,FGA,FG_PCT,3P,3PA,3P_PCT,FT,FTA,FT_PCT,ORB,DRB,TRB,AST,STL,BLK,TOV,PF,PTS,GmSc,TotalPositiveStats" }#header

Input data:

EndYear,Rk,G,Date,Years,Days,Age,Tm,Home,Opp,Win,Diff,GS,MP,FG,FGA,FG_PCT,3P,3PA,3P_PCT,FT,FTA,FT_PCT,ORB,DRB,TRB,AST,STL,BLK,TOV,PF,PTS,GmSc
1985,1,1,10/26/1984,21,252,21.6899384,CHI,1,WSB,1,16,1,40,5,16,0.313,0,0,,6,7,0.857,1,5,6,7,2,4,5,2,16,12.5
1985,2,2,10/27/1984,21,253,21.69267625,CHI,0,MIL,0,-2,1,34,8,13,0.615,0,0,,5,5,1,3,2,5,5,2,1,3,4,21,19.4
1985,3,3,10/29/1984,21,255,21.69815195,CHI,1,MIL,1,6,1,34,13,24,0.542,0,0,,11,13,0.846,2,2,4,5,6,2,3,4,37,32.9
1985,4,4,10/30/1984,21,256,21.7008898,CHI,0,KCK,1,5,1,36,8,21,0.381,0,0,,9,9,1,2,2,4,5,3,1,6,5,25,14.7
1985,5,5,11/1/1984,21,258,21.7063655,CHI,0,DEN,0,-16,1,33,7,15,0.467,0,0,,3,4,0.75,3,2,5,5,1,1,2,4,17,13.2
1985,6,6,11/7/1984,21,264,21.72279261,CHI,0,DET,1,4,1,27,9,19,0.474,0,0,,7,9,0.778,1,3,4,3,3,1,5,5,25,14.9
1985,7,7,11/8/1984,21,265,21.72553046,CHI,0,NYK,1,15,1,33,15,22,0.682,0,0,,3,4,0.75,4,4,8,5,3,2,5,2,33,29.3

Output expected:

EndYear,Rk,G,Date,Years,Days,Age,Tm,Home,Opp,Win,Diff,GS,MP,FG,FGA,FG_PCT,3P,3PA,3P_PCT,FT,FTA,FT_PCT,ORB,DRB,TRB,AST,STL,BLK,TOV,PF,PTS,GmSc,TotalPositiveStats
1985,1,1,10/26/1984,21,252,21.6899384,CHI,1,WSB,1,16,1,40,5,16,0.313,0,0,,6,7,0.857,1,5,6,7,2,4,5,2,16,12.5,35
1985,2,2,10/27/1984,21,253,21.69267625,CHI,0,MIL,0,-2,1,34,8,13,0.615,0,0,,5,5,1,3,2,5,5,2,1,3,4,21,19.4,34
1985,3,3,10/29/1984,21,255,21.69815195,CHI,1,MIL,1,6,1,34,13,24,0.542,0,0,,11,13,0.846,2,2,4,5,6,2,3,4,37,32.9,54
1985,4,4,10/30/1984,21,256,21.7008898,CHI,0,KCK,1,5,1,36,8,21,0.381,0,0,,9,9,1,2,2,4,5,3,1,6,5,25,14.7,38
1985,5,5,11/1/1984,21,258,21.7063655,CHI,0,DEN,0,-16,1,33,7,15,0.467,0,0,,3,4,0.75,3,2,5,5,1,1,2,4,17,13.2,29
1985,6,6,11/7/1984,21,264,21.72279261,CHI,0,DET,1,4,1,27,9,19,0.474,0,0,,7,9,0.778,1,3,4,3,3,1,5,5,25,14.9,36
1985,7,7,11/8/1984,21,265,21.72553046,CHI,0,NYK,1,15,1,33,15,22,0.682,0,0,,3,4,0.75,4,4,8,5,3,2,5,2,33,29.3,51

This script will be called like gawk -f script.awk <filename>.

Currently when calling this is the output (It seems to be calculating the variable but the rest of fields are empty) enter image description here

CodePudding user response:

awk is well suited to summing columns:

awk 'NR>1{$(NF 1)=$26 $27 $28 $29 $30 $32}1' FS=, OFS=, input-file > tmp
mv tmp input-file

That doesn't add a field in the header line, so you might want something like:

awk '{$(NF 1) = NR>1 ? ($26 $27 $28 $29 $30 $32) : "TotalPositiveStats"}1' FS=, OFS=,

CodePudding user response:

The current printf ...

{printf "%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%.2f %,%s,%s,%.2f %,%s,%s,%.2f %,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s, %s\n",
EndYear,Rk,G,Date,Years,Days,Age,Tm,Home,Opp,Win,Diff,GS,MP,FG,FGA,FG_PCT,3P,3PA,3P_PCT,FT,FTA,FT_PCT,ORB,DRB,TRB,AST,STL,BLK,TOV,PF,PTS,GmSc,TotalPositiveStats}

... is referencing (awk) variables that have not been defined (eg, EndYear, Rk, G). [NOTE: one exeception is the very last variable in the list - TotalPositiveStats - which has in fact been defined earlier in the script.]

The default value for undefined variables is the empty string ("") or zero (0), depending on how the awk code is referencing the variable, eg:

  • printf "%s", EndYear => EndYear is treated as a string and the printed result is an empty string; with an output field delimiter of a comma (,) this empty strings shows up as 2 commas next to each other (,,)
  • printf "%.2f %", FG_PCT => FG_PCT is treated as a numeric (because of the %f format) and the printed result is 0.00 %

Where it gets a little interesting is when the (undefined) variable name starts with a numeric (eg, 3P) in which case the P is ignored and the entire reference is treated as a number, eg:

  • printf "%s", 3P => 3P is processed as 3 and the printed result is 3

This should explain the 5 static values (0.00 %, 3, 3, 3.00 % and 0.00 %) printed in all output lines as well as the 'missing' values between the rest of the commas (eg, ,,,,).

Obviously the last value in the line is an actual number, ie, the value of the awk variable TotalPositiveStats.


If the intent is to reformat the input values then you want to replace all of the undefined variable references with their actual field references (eg, replace EndYear with $1).

  • Related