Home > Enterprise >  awk field-separator problem with last value
awk field-separator problem with last value

Time:08-07

I have a file "EURUSD.csv" file that contains:

timestamp,open,high,low,close
2022-08-04,1.01578,1.01979,1.01530,1.01892
2022-08-03,1.01636,1.02105,1.01210,1.01691
2022-08-02,1.02578,1.02939,1.01619,1.01625

I have a bash script:

#!/bin/bash

cat EURUSD.csv  | tail -n  2| while read line
 do
 DATA=`echo $line | awk -F, '{ print $1 }' `
 OPEN=`echo $line | awk -F, '{ print $2 }' `
 HIGH=`echo $line | awk -F, '{ print $3 }' `
 LOW=`echo $line | awk -F, '{ print $4 }' `
 CLOSE=`echo $line | awk -F, '{ print $5 }'  `

 echo "insert into forex ($DATA, $OPEN, $HIGH, $LOW, $CLOSE)"
 done

I have a problem with $5 CLOSE value, when used, the output is not like I aspect:

)nsert into forex (2022-08-04, 1.01578, 1.01979, 1.01530, 1.01892
)nsert into forex (2022-08-03, 1.01636, 1.02105, 1.01210, 1.01691
)nsert into forex (2022-08-02, 1.02578, 1.02939, 1.01619, 1.01625

If I try the script without CLOSE values, the output is OK:

insert into forex (2022-08-04, 1.01578, 1.01979, 1.01530)
insert into forex (2022-08-03, 1.01636, 1.02105, 1.01210)
insert into forex (2022-08-02, 1.02578, 1.02939, 1.01619)

What is the problem with $5 CLOSE value?

CodePudding user response:

With your shown samples please try following awk code.

awk '
BEGIN{
  FS=","
  OFS=", "
}
FNR==1{ next }
{
  sub(/\r/,"")
  $1=$1
  print "insert into forex (" $0 ")"
}
'  Input_file

Explanation: Adding detailed explanation for above code.

awk '              ##Starting awk program from here.
BEGIN{             ##Starting BEGIN section of this program from here.
  FS=","           ##Setting FS to comma here.
  OFS=", "         ##Setting OFS to comma space here.
}
FNR==1{ next }     ##If its first line then simply go to next line.
{
  sub(/\r/,"")     ##Using sub to substitute \r characters at last of line.
  $1=$1            ##Re-assigning 1st field to itself here.
  print "insert into forex (" $0 ")" ##Printing insert into forex ( followed by current line ) here.
}
'  Input_file      ##Mentioning Input_file name here.

CodePudding user response:

Don't use shell loop to read your CSV file. Better to use this awk that also removed DOS line break before creating SQL statements using CSV data ignoring header row:

awk '
BEGIN {FS=OFS=","}
NR > 1 {
   sub(/\r$/, "")
   printf "insert into forex (%s, %s, %s, %s, %s)\n", $1, $2, $3, $4, $5
}' file.csv

insert into forex (2022-08-04, 1.01578, 1.01979, 1.01530, 1.01892)
insert into forex (2022-08-03, 1.01636, 1.02105, 1.01210, 1.01691)
insert into forex (2022-08-02, 1.02578, 1.02939, 1.01619, 1.01625)

CodePudding user response:

Simple concatenation is all that you need, e.g.

awk 'FNR==1{next} {$0 = "insert into forex (" $0 ")"}1' EURUSD.csv

Output

insert into forex (2022-08-04,1.01578,1.01979,1.01530,1.01892)
insert into forex (2022-08-03,1.01636,1.02105,1.01210,1.01691)
insert into forex (2022-08-02,1.02578,1.02939,1.01619,1.01625)

If you need the additional space, just add a gsub() call, e.g.

awk 'FNR==1{next} {$0 = "insert into forex (" $0 ")"; gsub(/,/,", ")}1' EURUSD.csv

Output

insert into forex (2022-08-04, 1.01578, 1.01979, 1.01530, 1.01892)
insert into forex (2022-08-03, 1.01636, 1.02105, 1.01210, 1.01691)
insert into forex (2022-08-02, 1.02578, 1.02939, 1.01619, 1.01625)

CodePudding user response:

one single call to sub() per row is all u need by leveraging as many built-in variables as possible:

mawk 'sub(". ","insert into forex(&)",$!(NF*=!_<NR))' RS='\r?\n' FS=',' OFS=', ' 

|

insert into forex(2022-08-04, 1.01578, 1.01979, 1.01530, 1.01892)
insert into forex(2022-08-03, 1.01636, 1.02105, 1.01210, 1.01691)
insert into forex(2022-08-02, 1.02578, 1.02939, 1.01619, 1.01625)
  • Related