Sed/awk for String to integer conversion of a csv column in shell

Time:10-28

I need the 7th column of a CSV file converted from a float-formatted string to an integer. It's a huge file and I don't want to use a while read loop for the conversion. Are there any shortcuts with awk?

Input:

"xx","x","xxxxxx","xxx","xx","xx"," 00000001.0000"  
"xx","x","xxxxxx","xxx","xx","xx"," 00000002.0000"  
"xx","x","xxxxxx","xxx","xx","xx"," 00000005.0000"  
"xx","x","xxxxxx","xxx","xx","xx"," 00000011.0000"  

Output:

"xx","x","xxxxxx","xxx","xx","xx","1"  
"xx","x","xxxxxx","xxx","xx","xx","2"  
"xx","x","xxxxxx","xxx","xx","xx","5"   
"xx","x","xxxxxx","xxx","xx","xx","11" 

I tried the following and it works, but is there anything simpler?

awk 'BEGIN {FS=OFS="\",\""} {$7 = sprintf("%.0f", $7)} 1' $test > $test1
awk '{printf("%s\"\n", $0)}' $test1

CodePudding user response:

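With your shown samples, please try the following awk program.

awk -F',' -v s1="\"" -v OFS="," '{gsub(/"/, "", $NF); $NF = s1 ($NF + 0) s1} 1' Input_file

Explanation: The input and output field separators are both set to a comma. In the main block, the quotes are stripped from the last field of each line, 0 is added to force numeric conversion (which drops the leading zeros and the trailing .0000), and the result is wrapped in double quotes again. Assigning to $NF rebuilds the line with OFS, and the trailing 1 prints every line, edited or not.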

CodePudding user response:

awk 'BEGIN{FS=OFS=","} {gsub(/"/, "", $7); $7="\"" ($7+0) "\""; print}' file

Output:

"xx","x","xxxxxx","xxx","xx","xx","1"
"xx","x","xxxxxx","xxx","xx","xx","2"
"xx","x","xxxxxx","xxx","xx","xx","5"
"xx","x","xxxxxx","xxx","xx","xx","11"

gsub(/"/, "", $7): removes all " from $7

$7+0: adding 0 forces numeric conversion, which reduces the number in $7 to its minimal representation (1 instead of 00000001.0000)
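To see what adding 0 does on its own, the sketch below feeds a few strings through awk. The helper names to_number and to_int are made up for this illustration. awk converts the leading numeric prefix of a string (after skipping leading blanks) and ignores whatever follows, so a value that still starts with a quote converts to 0, and a value with a non-zero fraction keeps that fraction unless int() is applied.

```shell
# to_number: hypothetical helper; prints each input line after adding 0.
to_number() {
    awk '{ print $0 + 0 }'
}

# to_int: same idea, but int() truncates any fractional part.
to_int() {
    awk '{ print int($0 + 0) }'
}

printf '%s\n' ' 00000001.0000' '00000011.0000  ' | to_number   # prints 1 and 11
printf '%s\n' '"00000003.0000"' | to_number                    # prints 0: the leading quote blocks conversion
printf '%s\n' '00000002.5000' | to_number                      # prints 2.5
printf '%s\n' '00000002.5000' | to_int                         # prints 2
```

This is why the quotes have to be removed (or kept out of the field by the choice of FS) before the addition.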

CodePudding user response:

Another simple awk solution:

awk 'BEGIN {FS=OFS="\",\""} {$NF = $NF+0 "\""} 1' file

"xx","x","xxxxxx","xxx","xx","xx","1"
"xx","x","xxxxxx","xxx","xx","xx","2"
"xx","x","xxxxxx","xxx","xx","xx","5"
"xx","x","xxxxxx","xxx","xx","xx","11"
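If the numeric column is not always the last one, the same idea can be wrapped in a small shell function that takes the column number as an argument. This is only a sketch, not part of the answers above: the name normalize_col and its arguments are made up, and it assumes, as the sample data does, that no field contains an embedded comma and that the target column holds a plain number. It uses sprintf("%.0f", ...) as in the question, so values with a fractional part are rounded rather than left as they are.

```shell
# normalize_col COL [FILE...]: hypothetical helper.
# Rewrites column COL of a simple quoted CSV as a quoted integer.
# Assumes no field contains an embedded comma.
normalize_col() {
    col=$1
    shift
    awk -v c="$col" 'BEGIN { FS = OFS = "," }
    {
        gsub(/"/, "", $c)                      # drop the surrounding quotes
        $c = "\"" sprintf("%.0f", $c) "\""     # round to an integer, re-quote
        print
    }' "$@"
}

# Usage: reads stdin when no file is given.
printf '%s\n' '"xx","x"," 00000011.0000"' | normalize_col 3
```

For the file in the question, the call would be normalize_col 7 "$test" > "$test1".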