Adding double quotes around non-numeric columns by awk

Time:05-18

I have a file like this:

2018-01-02;1.5;abcd;111
2018-01-04;2.75;efgh;222
2018-01-07;5.25;lmno;333
2018-01-09;1.25;prs;444

I'd like to add double quotes around the non-numeric columns, so the new file should look like this:

"2018-01-02";1.5;"abcd";111
"2018-01-04";2.75;"efgh";222
"2018-01-07";5.25;"lmno";333
"2018-01-09";1.25;"prs";444

This is what I have tried so far; I know it is not the correct way:

head myfile.csv -n 4 | awk 'BEGIN{FS=OFS=";"} {gsub($1,echo $1 ,$1)} 1' | awk 'BEGIN{FS=OFS=";"} {gsub($3,echo "\"" $3 "\"",$3)} 1' 

Thanks in advance.

CodePudding user response:

You may use this awk, which sets ; as the input/output delimiter and then wraps each field in double quotes if that field is non-numeric:

awk '
BEGIN {
   FS = OFS = ";"
}
{
   for (i=1; i<=NF; ++i)
      $i = ($i+0 == $i ? $i : "\"" $i "\"")
} 1' file

"2018-01-02";1.5;"abcd";111
"2018-01-04";2.75;"efgh";222
"2018-01-07";5.25;"lmno";333
"2018-01-09";1.25;"prs";444

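To see why the $i+0 == $i test separates the two kinds of column, the sketch below prints how each field of the first sample line is classified. The helper name classify_fields is made up for illustration and is not part of the answer. A field that looks like a number is compared numerically with its own value, so the test is true; for any other field, $i+0 is turned into a string such as "2018" or "0" and compared as text, so the test is false.

```shell
# classify_fields: hypothetical helper, not part of the original answer.
# Reads ';'-separated lines on stdin and reports, per field, whether the
# "$i+0 == $i" test treats it as numeric.
classify_fields() {
    awk -F';' '{
        for (i = 1; i <= NF; ++i)
            print $i " -> " ($i+0 == $i ? "numeric" : "non-numeric")
    }'
}

printf '%s\n' '2018-01-02;1.5;abcd;111' | classify_fields
```

This should report 2018-01-02 and abcd as non-numeric, and 1.5 and 111 as numeric.
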
Alternative gnu-awk solution:

awk -v RS='[;\n]' '$0+0 != $0 {$0 = "\"" $0 "\""} {ORS=RT} 1' file
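
The one-liner depends on two gawk extensions: RS may be a regular expression, and RT holds the text that actually matched RS for the current record. Setting ORS=RT before the implicit print puts back whichever separator (; or newline) followed the record. The small sketch below makes this visible; show_records is a hypothetical helper name, and gawk is assumed to be installed.

```shell
# show_records: hypothetical helper, for illustration only (requires gawk).
# Splits stdin on ';' or newline and prints each record together with the
# terminator that gawk stored in RT (a newline is shown as \n).
show_records() {
    gawk -v RS='[;\n]' '{
        printf "record=<%s> RT=<%s>\n", $0, (RT == "\n" ? "\\n" : RT)
    }'
}

# Example:
#   printf '%s\n' '2018-01-02;1.5' | show_records
```

For the input 2018-01-02;1.5 this should report the record 2018-01-02 with RT=<;> and the record 1.5 with RT=<\n>.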

CodePudding user response:

Using GNU awk and typeof(). As the gawk manual puts it: "Fields ... that are numeric strings have the strnum attribute. Otherwise, they have the string attribute."

$ gawk 'BEGIN {
    FS=OFS=";"
}
{
    for(i=1;i<=NF;i++)
        if(typeof($i)=="string")
            $i=sprintf("\"%s\"",$i)
}1' file

Some output:

"2018-01-02";1.5;"abcd";111
...
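
To check what typeof() reports for each column before relying on it, here is a quick sketch. The name field_types is a hypothetical helper, and typeof() needs gawk 4.2 or later.

```shell
# field_types: hypothetical helper, for illustration only (requires gawk 4.2+).
# Prints every ';'-separated field of stdin with the type gawk assigns to it.
field_types() {
    gawk -F';' '{
        for (i = 1; i <= NF; i++)
            print $i " -> " typeof($i)
    }'
}

# Example:
#   printf '%s\n' '2018-01-02;1.5;abcd;111' | field_types
```

With the sample line, the date and abcd should come out as string, and 1.5 and 111 as strnum, which is why only the first and third columns get quoted.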

Edit:

If some of the fields are already quoted:

$ gawk 'BEGIN {
    FS=OFS=";"
}
{
    for(i=1;i<=NF;i++)
        if(typeof($i)=="string")
            gsub(/^"?|"?$/,"\"",$i)   # add a quote at each end, or replace an existing one
}1' <<< 'string;123;"quoted string"'

Output:

"string";123;"quoted string"