Trim extra spaces in AWK

Time:02-17

I have this AWK script.

awk -v line="    foo    bar  " 'END {
   gsub(/^ +| +$/, "", line);
   gsub(/ {2,}/, " ", line);
   print line
 }' \
somefile.txt

The input file (somefile.txt) is irrelevant to my question. The part that goes after the END pattern is there to trim extra spaces in the line variable and print it out. Like this:

foo bar

I'm trying to see if there is a better, more compact way to do that in AWK. Using gsub to remove a few extra spaces feels cumbersome: it is hard to read, and hard for a maintainer to understand (especially one who has never worked with AWK before). Any ideas on how to make it shorter or more explicit?

Thanks!

** EDIT **

The AWK variable line is filtered while awk processes the input file, and I want to trim the extra spaces that are left afterwards.

CodePudding user response:

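Following the approach in @DavidC.Rankin's comment, let awk's field splitting do the work:

$ awk -v line="    foo    bar  " '
BEGIN {
    $0 = line                                     # re-split line into fields
    for (i = 1; i <= NF; i++)
        printf "%s%s", $i, (i == NF ? ORS : OFS)  # one OFS between fields
}'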

Output:

foo bar
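The loop above works because awk's default field splitting already ignores leading, trailing, and repeated blanks. A shorter sketch of the same idea, assuming the default FS and OFS, is to assign a field to itself so that awk rebuilds $0 with single separators. The shell function name squeeze_fields is hypothetical and only wraps the call for illustration.

```shell
# Hypothetical helper: normalize whitespace by letting awk re-split the record.
squeeze_fields() {
    awk -v line="$1" 'BEGIN {
        $0 = line   # split line into fields using the default FS
        $1 = $1     # assigning to a field rebuilds $0, joined with OFS
        print       # print the rebuilt $0
    }'
}

squeeze_fields "    foo    bar  "
```

Note that the default FS also treats tabs as separators, and that -v interprets backslash escapes in the value, so this sketch assumes line contains no backslashes that must be kept literally.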

CodePudding user response:

Another option, using gsub() as you began to do:

awk '{gsub(/[ ][ ]+/," "); gsub(/^[ ]|[ ]$/,"")}1' <<< "    foo    bar  "

The first call to gsub() collapses every run of two or more spaces into a single space, whether the run is before, between, or after the fields. The second call, gsub(/^[ ]|[ ]$/,""), removes the single space that may remain at the front and at the end of the string. Anchoring only on ^ would leave one trailing space for this input.

Either approach works well. Depending on your actual data and your FS value, you may prefer one over the other, but without knowing more, they are pretty much a wash.

Example Use/Output

$ awk '{gsub(/[ ][ ]+/," "); gsub(/^[ ]|[ ]$/,"")}1' <<< "    foo    bar  "
foo bar
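Leftover leading or trailing blanks are invisible in terminal output, so it can help to wrap the result in brackets while testing. The following is a minimal sketch of that check; the function name show_trimmed is hypothetical, and it feeds the string through a pipe instead of <<< so that it also runs in a plain POSIX shell.

```shell
# Hypothetical helper: trim a string with two gsub() calls and print it
# between brackets so any leftover leading/trailing blank is visible.
show_trimmed() {
    printf '%s\n' "$1" |
    awk '{
        gsub(/[ ][ ]+/, " ")        # collapse runs of 2+ spaces to one space
        gsub(/^[ ]|[ ]$/, "")       # drop a single leading/trailing space
        print "[" $0 "]"
    }'
}

show_trimmed "    foo    bar  "
```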

CodePudding user response:

With your shown samples, please try the following awk program. Since line is an awk variable and you are not reading any Input_file, there is no need for an END block; the BEGIN block can work on the variable directly.

This program sets the awk variable line with -v. In the BEGIN section it first removes the leading and trailing whitespace from line (replacing it with an empty string), then replaces every run of one or more whitespace characters with OFS (a single space by default), and finally prints the result.

awk -v line="    foo    bar  " '
BEGIN{
  gsub(/^[[:space:]]+|[[:space:]]+$/,"",line)
  gsub(/[[:space:]]+/,OFS,line)
  print line
}
'

OR, if other work happens in your awk program and you want to trim the variable only in the END section, try the following:

awk -v line="    foo    bar  " '
END{
  gsub(/^[[:space:]]+|[[:space:]]+$/,"",line)
  gsub(/[[:space:]]+/,OFS,line)
  print line
}
'  Input_file
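If the trimming is needed in more than one place, wrapping the two gsub() calls in a user-defined function may make the intent more explicit for a maintainer. This is only a sketch: the awk function name squeeze and the shell wrapper squeeze_demo are hypothetical, /dev/null stands in for the real input file, and [ \t] is used in place of [[:space:]] on the assumption that only spaces and tabs matter and that some older awk implementations lack POSIX character classes.

```shell
# Hypothetical wrapper around an awk program that defines squeeze().
squeeze_demo() {
    awk -v line="$1" '
    # squeeze(s): strip leading/trailing blanks, collapse inner runs to one space
    function squeeze(s) {
        gsub(/^[ \t]+|[ \t]+$/, "", s)
        gsub(/[ \t]+/, " ", s)
        return s
    }
    END { print squeeze(line) }
    ' /dev/null
}

squeeze_demo "    foo    bar  "
```

Because awk passes scalars by value, squeeze() leaves the original line variable untouched and only returns the cleaned copy.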