Strip leading and trailing blanks in AWK

I am asking for your assistance to strip the blanks/spaces before and after each field, i.e. remove the trailing spaces from $1, the leading and trailing spaces from $2, and the leading spaces from $3, using AWK on the AIX 7.2 platform. Below is some data from the file Employee.txt:

001 |  George John Aden Brown   | gbrown
002 |   Barry Street White      | bwhite
003 |    Kelly Jones            | kjones
004 |   Jolene Davidson Smith   | jsmith

My objective is to achieve the following set of data (without the leading/trailing spaces)

001|George John Aden Brown|gbrown
002|Barry Street White|bwhite
003|Kelly Jones|kjones
004|Jolene Davidson Smith|jsmith

I have tried the following commands without success; all three produce the output shown below them.

awk -F"|" '{ print $1 "|" gsub(" ", "", $2) "|" $3 }' Employee.txt
awk -F"|" '{ print $1 "|" gsub(/[ \t]/,"",$2) "|" $3 }'  Employee.txt
awk -F"|" '{ print $1 "|" gsub(/[[:blank:]]/, "", $2) "|" $3 }' Employee.txt

001 |8| gbrown
002 |11| bwhite
003 |17| kjones
004 |8| jsmith

Many thanks, George

CodePudding user response：

With your shown samples, please try the following awk code. It was written and tested in GNU awk and should work in any awk. Simple explanation: the field separator is set to [[:space:]]+\\|[[:space:]]+ (one or more spaces, followed by a pipe, followed by one or more spaces) and OFS is set to | for all lines of Input_file. In the main program, $1 is assigned to itself so that the line is rebuilt with the new OFS; once that is done, the trailing 1 simply prints the rebuilt line.

awk -v FS='[[:space:]]+\\|[[:space:]]+' -v OFS='|' '{$1=$1} 1'  Input_file
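To try the separator idea without a file, here is a minimal sketch. The function name trim_by_fs and the inline sample lines are only for illustration. It deliberately uses [ \t]* instead of [[:space:]]+ so that a pipe with no blanks on one side still splits cleanly; that is a change from the command above, not part of it. Blanks at the very start or end of a line are not touched by this approach, since they are not next to a pipe.

```shell
# Hypothetical helper: split on "optional blanks, pipe, optional blanks"
# and rejoin the fields with a bare pipe.
trim_by_fs() {
    # $1 = $1 forces awk to rebuild the record with the new OFS.
    awk -v FS='[ \t]*\\|[ \t]*' -v OFS='|' '{ $1 = $1 } 1' "$@"
}

printf '%s\n' '001 |  George John Aden Brown   | gbrown' \
              '003 |    Kelly Jones            | kjones' | trim_by_fs
```

On the two sample lines this should print 001|George John Aden Brown|gbrown and 003|Kelly Jones|kjones.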

CodePudding user response：

I usually use this, and a LOT:

$ awk '
BEGIN {
    FS=OFS="|"                 # set both separators to pipe
}
{
    for(i=1;i<=NF;i++)         # loop all fields
        gsub(/^ +| +$/,"",$i)  # strip leading and trailing space
}1' file                       # output

Output:

001|George John Aden Brown|gbrown
002|Barry Street White|bwhite
003|Kelly Jones|kjones
004|Jolene Davidson Smith|jsmith
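This also explains the 8, 11, 17 and 8 in the question: gsub() changes its target in place and returns the number of substitutions, not the modified string, so printing its return value prints a count. A small sketch of the difference, using one inline sample line instead of Employee.txt (the shell variable names are just for illustration):

```shell
line='001 |  George John Aden Brown   | gbrown'

# gsub() inside print: the return value is the substitution count,
# here the number of blanks removed from $2.
count_out=$(printf '%s\n' "$line" | awk -F'|' '{ print gsub(/ /, "", $2) }')

# gsub() as a statement: it edits $2 in place, then $2 is printed.
field_out=$(printf '%s\n' "$line" | awk -F'|' '{ gsub(/^ +| +$/, "", $2); print $2 }')

printf '%s\n%s\n' "$count_out" "$field_out"
```

The first line printed is 8 and the second is George John Aden Brown.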

If you have other junk in there, feel free to tune the regex:

gsub(/^"?[ \t]*(N\/A)?|[ \t]*"?$/,"",$i)  # etc

CodePudding user response：

You've got good awk answers. However if you want to consider sed this is pretty simple with:

sed -E 's/ *(\|) *|^ +| +$/\1/g' file

001|George John Aden Brown|gbrown
002|Barry Street White|bwhite
003|Kelly Jones|kjones
004|Jolene Davidson Smith|jsmith

Or else with gnu-awk:

awk '{print gensub(/ *(\|) *|^ +| +$/, "\\1", "g")}' file

PS: This sed command requires GNU or BSD sed (for -E).

CodePudding user response：

If, as you said in your question, you don't want leading or trailing spaces on the lines removed then using any sed:

$ sed 's/ *| */|/g' file
001|George John Aden Brown|gbrown
002|Barry Street White|bwhite
003|Kelly Jones|kjones
004|Jolene Davidson Smith|jsmith

otherwise if you actually did want the leading/trailing blanks removed too then with GNU or BSD sed for -E:

$ sed -E 's/^ +| +$//g; s/ *\| */|/g' file
001|George John Aden Brown|gbrown
002|Barry Street White|bwhite
003|Kelly Jones|kjones
004|Jolene Davidson Smith|jsmith
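If -E turns out not to be available in the sed at hand (worth checking on AIX; this was not tested there), the same trimming can be sketched with basic regular expressions only, where | is an ordinary character. The function name trim_posix and the padded sample line are hypothetical:

```shell
# Hypothetical helper using only basic regular expressions (no -E).
trim_posix() {
    # 1) drop blanks at line start, 2) drop blanks at line end,
    # 3) drop blanks on either side of every pipe.
    sed -e 's/^ *//' -e 's/ *$//' -e 's/ *| */|/g' "$@"
}

printf '%s\n' '  004 |   Jolene Davidson Smith   | jsmith  ' | trim_posix
```

For the padded line above this should print 004|Jolene Davidson Smith|jsmith.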

CodePudding user response：

Also with awk

awk '{$2=$2;gsub(/ \| /,"|")} 1' file
001|George John Aden Brown|gbrown
002|Barry Street White|bwhite
003|Kelly Jones|kjones
004|Jolene Davidson Smith|jsmith

With the default field separator, leading and trailing whitespace is ignored when the record is split into fields, so it disappears whenever $0 is recomputed (see: http://gnu.ist.utl.pt/software/gawk/manual/html_node/Regexp-Field-Splitting.html ).
$2=$2: assigning $2 to itself rebuilds $0 from its fields, joined by single spaces (the default OFS). The new $0 has no leading or trailing whitespace.
gsub() is then applied to $0: the regexp / \| / matches a space, followed by a | character, followed by a space, and each match is replaced by a single | character.
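One side effect worth knowing: because $0 is rebuilt from whitespace-separated fields, runs of blanks inside a name are collapsed to a single space as well, and a pipe with no blank on one side would not be matched by / \| /. A quick sketch of the first point; the function name rebuild and the sample line with extra blanks are made up for the demonstration:

```shell
# Hypothetical wrapper around the command above.
rebuild() {
    # $2 = $2 rebuilds $0 with single spaces between fields;
    # gsub() then removes the one blank left on each side of every pipe.
    awk '{ $2 = $2; gsub(/ \| /, "|") } 1' "$@"
}

# Note the leading/trailing blanks and the three blanks inside the name.
printf '%s\n' '  003 |    Kelly   Jones            | kjones  ' | rebuild
```

This prints 003|Kelly Jones|kjones, with the three blanks between Kelly and Jones reduced to one. If internal spacing has to be preserved, one of the field-by-field approaches is the safer choice.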