Print the last letter of each word to make a string using the `awk` command

I have this line

UDACBG UYAZAM DJSUBU WJKMBC NTCGCH DIDEVO RHWDAS

I am trying to print the last letter of each word to make a string using the awk command:

awk '{ print substr($1,6) substr($2,6) substr($3,6) substr($4,6) substr($5,6) substr($6,6) }'

If I don't know how many characters a word contains, what is the correct command to print the last character of a column? And instead of repeating the substr call, how can I use it only once to print specific characters in different columns?

CodePudding user response：

If you have just this one single line to handle you can use

awk '{for (i=1;i<=NF;i++) r = r "" substr($i,length($i))} END{print r}' file

If you have multiple lines in the input:

awk '{r=""; for (i=1;i<=NF;i++) r = r "" substr($i,length($i)); print r}' file

Details:

{for (i=1;i<=NF;i++) r = r "" substr($i,length($i))} - iterates over all fields in the current record; i is the field number, $i is the field value, and the last character of each field (retrieved with substr($i,length($i))) is appended to the r variable.
END{print r} prints the r variable once awk has finished processing the input.
In the second solution, r is cleared at the start of each line, and its value is printed after all fields in the current record have been processed.
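
To see why the reset matters, here is a minimal sketch of the per-line variant on a hypothetical two-line input (the second line is made up for illustration and has words of different lengths):

```shell
# Hypothetical two-line input; the second line is not from the question.
input='UDACBG UYAZAM DJSUBU WJKMBC NTCGCH DIDEVO RHWDAS
ab cde f'

# r is reset for every record, so each input line yields its own string.
result=$(printf '%s\n' "$input" |
  awk '{ r = ""; for (i = 1; i <= NF; i++) r = r substr($i, length($i)); print r }')

printf '%s\n' "$result"
```

This should print GMUCHOS on the first line and bef on the second. Without the r = "" reset, the second line would carry over the letters collected from the first.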

See the online demo:

#!/bin/bash
s='UDACBG UYAZAM DJSUBU WJKMBC NTCGCH DIDEVO RHWDAS'
awk '{for (i=1;i<=NF;i++) r = r "" substr($i,length($i))} END{print r}' <<< "$s"

Output:

GMUCHOS
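
The question also asks how to pick characters from specific columns without repeating the substr call. One possible approach is a small user-defined function; the name last below is a hypothetical helper chosen for this sketch, and the columns 1, 3 and 7 are picked arbitrarily:

```shell
s='UDACBG UYAZAM DJSUBU WJKMBC NTCGCH DIDEVO RHWDAS'

# last() is a hypothetical helper name; it returns the final character of its argument.
picked=$(printf '%s\n' "$s" |
  awk 'function last(str) { return substr(str, length(str)) }
       { print last($1) last($3) last($7) }')

printf '%s\n' "$picked"
```

With the sample line this should print GUS. The length of each word no longer appears in the command, so words of any length work.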

CodePudding user response：

Using GNU awk and gensub:

$ gawk '{print gensub(/([^ ]+)([^ ])( |$)/,"\\2","g")}' file

Output:

GMUCHOS

CodePudding user response：

1st solution: With GNU awk, you could try the following awk program, written and tested with the shown samples.

awk -v RS='.([[:space:]]+|$)' 'RT{gsub(/[[:space:]]+/,"",RT);val=val RT} END{print val}' Input_file

Explanation: Set the record separator to any character followed by whitespace or by the end of the value/line. RT then holds the text that matched RS, so, as per the OP's requirement, remove the unnecessary newline/spaces from it and keep appending it to val. Finally, when the awk program is done reading the whole Input_file, print the value of val.

2nd solution: Set the record separator to the empty string and use the match function with the regex (.[[:space:]]+)|(.$) to pick out the last letter of each word. Each matched value is appended to a variable, and the END block of the awk program prints that variable's value.

awk -v RS= '
{
  while(match($0,/(.[[:space:]]+)|(.$)/)){
    val=val substr($0,RSTART,RLENGTH)
    $0=substr($0,RSTART+RLENGTH)
  }
}
END{
  gsub(/[[:space:]]+/,"",val)
  print val
}
'  Input_file

CodePudding user response：

Simple substitutions on individual lines are the job sed exists to do:

$ sed 's/[^ ]*\([^ ]\) */\1/g' file
GMUCHOS
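
As a quick check that the pattern does not depend on word length, here is a sketch on a hypothetical input with words of different lengths and a doubled space (the input is made up for illustration):

```shell
# Hypothetical input: words of different lengths, separated by one or more spaces.
mixed='ab  cde f'

# [^ ]* greedily consumes everything but the last non-space character of a word,
# \([^ ]\) captures that last character, and the trailing " *" eats the spaces after it.
squeezed=$(printf '%s\n' "$mixed" | sed 's/[^ ]*\([^ ]\) */\1/g')

printf '%s\n' "$squeezed"
```

This should print bef. Note that the pattern only treats literal spaces as separators, so tabs would need a different character class.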

CodePudding user response：

An alternative approach with GNU awk is to use FPAT, which defines the fields by their content rather than by what separates them:

gawk 'BEGIN{FPAT="\\S\\>"}
{   s=""
    for (i=1; i<=NF; i++) s=s $i
    print s
}' file
GMUCHOS

CodePudding user response：

Using several tools:

$ tr -s ' ' '\n' <file | rev | cut -c1 | paste -sd'\0'

GMUCHOS

Separate the words onto lines, reverse each line so that the last character comes first and is easy to pick, and finally paste them back together without a delimiter. Not the shortest solution, but I think the most straightforward one...
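
The stages can be looked at one at a time. This is only a sketch that splits the same pipeline into intermediate variables (the variable names are made up here, and the explicit - operand to paste is added so that it reads standard input on implementations that require a file argument):

```shell
s='UDACBG UYAZAM DJSUBU WJKMBC NTCGCH DIDEVO RHWDAS'

# Stage 1: one word per line (tr -s also squeezes runs of separators).
words=$(printf '%s\n' "$s" | tr -s ' ' '\n')

# Stage 2: reverse each line and keep its first character,
# which is the last letter of the original word.
letters=$(printf '%s\n' "$words" | rev | cut -c1)

# Stage 3: join the lines with an empty delimiter; '-' names stdin explicitly.
joined=$(printf '%s\n' "$letters" | paste -sd'\0' -)

printf '%s\n' "$joined"
```

With the sample line this should print GMUCHOS, and printing $letters instead shows the seven letters on separate lines.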

CodePudding user response：

I would use GNU AWK for this as follows. Let the content of file.txt be

UDACBG UYAZAM DJSUBU WJKMBC NTCGCH DIDEVO RHWDAS

then

awk 'BEGIN{FPAT="[[:alpha:]]\\>";OFS=""}{$1=$1;print}' file.txt

output

GMUCHOS

Explanation: Tell AWK to treat any alphabetic character at the end of a word as a field, and use the empty string as the output field separator. $1=$1 is used to trigger rebuilding of the line with the specified OFS. If you want to know more about start/end of word matching, read GNU Regexp Operators.

(tested in gawk 4.2.1)

CodePudding user response：

Another solution with GNU awk:

awk '{$0=gensub(/[^[:space:]]*([[:alpha:]])/, "\\1","g"); gsub(/\s/,"")} 1' file
GMUCHOS

Here gensub() reduces each word to its last character, and gsub() removes the whitespace between them.