compare with one of the 5 options

Time:09-12

I am trying to find words where the second or third character is one of "aeiou".

# cat t.txt
test
platform
axis
welcome
option

I tried this, but the words "platform" and "axis" are missing from the output.

# awk 'substr($0,2,1) == "e" {print $0}' t.txt
test
welcome

CodePudding user response:

You may use this awk solution, which matches one or two arbitrary characters at the start of the line followed by a vowel:

awk '/^.{1,2}[aeiou]/' file

test
platform
axis
welcome
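
Some older awk builds do not handle interval expressions such as {1,2}. As a minimal sketch, assuming the same sample words, the same test can be written as two anchored patterns; the words and result variables are only there for illustration:

```shell
words='test
platform
axis
welcome
option'

# A vowel in position 2 OR a vowel in position 3, written without {1,2}.
result=$(printf '%s\n' "$words" | awk '/^.[aeiou]/ || /^..[aeiou]/')
printf '%s\n' "$result"
```

This should print the same four words as the command above and skip "option".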

Or use the substr function to take the 2nd and 3rd characters as a substring and test it against the vowels:

awk 'substr($0,2,2) ~ /[aeiou]/ ' file

test
platform
axis
welcome

As per the comment below, the OP wants to print each string with any vowel in the 2nd or 3rd position removed. Here is a solution for that:

awk '{
  s = substr($0, 2, 2)      # 2nd and 3rd characters
  gsub(/[aeiou]/, "", s)    # drop any vowels from them
  print substr($0, 1, 1) s substr($0, 4)
}' file

tst
pltform
axs
wlcome
option

PS: This sed is shorter for the replacement:

sed -E 's/^(.[^aeiou]{0,1})[aeiou]{1,2}/\1/' file
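
The sed one-liner above gives the expected result for the sample words, but its [aeiou]{1,2} can also consume a vowel in the 4th position: for a made-up input such as "plae" it yields "pl", whereas the awk version yields "ple". A possible sketch that keeps to positions 2 and 3 uses two substitutions; the strip_vowels function name and the "plae" input are assumptions for this example only:

```shell
# Remove a vowel in position 3 first, then one in position 2,
# so each rule still sees the original character positions.
strip_vowels() {
  sed -E 's/^(..)[aeiou]/\1/; s/^(.)[aeiou]/\1/'
}

sample=$(printf '%s\n' test platform axis welcome option | strip_vowels)
edge=$(printf '%s\n' plae | strip_vowels)
printf '%s\n' "$sample" "$edge"
```

The order matters: deleting the 2nd character first would shift the original 4th character into position 3.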

CodePudding user response:

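With your shown samples only, please try the following awk code. Written and tested in GNU awk; it relies on an empty FS splitting each line into single characters, which is a common extension rather than something POSIX requires, so it should work in most awks. Brief explanation: set the field separator to the empty string, then print the line if the 2nd or 3rd field (that is, the 2nd or 3rd character of the line) is one of a, e, i, o, u.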

awk -v FS="" '$2~/^[aeiou]$/ || $3~/^[aeiou]$/' Input_file
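
For an awk that does not split on an empty FS, the same per-character test can be sketched with substr. The variable names c2 and c3 are made up for this example, and the input is the sample list from the question:

```shell
# c2 and c3 hold the 2nd and 3rd characters (empty if the line is too short).
result=$(printf '%s\n' test platform axis welcome option |
  awk '{ c2 = substr($0, 2, 1); c3 = substr($0, 3, 1) }
       c2 ~ /^[aeiou]$/ || c3 ~ /^[aeiou]$/')
printf '%s\n' "$result"
```

An empty c2 or c3 cannot match /^[aeiou]$/, so lines shorter than two characters are simply skipped.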

CodePudding user response:

I would use GNU AWK for this task in the following way. Let file.txt content be

test
platform
axis
welcome
option

then

awk 'BEGIN{FPAT="."}index("aeiou", $2)||index("aeiou", $3)' file.txt

gives output

test
platform
axis
welcome

Explanation: I tell GNU AWK, using FPAT, that a field is any single character (.). Then I filter lines with the index function: if the 2nd field, that is the 2nd character, occurs anywhere inside aeiou, index returns a value greater than zero, which is treated as true in a boolean context. I apply the same function to the 3rd field, that is the 3rd character, and combine the two results with logical OR (||).

(tested in gawk 4.2.1)
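
To make the return values of index concrete, here is a tiny sketch that needs no input file; the letters "e" and "x" are arbitrary picks for illustration:

```shell
# index(haystack, needle) gives the 1-based position of needle, or 0 if absent.
result=$(awk 'BEGIN { print index("aeiou", "e"), index("aeiou", "x") }')
printf '%s\n' "$result"   # prints: 2 0
```

A non-zero position counts as true and 0 as false, which is why index can be used directly as a pattern.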

CodePudding user response:

This might work for you (GNU sed):

sed '/\<..\?[aeiou]/I!d' file

If the second or third character after a word boundary is a, e, i, o or u (in either case), the line is not deleted.
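
Since \< matches at the start of any word, not only at the start of the line, this command also keeps lines in which a later word qualifies. A small sketch with made-up input lines, assuming GNU sed as in the answer:

```shell
# "TEST" is kept because of the I flag; "my test" is kept because of its second word.
result=$(printf '%s\n' 'TEST' 'my test' 'option' | sed '/\<..\?[aeiou]/I!d')
printf '%s\n' "$result"
```

If only the first word of each line should count, anchoring with ^ instead of \< would be the usual adjustment.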
