I want to extract you from the sample string:
See [ "you" later
However, my result is wrong:
awk '{ sub(/.*\"/, ""); sub(/\".*/, ""); print }' <<< "See [ \"you\" later"
result:
later
Using awk or another tool, how can I extract the substring between the double quotes?
CodePudding user response:
Here is an awk solution without any regex:
s='See [ "you" later'
awk -F '"' 'NF>2 {print $2}' <<< "$s"
you
Or a sed solution with regex:
sed -E 's/[^"]*"([^"]*)".*/\1/' <<< "$s"
you
Another awk solution, with match:
awk 'match($0, /"[^"]*"/) {print substr($0, RSTART+1, RLENGTH-2)}' <<< "$s"
you
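As for why the attempt in the question prints " later": .* is greedy, so sub(/.*"/, "") deletes everything up to the last quote, not the first. The sketch below contrasts that with a negated character class, which cannot run past the first quote. The variable names greedy and fixed are only for illustration.

```shell
s='See [ "you" later'

# Greedy: .* extends to the LAST quote, so only " later" survives.
greedy=$(printf '%s\n' "$s" | awk '{ sub(/.*"/, ""); print }')

# Negated class: [^"]* stops at the FIRST quote; the second sub()
# then trims from the closing quote to the end of the line.
fixed=$(printf '%s\n' "$s" | awk '{ sub(/^[^"]*"/, ""); sub(/".*/, ""); print }')

printf 'greedy: [%s]\n' "$greedy"   # greedy: [ later]
printf 'fixed:  [%s]\n' "$fixed"    # fixed:  [you]
```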
CodePudding user response:
1st solution: You can use awk's gsub function here. It does two substitutions with an empty string in one call: the first removes everything up to and including the first ", and the second removes everything from the next " to the end of the line. Then the line is printed.
awk '{gsub(/^[^"]*"|".*/,"")} 1' Input_file
2nd solution: Using GNU grep with its -oP options, which print only the matched part (-o) and enable PCRE regex (-P). The regex matches lazily from the start of the line up to the first ", uses \K to discard what has been matched so far, and then matches everything before the next ". This prints the text between the two " as required.
grep -oP '^.*?"\K[^"]*' Input_file
CodePudding user response:
You can also use cut here:
cut -d\" -f 2 <<< 'See [ "you" later '
It splits the string on double quotes and prints the second field.
Output:
you
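One caveat: when a line contains no delimiter at all, cut prints the whole line unchanged. Adding -s suppresses such lines. A minimal sketch, where the function name quoted_with_cut is made up for illustration:

```shell
quoted_with_cut() {
    # -s: print nothing for lines that contain no double quote at all
    printf '%s\n' "$1" | cut -s -d '"' -f 2
}

quoted_with_cut 'See [ "you" later'   # prints: you
quoted_with_cut 'no quotes here'      # prints nothing
```

Note that this still does not check for a closing quote: a line with a single " yields whatever follows it.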
CodePudding user response:
Using bash
IFS='"' read -ra arr <<< "See [ \"you\" later"
echo "${arr[1]}"
gives output
you
Explanation: set IFS so that bash splits at ", read the split text into the array arr, and print the 2nd element (which is [1], as [0] denotes the 1st element).
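A note on scope: if IFS is assigned on a line of its own, it stays changed for every later command in the shell, whereas prefixing the assignment to read limits it to that one command. Here is a minimal sketch of the prefixed form wrapped in a function. The names quoted_field, before, quoted and rest are made up for illustration, and it reads from a here-document into plain variables rather than an array, so it should also run in plain sh.

```shell
quoted_field() {
    # IFS='"' applies only to this read, not to the rest of the shell.
    # The pieces around the first two quotes land in three variables.
    IFS='"' read -r before quoted rest <<EOF
$1
EOF
    printf '%s\n' "$quoted"
}

quoted_field 'See [ "you" later'   # prints: you
```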
CodePudding user response:
Extract all quoted substrings, and remove the quotes:
echo 'See [ "you" later, "" "a" "b" "c' |
grep -o '"[^"]*"' | tr -d \"
Gives:
you

a
b
"" is matched as an empty string on the second line of output (use grep -o '"[^"]\+"' to skip empty strings).
"c is not fully quoted, so it doesn't match.
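If grep -o is not available on a system, the same extraction can be sketched as a loop around match() in plain awk. This is an alternative to the pipeline above, not the same command: the function name extract_all is hypothetical, and empty quoted strings are skipped by checking RLENGTH.

```shell
extract_all() {
    awk '{
        s = $0
        # Repeatedly find the next "..." pair in what is left of the line
        while (match(s, /"[^"]*"/)) {
            # RLENGTH == 2 means an empty pair of quotes: skip it
            if (RLENGTH > 2) print substr(s, RSTART + 1, RLENGTH - 2)
            s = substr(s, RSTART + RLENGTH)
        }
    }'
}

echo 'See [ "you" later, "" "a" "b" "c' | extract_all
# prints:
# you
# a
# b
```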
For a small string, you may want to use pure shell. This extracts the first quoted substring in $str:
str='Example "a" and "b".'
str=${str#*\"} # Cut up to first quote
case $str in
*\"*) str=${str%%\"*};; # Cut from second quote onwards
*) str= # $str contains less than two quotes
esac
echo "$str"
Gives
a
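The same parameter expansions can be wrapped in a function that also reports failure through its exit status. This is only a sketch: the name first_quoted is made up, and since plain sh has no local variables, str remains a global here.

```shell
first_quoted() {
    str=$1
    case $str in
        *\"*\"*) ;;          # at least two quotes: carry on
        *) return 1 ;;       # fewer than two quotes: nothing to extract
    esac
    str=${str#*\"}           # cut up to and including the first quote
    str=${str%%\"*}          # cut from the next quote onwards
    printf '%s\n' "$str"
}

first_quoted 'Example "a" and "b".'            # prints: a
first_quoted 'only "one' || echo 'no match'    # prints: no match
```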
CodePudding user response:
Just a few ways, using GNU awk for:
multi-char RS and RT:
$ echo 'See [ "you" later' |
awk -v RS='"[^"]*"' 'RT{ print substr(RT,2,length(RT)-2) }'
you
the 3rd arg to match():
$ echo 'See [ "you" later' |
awk 'match($0,/"([^"]*)"/,a){ print a[1] }'
you
gensub() (assuming the quoted string is always present):
$ echo 'See [ "you" later' |
awk '{print gensub(/.*"([^"]*)".*/,"\\1",1)}'
you
FPAT:
$ echo 'See [ "you" later' |
awk -v FPAT='[^"]*' 'NF>2{print $2}'
you
$ echo 'See [ "you" later' |
awk -v FPAT='"[^"]*"' 'NF{print substr($1,2,length($1)-2)}'
you
patsplit():
$ echo 'See [ "you" later' |
awk 'patsplit($0,f,/"[^"]*"/,s){print substr(f[1],2,length(f[1])-2)}'
you
the 4th arg to split():
$ echo 'See [ "you" later' |
awk 'split($0,f,/"[^"]*"/,s)>1{print substr(s[1],2,length(s[1])-2)}'
you
CodePudding user response:
Using sed
$ sed -n 's/[^"]*"\([[:alpha:]]\+\)"[^"]*/\1 /gp' input_file
you
CodePudding user response:
$ grep -oP '(?<=").*(?=")' <<< "See [ \"you\" later"
you