I'm currently writing a bash script to get the first value from a list of comma-separated strings. I have a file that looks like this -
name
things: "water bottle","40","new phone cover",10
place
I just need to return the value in the first double quotes.
water bottle
The value in the first double quotes can be one word or two words. That is, water bottle can sometimes be replaced with pen.
I tried -
awk '/:/ {print $2}'
But this just gives
water
I wanted to split it on commas, but there's a colon (:) after things, so I'm not sure how to separate it.
How do I get the value in the first double quotes?
EDIT:
SOLUTION: I used the below code since I particularly wanted to use awk -
awk '/:/' test.txt | cut -d\" -f2
CodePudding user response:
A solution using the cut utility could be the following (the -s option skips lines that contain no double quote, such as name and place):
cut -s -d\" -f2 infile > outfile
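To see how cut treats lines that contain no double quote, here is a small sketch on the sample data from the question. The variable names sample, plain and filtered are only illustrative. Without -s, cut passes delimiter-free lines through unchanged; with -s it drops them.

```shell
#!/bin/bash
# Sample input from the question; the variable name is illustrative.
sample='name
things: "water bottle","40","new phone cover",10
place'

# Without -s, lines that lack the delimiter are printed unchanged.
plain=$(printf '%s\n' "$sample" | cut -d'"' -f2)

# With -s, lines that lack the delimiter are suppressed.
filtered=$(printf '%s\n' "$sample" | cut -s -d'"' -f2)

printf 'without -s:\n%s\n' "$plain"
printf 'with -s:\n%s\n' "$filtered"
```

If several lines in the file contain double quotes, each of them yields a value, so some extra filtering (for example on the colon) may still be needed.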
CodePudding user response:
Using GNU awk you could make use of a capture group, and use a negated character class to not cross the , as that is the field delimiter.
awk 'match($0, /^[^",:]*:[^",]*"([^"]*)"/, a) {print a[1]}' file
Output
water bottle
The pattern matches
^ - Start of string
[^",:]*: - Optionally match any value except " and , and : then match :
[^",]* - Optionally match any value except " and ,
"([^"]*)" - Capture in group 1 the value between double quotes
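The three-argument form of match() is a GNU awk extension. For other awk implementations, a rough equivalent could rely on the RSTART and RLENGTH variables that match() sets. This is only a sketch assuming the same line layout as in the question; the function name first_quoted is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: prints the first double-quoted value of every line
# that looks like  key: "value",...  (reads files given as arguments, or stdin).
first_quoted() {
  awk 'match($0, /^[^",:]*:[^",]*"[^"]*"/) {
    s = substr($0, RSTART, RLENGTH)   # the matched prefix, up to the closing quote
    sub(/^[^"]*"/, "", s)             # drop everything up to the opening quote
    sub(/"$/, "", s)                  # drop the closing quote
    print s
  }' "$@"
}

printf 'name\nthings: "pen","40","new phone cover",10\nplace\n' | first_quoted
```

The pattern is the same as above minus the capture group; the two sub() calls then trim the match down to the quoted value.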
If the value is always between double quotes, a shorter option could be setting the field separator to " and checking whether field 1 contains a colon, although technically you also get water bottle if there is only a leading double quote and no closing one.
awk -F'"' '$1 ~ /:/ {print $2}' file
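To make the field splitting visible, the following sketch prints the first three fields of the sample line and then reproduces the unclosed-quote caveat. The variable names line, fields and unclosed are illustrative only.

```shell
#!/bin/bash
line='things: "water bottle","40","new phone cover",10'

# Show where the " separator splits the line: field 1 is the text before
# the first quote, field 2 is the first quoted value.
fields=$(printf '%s\n' "$line" |
  awk -F'"' '{ printf "1=[%s] 2=[%s] 3=[%s]\n", $1, $2, $3 }')
echo "$fields"

# Caveat: a line with an opening quote but no closing quote still yields field 2.
unclosed=$(printf '%s\n' 'things: "water bottle' | awk -F'"' '$1 ~ /:/ {print $2}')
echo "$unclosed"
```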
CodePudding user response:
You can use sed:
sed -n 's/^[^"]*"\([^"]*\)".*/\1/p' file > outfile
See the online demo:
#!/bin/bash
s='name
things: "water bottle","40","new phone cover",10
place'
sed -n 's/^[^"]*"\([^"]*\)".*/\1/p' <<< "$s"
# => water bottle
The command means
-n - the option suppresses the default line output
^[^"]*"\([^"]*\)".* - a POSIX BRE regex pattern that matches
  ^ - start of string
  [^"]* - zero or more chars other than "
  " - a " char
  \([^"]*\) - Group 1 (\1 refers to this value): any zero or more chars other than "
  ".* - a " char and the rest of the string.
\1 - replaces the match with Group 1 value
p - only prints the result of a successful substitution.
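Since the question is about a bash script, one possible way to use this command is to capture its output in a variable. The sketch below assumes the input has the same shape as in the question; the variable names input and first_value are illustrative.

```shell
#!/bin/bash
# Illustrative input; in a real script this would come from the file.
input='name
things: "pen","40","new phone cover",10
place'

# With -n and the p flag, only lines where the substitution succeeded are
# printed, so the command substitution captures just the quoted value.
first_value=$(printf '%s\n' "$input" | sed -n 's/^[^"]*"\([^"]*\)".*/\1/p')

echo "first value: $first_value"
```

If more than one line contains double quotes, the variable will hold one value per line, so the result may need to be narrowed down further.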