How do I get the value in the first double quotes?

Time:12-11

I'm currently writing a bash script to get the first value from a line of comma-separated strings. I have a file that looks like this:

name


things: "water bottle","40","new phone cover",10



place

I just need to return the value in the first double quotes:

water bottle

The value in the first double quotes can be one word or two. For example, water bottle may sometimes be replaced with pen. I tried:

awk '/:/ {print $2}'

But this just gives

water

I wanted to split on commas, but there's a colon (:) after things, so I'm not sure how to separate the fields. How do I get the value in the first double quotes?

EDIT:

SOLUTION: I used the code below, since I particularly wanted to use awk:

awk '/:/' test.txt | cut -d\" -f2

CodePudding user response:

A solution using the cut utility could be

cut -d\" -f2 infile > outfile
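One caveat worth noting: by default, cut prints lines that contain no delimiter unchanged, so lines such as name and place would also end up in the output. The POSIX -s option suppresses those lines. A minimal sketch, assuming input shaped like the file in the question (the variable names are illustrative):

```shell
# Sample input mirroring the file in the question (an assumption for illustration).
input='name

things: "water bottle","40","new phone cover",10

place'

# Without -s, cut passes lines lacking the delimiter through unchanged.
plain=$(printf '%s\n' "$input" | cut -d'"' -f2)

# With -s, lines that contain no double quote are suppressed.
strict=$(printf '%s\n' "$input" | cut -s -d'"' -f2)

printf '%s\n' "$strict"
```

Here plain still contains name and place, while strict holds only the quoted value.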

CodePudding user response:

Using GNU awk, you can call match() with a capture group, and use negated character classes so the match does not cross a comma, since the comma is the field delimiter.

awk 'match($0, /^[^",:]*:[^",]*"([^"]*)"/, a) {print a[1]}' file

Output

water bottle

The pattern matches

  • ^ Start of string
  • [^",:]*: Optionally match any value except " and , and :, then match :
  • [^",]* Optionally match any value except " and ,
  • "([^"]*)" Capture in group 1 the value between double quotes
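The three-argument form of match() is a GNU awk extension. Where gawk is not available, roughly the same pattern could be applied with the standard two-argument match() and the RSTART and RLENGTH variables. This is a sketch under that assumption, not part of the original answer, and the function name first_quoted_posix is hypothetical:

```shell
# Hypothetical helper: reads lines on stdin, prints the first quoted value
# of each line that matches the pattern described above.
first_quoted_posix() {
  awk 'match($0, /^[^",:]*:[^",]*"[^"]*"/) {
    s = substr($0, RSTART, RLENGTH)  # the matched text, ending at the closing quote
    sub(/^[^"]*"/, "", s)            # remove everything through the opening quote
    sub(/"$/, "", s)                 # remove the closing quote
    print s
  }'
}

result=$(printf '%s\n' 'name' 'things: "water bottle","40","new phone cover",10' 'place' | first_quoted_posix)
printf '%s\n' "$result"
```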

If the value is always enclosed in double quotes, a shorter option is to set the field separator to " and check that field 1 contains a colon. Note that this also prints water bottle when there is only an opening double quote and no closing one.

awk -F'"' '$1 ~ /:/ {print $2}' file
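If the unclosed-quote case matters, one possible guard is to also require at least three fields, which only happens when the line contains two or more double quotes. A minimal sketch, with the function name and sample lines chosen for illustration:

```shell
# Hypothetical helper: print field 2 only when field 1 has a colon
# and the line contains at least two double quotes (NF >= 3).
first_quoted() {
  awk -F'"' '$1 ~ /:/ && NF >= 3 {print $2}'
}

line_ok='things: "water bottle","40","new phone cover",10'
line_open='things: "water bottle'   # opening quote only

ok=$(printf '%s\n' "$line_ok" | first_quoted)
open=$(printf '%s\n' "$line_open" | first_quoted)

printf 'closed: [%s]\nunclosed: [%s]\n' "$ok" "$open"
```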

CodePudding user response:

You can use sed:

sed -n 's/^[^"]*"\([^"]*\)".*/\1/p' file > outfile

See the online demo:

#!/bin/bash
s='name
 
 
things: "water bottle","40","new phone cover",10
 
 
 
place'
 
sed -n 's/^[^"]*"\([^"]*\)".*/\1/p' <<< "$s"
# => water bottle

The command means

  • -n - the option suppresses the default line output
  • ^[^"]*"\([^"]*\)".* - a POSIX BRE regex pattern that matches
    • ^ - start of string
    • [^"]* - zero or more chars other than "
    • " - a " char
    • \([^"]*\) - Group 1 (\1 refers to this value): any zero or more chars other than "
    • ".* - a " char and the rest of the string.
  • \1 - the replacement: the whole match is replaced with the Group 1 value
  • p - only prints the result of a successful substitution.
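The same substitution can also be written with extended regular expressions, which removes the backslashes around the group. This sketch assumes a sed that accepts -E (GNU and BSD sed both do) and uses a sample input modeled on the question:

```shell
# Sample input modeled on the question (an assumption for illustration).
s='name

things: "water bottle","40","new phone cover",10

place'

# -E switches to ERE, so the capture group is written ( ... ) instead of \( ... \).
value=$(printf '%s\n' "$s" | sed -nE 's/^[^"]*"([^"]*)".*/\1/p')

printf '%s\n' "$value"
```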