Extract a specific value from text in a shell script

I am reading a text file to extract some specific information. I was able to solve it using a simple pipeline like:

line='[any] a b "c a" valuewanted k o'
echo "$line" | cut -d " " -f 6 | sort -u
# prints valuewanted

But after checking the whole log, I see values containing extra spaces that break my pipeline. For example:

line='[any] a "b 1" "c a" valuewanted k o'
echo "$line" | cut -d " " -f 6 | sort -u

# prints a"
# would need -f 7 instead of -f 6

I have also tried awk '{print $6}', but the same problem happens.

I am reading a big file, so changing the field position for every single line is not viable. Each line follows a pattern in which the groups are separated by a space. If the content is between double quotes, it belongs to a single group, not to separate groups as my script currently parses it.

When a group contains a space, the entire group value is wrapped in double quotes.

Is there any way to make cut split by spaces and handle "content whatever" as a single group?

CodePudding user response：

You can use GNU awk (gawk) with FPAT, which describes what a field looks like instead of what separates the fields:

awk -v FPAT='"[^"]*"|[^[:blank:]]+' -v OFS='|' '
{print $1,$2,$3,$4,$5,$6,$7}' file

[any]|a|b|"c a"|valuewanted|k|o
[any]|a|"b 1"|"c a"|valuewanted|k|o

# input data
cat file

[any] a b "c a" valuewanted k o
[any] a "b 1" "c a" valuewanted k o

I used print $1,$2,$3,$4,$5,$6,$7 to demonstrate all field values. You can change it to whatever you like.
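
To get back to the goal in the question (only the wanted value, deduplicated), here is a minimal sketch. It assumes GNU awk is installed under the name gawk, and the file name sample.log is hypothetical. Note that once a quoted group counts as one field, valuewanted is field 5 rather than 6.

```shell
# Hypothetical sample file holding the two lines from the question.
printf '%s\n' \
  '[any] a b "c a" valuewanted k o' \
  '[any] a "b 1" "c a" valuewanted k o' > sample.log

# FPAT says what a field is: a double-quoted group, or a run of non-blank
# characters. FPAT is a GNU awk extension, so gawk is called explicitly.
result=$(gawk -v FPAT='"[^"]*"|[^[:blank:]]+' '{ print $5 }' sample.log | sort -u)
echo "$result"
```

In an awk without FPAT support (mawk or BusyBox awk, for example), the -v FPAT=... assignment only sets an ordinary variable and lines are still split on whitespace, so it is worth checking which awk the script actually runs.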

You can see that b and "b 1" each end up in the 3rd field of their line.
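
If gawk is not available, the same idea can probably be approximated in plain awk by matching tokens one at a time with match(). This is only a sketch of an alternative, not part of the FPAT approach above; the variable names rest, n, and fifth are made up for illustration, and it assumes quoted groups never contain an embedded double quote.

```shell
line='[any] a "b 1" "c a" valuewanted k o'

# Walk the line token by token. A token is either a double-quoted group
# or a run of characters that are neither space nor tab.
fifth=$(printf '%s\n' "$line" | awk '{
  rest = $0
  n = 0
  while (match(rest, /"[^"]*"|[^ \t]+/)) {
    n++
    if (n == 5) {                      # the 5th token is the wanted value
      print substr(rest, RSTART, RLENGTH)
      break
    }
    rest = substr(rest, RSTART + RLENGTH)   # drop the token just consumed
  }
}')
echo "$fifth"
```

The loop relies only on match(), RSTART, RLENGTH, and substr(), which are standard awk features, so it does not depend on FPAT.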