Extracting a string value with multiple spaces in .sh

Time:09-17

I have a file with content like:

"SOME WORDS", "AB@@ 9897 7437 8788 8234 78","SOME WORDS",
"AB@@ 9897 7437 8788 8236 79"

How can we get all strings matching the pattern 'xxxx xxxx xxxx xxxx xxxx xx' in a .sh script?

for fetchData in `grep -o 'AB*\s[0-9]*\s[0-9]*\s[0-9]*\s[0-9]*\s[0-9]*' FILE_NAME`
do
echo 'Data = '${fetchData}
done

It's printing

AB@@ 
9897 
...
..

But I want to print

'AB@@ 9897 7437 8788 8234 78'
'AB@@ 9897 7437 8788 8236 79'

CodePudding user response:

With GNU grep, using the -o flag:

grep -o 'AB@@.*7[89]' file.txt

If there are only digits and spaces in between:

grep -o 'AB@@[0-9 ]*7[89]' file.txt

As per OP's comment, with the -E flag for ERE (extended regular expressions):

grep -Eo 'AB@@ [[:digit:]]{4} [[:digit:]]{4} [[:digit:]]{4} [[:digit:]]{4} [[:digit:]]{2}' file.txt

With the -P flag for PCRE (Perl-compatible regular expressions):

grep -Po 'AB@@ \d{4} \d{4} \d{4} \d{4} \d{2}' file.txt

By default, grep uses BRE (basic regular expressions).
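
The question's loop also splits each match on spaces, because `for` iterates over the whitespace-separated words of the command substitution. A minimal sketch of one way around that, reading grep's output line by line instead. The function names `extract_codes` and `print_codes` and the temporary sample file are hypothetical, introduced only for illustration:

```shell
#!/bin/sh
# Print every 'AB@@ dddd dddd dddd dddd dd' match in the given file, one per line.
extract_codes() {
  grep -Eo 'AB@@ [[:digit:]]{4} [[:digit:]]{4} [[:digit:]]{4} [[:digit:]]{4} [[:digit:]]{2}' "$1"
}

# Read the matches line by line so the embedded spaces are preserved.
print_codes() {
  extract_codes "$1" | while IFS= read -r code; do
    printf 'Data = %s\n' "$code"
  done
}

# Build a throwaway sample file containing the data from the question.
sample=$(mktemp)
printf '%s\n' \
  '"SOME WORDS", "AB@@ 9897 7437 8788 8234 78","SOME WORDS",' \
  '"AB@@ 9897 7437 8788 8236 79"' > "$sample"

result=$(print_codes "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

With `while IFS= read -r`, each line that grep prints arrives in `code` as a single value, so no quoting tricks are needed inside the loop.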

CodePudding user response:

$ grep -o '".... .... .... .... .... .."' file.txt | tr -d \"
AB@@ 9897 7437 8788 8234 78
AB@@ 9897 7437 8788 8236 79

CodePudding user response:

When I try your example, grep does not produce any output at all. The reason is that \s has no special meaning in a simple regexp.

You did not describe your string pattern exactly. Assuming that any sequence of characters containing neither a double quote nor a comma counts as a string to be returned, the following example works:

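x='"SOME WORDS", "AB@@ 9897 7437 8788 8234 78","SOME WORDS","AB@@ 9897 7437 8788 8236 79"'
# Read grep's output line by line so that spaces inside a match are preserved
while IFS= read -r fetchData
do
  # Keep only matches that contain at least one non-whitespace character
  if [[ $fetchData == *[![:space:]]* ]]
  then
    echo Found string: "$fetchData"
    # Process your data here
  else
    : # Ignored because only whitespace
  fi
done < <(grep -o '[^",]*' <<<"$x")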

If you ever really need \s to represent whitespace, use the -P option.
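
Independently of the regexp flavour, a `for` loop over an unquoted expansion splits on every character in IFS, which by default includes the space. The following sketch, using the two expected strings from the question as fixed sample data, counts how many items the loop sees with the default IFS and with IFS restricted to a newline. The variable names are illustrative only:

```shell
#!/bin/sh
# Two matches, one per line, as grep -o would print them.
matches='AB@@ 9897 7437 8788 8234 78
AB@@ 9897 7437 8788 8236 79'

# Default IFS (space, tab, newline): every space starts a new item.
count_unquoted=0
for word in $matches; do
  count_unquoted=$((count_unquoted + 1))
done

# IFS set to a newline only: each line stays in one piece.
old_ifs=$IFS
IFS='
'
count_lines=0
for line in $matches; do
  count_lines=$((count_lines + 1))
done
IFS=$old_ifs

echo "default IFS: $count_unquoted items, newline IFS: $count_lines items"
```

Changing IFS affects all later word splitting in the script, which is why the sketch restores the old value immediately after the loop.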

CodePudding user response:

As per your query, can you please try the command below and check if it works:

grep -Eo '[a-zA-Z0-9&._@-]{4} [[:digit:]]{4} [[:digit:]]{4} [[:digit:]]{4} [[:digit:]]{4} [[:digit:]]{2}' file

Output:

AB@@ 9897 7437 8788 8234 78
AB@@ 9897 7437 8788 8236 79