How to grep multi line string with new line characters or tab characters or spaces

Time:04-15

My test file has text like:

> cat test.txt
new dummy("test1", random1).foo("bar1");
new dummy("
        test2", random2);
new dummy("test3", random3).foo("bar3");
new dummy = dummy(
            "test4", random4).foo("bar4");

I am trying to match every statement that contains the text "dummy(" and ends with a semicolon (;), even when the statement spans several lines. Then I need to extract the string inside the double quotes within dummy. I have come up with the following command, but it matches only the first and third statements.

> perl -ne 'print if /dummy/ .. /;/' test.txt | grep -oP 'dummy\((.|\n)*,'
dummy("test1",
dummy("test3",

With the -o flag I expected to extract the string between the double quotes inside dummy, but that does not work either. Can you please give me an idea of how to proceed?

Expected output is:

test1
test2
test3
test4

I referred to the following SO questions:

How to give a pattern for new line in grep?

how to grep multiple lines until ; (semicolon)

CodePudding user response:

@TLP was pretty close:

perl -0777 -nE 'say for map {s/^\s+|\s+$//gr} /\bdummy\("(.+?)"/gs' test.txt
test1
test2

Using

  • -0777 slurps the file in as a single string
  • /\bdummy\("(.+?)"/gs finds all the quoted string content after "dummy("
    • the s flag allows . to match newlines.
    • any string containing escaped double quotes will break this regex
  • map {s/^\s+|\s+$//gr} trims leading/trailing whitespace from each string.

CodePudding user response:

Given:

$ cat file
new dummy("test1", random1).foo("bar1");
new dummy("
        test2", random2);
new dummy("test3", random3).foo("bar3");
new dummy = dummy(
            "test4", random4).foo("bar4");

You can use GNU grep this way:

 $ grep -ozP '[^;]*\bdummy[^";]*"\s*\K[^";]*[^;]*;' file | tr '\000' '\n' | grep -oP '^[^"]*'
test1
test2
test3
test4
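To see what each stage of that pipeline contributes, here is the same command wrapped in a function with comments. The name dummy_args_grep is hypothetical, and like the original it assumes GNU grep built with PCRE support (-P):

```shell
# Hypothetical wrapper around the pipeline above; needs GNU grep with -P.
dummy_args_grep() {
    # Stage 1: -z treats the whole file as one NUL-terminated record, so
    #          [^;]* can run across newlines up to the next semicolon.
    #          \K discards everything matched so far, so the reported match
    #          starts at the string content after the opening quote.
    # Stage 2: with -o and -z each match ends in NUL; tr turns those
    #          terminators into newlines.
    # Stage 3: keep only the text before the closing double quote.
    grep -ozP '[^;]*\bdummy[^";]*"\s*\K[^";]*[^;]*;' "$1" |
        tr '\000' '\n' |
        grep -oP '^[^"]*'
}
```

Running only the first two stages is a handy way to debug: each line then shows the remainder of a statement starting at the string, for example test1", random1).foo("bar1");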

CodePudding user response:

This perl should work:

perl -0777 -pe 's/(?m)^[^(]* dummy\(\s*"\s*([^"]+).*/$1/g' file

test1
test2
test3
test4