How to grep a multi-line string containing newline characters, tab characters, or spaces

My test file has text like:

> cat test.txt
new dummy("test1", random1).foo("bar1");
new dummy("
        test2", random2);
new dummy("test3", random3).foo("bar3");
new dummy = dummy(
            "test4", random4).foo("bar4");

I am trying to match every statement that ends with a semicolon (;) and contains the text "dummy(", even when the statement spans several lines. Then I need to extract the string inside the double quotes that follows dummy(. I came up with the following command, but it matches only the first and third statements.

> perl -ne 'print if /dummy/ .. /;/' test.txt | grep -oP 'dummy\((.|\n)*,'
dummy("test1",
dummy("test3",

With the -o flag I expected to extract the string between the double quotes inside dummy, but that is not working either. Can you please give me an idea of how to proceed?

Expected output is:

test1
test2
test3
test4

I referred to the following SO questions:

How to give a pattern for new line in grep?

how to grep multiple lines until ; (semicolon)

CodePudding user response：

@TLP was pretty close:

perl -0777 -nE 'say for map {s/^\s+|\s+$//gr} /\bdummy\("(.+?)"/gs' test.txt

test1
test2

Using:

-0777 slurps the file in as a single string.
/\bdummy\("(.+?)"/gs finds all the quoted string content after "dummy(".
- the s flag allows . to match newlines.
- any string containing escaped double quotes will break this regex.
map {s/^\s+|\s+$//gr} trims leading/trailing whitespace from each string.

CodePudding user response：

Given:

$ cat file
new dummy("test1", random1).foo("bar1");
new dummy("
        test2", random2);
new dummy("test3", random3).foo("bar3");
new dummy = dummy(
            "test4", random4).foo("bar4");

You can use GNU grep this way:

 $ grep -ozP '[^;]*\bdummy[^";]*"\s*\K[^";]*[^;]*;' file | tr '\000' '\n' | grep -oP '^[^"]*'
test1
test2
test3
test4
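
To see why the pipeline uses two grep passes, it can help to look at the intermediate output. The sketch below assumes GNU grep built with PCRE support (-P). With -z, grep treats the input as NUL-separated records, so the whole file is searched as one string and [^;]* and \s* can cross line breaks; \K drops everything matched so far from the reported match. The `sample`, `stage1`, and `names` variables are illustrative only.

```shell
# Recreate the sample input from the question in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
new dummy("test1", random1).foo("bar1");
new dummy("
        test2", random2);
new dummy("test3", random3).foo("bar3");
new dummy = dummy(
            "test4", random4).foo("bar4");
EOF

# Stage 1: one match per statement, starting at the quoted text (thanks to \K)
# and running up to the semicolon. -z emits NUL terminators, so tr turns
# them into newlines before the result is stored.
stage1=$(grep -ozP '[^;]*\bdummy[^";]*"\s*\K[^";]*[^;]*;' "$sample" | tr '\000' '\n')
printf 'stage 1:\n%s\n' "$stage1"

# Stage 2: keep only the part of each line before the first double quote.
names=$(printf '%s\n' "$stage1" | grep -oP '^[^"]*')
printf 'stage 2:\n%s\n' "$names"

rm -f "$sample"
```

Each stage 1 line begins with the wanted string and still carries the rest of the statement, for example test1", random1).foo("bar1"); which is why the second grep is needed to cut at the first quote.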

CodePudding user response：

This perl should work:

perl -0777 -pe 's/(?m)^[^(]* dummy\(\s*"\s*([^"]+).*/$1/g' file

test1
test2
test3
test4