I'm trying to create a single regex for printing lines between two patterns. I need it to be non-greedy, though.
Below is the dummy C# input file.
/// <summary>
/// This is a comment
/// </summary>
[Fact]
[Trait("a", "b")]
[InlineData(cd)]
public async Task TestOne()
{
\\ lines of code
}
[Theory]
[Trait("ab", "cd")]
[Trait("ef", "ghi")]
[InlineData(jkl)]
[InlineData(mnop)]
public async Task TestTwo(string hello)
{
\\ lines of code
}
/// <summary>
/// This is a comment
/// </summary>
[Theory]
[Trait("ab", "cd")]
[Trait("ef", "ghi")]
public async Task TestThree(bye)
{
\\ lines of code
}
/// <summary>
/// This is a comment
/// </summary>
[Fact]
[Trait("ab", "cd")]
[Trait("ef", "ghi")]
[InlineData(jkl)]
[InlineData(mnop)]
public async Task TestFour(string hello)
{
\\ lines of code
}
[Theory]
public async Task TestFive()
{
\\ lines of code
}
What I want to be printed are the lines between "TestTwo()" and its nearest preceding "[Fact]" or "[Theory]". And ONLY those lines. That is, I want the following printed:
[Theory]
[Trait("ab", "cd")]
[Trait("ef", "ghi")]
[InlineData(jkl)]
[InlineData(mnop)]
public async Task TestTwo(string hello)
I also do not want to match for a specific number of lines, as the line count will fluctuate. Test() may or may not have parameters.
I'm implementing this in a bash script, so any bash one-liner would also help. A simple regex would be more than sufficient.
I've tried a lot of things, and this is already a compromise on what I originally wanted to achieve with a one-liner regex.
Do take a look at the following question as well if you can!
How to capture all matching groups between 2 particular patterns?
Thanks in advance!
CodePudding user response:
If a blank line is not always present between the text blocks, then the following gnu-awk solution should work:
awk -v RS='\\[(Fact|Theory)]\n' '
match($0, /(.* TestTwo *\([^\n]+).+/, a) {
   print hdr a[1]
}
{hdr = RT}
' file.cpp
[Theory]
[Trait("ab", "cd")]
[Trait("ef", "ghi")]
[InlineData(jkl)]
[InlineData(mnop)]
public async Task TestTwo(string hello)
Here:
-v RS='\\[(Fact|Theory)]\n' sets the record separator to [Fact] or [Theory] followed by a line break.
RT contains the text matched by the RS regex.
The match function uses a capture group to match what we need to keep.
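To see how RS and RT interact, here is a minimal sketch on a made-up five-line input. The function name show_records and the sample text are assumptions for illustration only, and it needs GNU awk installed as gawk, since RT is a gawk extension.

```shell
# Hypothetical demo helper; requires GNU awk (RT is a gawk extension).
show_records() {
  printf 'intro\n[Fact]\nA\n[Theory]\nB\n' |
    gawk -v RS='\\[(Fact|Theory)]\n' '{
      body = $0; sep = RT
      gsub(/\n/, "", body)   # strip newlines so each record prints on one line
      gsub(/\n/, "", sep)
      printf "record %d: body=<%s> RT=<%s>\n", NR, body, sep
    }'
}
```

Calling show_records prints one line per record. RT holds the separator that ended the current record, so the [Fact] or [Theory] line in front of a record is the RT of the previous record. That is why the answer saves it in hdr after each record has been processed.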
Here is a gnu-grep solution to do the same:
grep -oPz '(?sm)(?>^//[^\n]+\n)*\[(?>Fact|Theory)]\n(?>(?!\[(?>Fact|Theory)]\n).)*TestTwo[^\n]+\n' file.cpp
CodePudding user response:
One awk idea:
awk '
output              { output=output ORS $0 }   # if we have something in variable "output" then add current line
/\[Fact]|\[Theory]/ { output=$0 }              # (re)set "output" to current line
/ TestTwo\(/        { if (output)              # if "output" is not empty then ...
                         print output          # print to stdout and ...
                      output=""                # clear
                    }
' input
This generates:
[Theory]
[Trait("ab", "cd")]
[Trait("ef", "ghi")]
[InlineData(jkl)]
[InlineData(mnop)]
public async Task TestTwo(string hello)
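If the test name varies, the same idea can be wrapped in a small bash function. This is only a sketch: print_test_header and the awk variable names are made up for illustration, and it assumes every attribute block starts with a line beginning with [Fact] or [Theory]. It uses index() rather than a regex for the name, so the name does not need escaping, and it should not need gawk.

```shell
# Hypothetical helper: print the attribute block and signature of one test.
# Usage: print_test_header <file> <test-name>
print_test_header() {
  awk -v name="$2" '
    /^\[(Fact|Theory)\]/ { buf = $0; collecting = 1; next }  # nearest marker so far: start over
    collecting           { buf = buf ORS $0 }                 # keep every line after the marker
    collecting && index($0, " " name "(") {                   # reached the wanted signature
      print buf
      exit
    }
  ' "$1"
}
```

For the input in the question, print_test_header file.cpp TestTwo should print the block from [Theory] through the TestTwo signature. Because the buffer is reset at every [Fact] or [Theory] line, only the nearest preceding marker is kept, which gives the non-greedy behaviour without a fixed line count.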