I created a file like this
echo "test 1", Hello, foo, bar, "a better world", "test 2" > test.txt
and the result is this:
test 1
Hello
foo
bar
a better world
test 2
I need to remove all the text starting with the keyword "Hello" and ending with "world", including both keywords.
Something like this
test 1
test 2
I tried
$pattern='(?s)(?<=/Hello/\r?\n).*?(?=world)'
(Get-Content -Path .\test.txt -Raw) -replace $pattern, "" | Set-Content -Path .\test.txt
but nothing happened. What can I try?
CodePudding user response:
Assuming you want to remove the starting and ending keywords, you could use either (?s)\s*Hello.*world or (?s)\s*Hello.*?world, depending on whether you want .* to be greedy or lazy.
(Get-Content path\to\file.txt -Raw) -replace '(?s)\s*Hello.*world' |
Set-Content path\to\result.txt
Use -creplace for case-sensitive matching of the keywords.
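To make the greedy/lazy difference concrete, here is a minimal sketch. The sample text is made up for illustration (it contains two "world" lines, unlike the file in the question) and is joined with LF newlines:

```powershell
# Hypothetical sample with TWO "world" lines, so the two patterns differ.
$text = "test 1", "Hello", "foo", "world", "bar", "world", "test 2" -join "`n"

# Greedy: .* extends to the LAST "world" in the input.
$greedy = $text -replace '(?s)\s*Hello.*world'

# Lazy: .*? stops at the FIRST "world" after "Hello".
$lazy = $text -replace '(?s)\s*Hello.*?world'

# -creplace is case-sensitive, so a lowercase "hello" is left untouched.
$unchanged = "test 1`nhello`nworld" -creplace '(?s)\s*Hello.*world'

$greedy      # test 1 / test 2
$lazy        # test 1 / bar / world / test 2
$unchanged   # test 1 / hello / world
```

With only one "world" in the input, as in the question's file, both patterns should give the same result.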
CodePudding user response:
Leaving aside that there are extraneous / characters in your regex, reformulate it as follows (tip of the hat to Santiago Squarzon):
$pattern = '(?sm)^Hello\r?\n.*?world\r?\n'
(Get-Content -Path .\test.txt -Raw) -replace $pattern |
Set-Content -Path .\test.txt
This removes the line starting with Hello all the way through the (first) subsequent line that ends in world, including the next newline.
This yields the desired output, as shown in your question.
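If you want to try this without touching your real file, here is a rough round-trip sketch against a throwaway file. The temp-file name is made up, and -NoNewline is my own addition (not part of the command above) so that Set-Content does not append a second trailing newline to the -Raw string:

```powershell
# Hypothetical scratch file in the temp directory.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'remove-block-demo.txt'
"test 1", "Hello", "foo", "bar", "a better world", "test 2" | Set-Content -Path $path

$pattern = '(?sm)^Hello\r?\n.*?world\r?\n'

# Read the whole file as one string, drop the block, write it back.
# -NoNewline (assumption: PowerShell 5.0 or later) keeps the existing trailing newline as-is.
(Get-Content -Path $path -Raw) -replace $pattern |
  Set-Content -Path $path -NoNewline

# Read the result back line by line, then clean up.
$lines = Get-Content -Path $path
Remove-Item -Path $path
$lines
```

The \r?\n in the pattern is what lets it work with both CRLF and LF line endings.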
As for what you tried:
Aside from the extraneous / characters, your primary problem is that you are using look-around assertions ((?<=...), (?=...)). What they match is not part of the overall match, so the keywords themselves are not replaced by -replace.
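A small sketch of that difference, using an in-memory string that mirrors the question's file (LF newlines assumed) and your pattern with the / characters removed:

```powershell
$text = "test 1", "Hello", "foo", "bar", "a better world", "test 2" -join "`n"

# Look-arounds only assert what surrounds the match; "Hello" and "world" stay in the text.
$withLookarounds = $text -replace '(?s)(?<=Hello\r?\n).*?(?=world)'

# Matching the keywords directly makes them part of the match, so they are removed too.
$withKeywords = $text -replace '(?sm)^Hello\r?\n.*?world\r?\n'

$withLookarounds   # test 1 / Hello / world / test 2
$withKeywords      # test 1 / test 2
```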
CodePudding user response:
I think this is a duplicate of "How can I deleted lines from a certain position?" or of any of the other duplicates listed there:
'test1', 'Hello', 'foo', 'bar', 'world', 'test2' |SelectString -From '(?=Hello)' -To '(?<=world)'