I am working with PowerShell to read in a file. Here is some sample content from the file:
This is my file with content
-- #Start
This is more content
across different lines
etc etc
-- #End
I am using this code to read the file into a variable:
$content = Get-Content "Myfile.txt";
I then use this code to extract a particular section from the file, based on an opening and a closing tag:
$stringBuilder = New-Object System.Text.StringBuilder;
$pattern = "-- #Start(.*?)-- #End";
$matched = [regex]::match($content, $pattern).Groups[1].Value;
$stringBuilder.AppendLine($matched.Trim());
$stringBuilder.ToString() | Out-File "Newfile.txt" -Encoding utf8;
The problem is that the formatting is not maintained in the file I write to. What I want is:
This is more content
across different lines
etc etc
But what I am getting is:
This is more content across different lines etc etc
Any ideas how I can alter my code so that the structure (multiple lines) is maintained in the output file?
CodePudding user response:
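This regex might do what you're looking for. I don't see the point of using a StringBuilder
in this case. Note that since this is a multi-line regex pattern, you need to use the -Raw
switch to read your file's content as a single string. Without it, Get-Content returns an array of lines, which is joined with spaces when it is converted to a string.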
$re = [regex] '(?ms)(?<=^-- #Start\s*\r?\n).+?(?=^-- #End)'
$re.Match((Get-Content path\to\Myfile.txt -Raw)).Value |
Set-Content path\to\newFile.txt -NoNewLine
See https://regex101.com/r/82HJxf/1 for details.
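To see why the original code flattens the text, here is a minimal sketch that runs both approaches against the same sample. The temp file name Myfile-demo.txt and the variable names are hypothetical and only used for this demonstration.

```powershell
# Hypothetical temp file, created only for this demonstration.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'Myfile-demo.txt'
Set-Content -Path $path -Value @(
    'This is my file with content'
    '-- #Start'
    'This is more content'
    'across different lines'
    'etc etc'
    '-- #End'
)

# Without -Raw, Get-Content returns an array with one string per line.
$asArray = Get-Content -Path $path

# Passing that array where a string is expected joins the lines with spaces,
# so the newlines are already gone before the regex runs.
$flattened = [regex]::Match($asArray, '-- #Start(.*?)-- #End').Groups[1].Value.Trim()

# With -Raw, the file is read as one string and the newlines survive.
$asString = Get-Content -Path $path -Raw
$re = [regex] '(?ms)(?<=^-- #Start\s*\r?\n).+?(?=^-- #End)'
$preserved = $re.Match($asString).Value

$flattened   # one line
$preserved   # three lines, plus the trailing newline before "-- #End"
```

The match still includes the newline that precedes the closing tag, which is why the answer pipes to Set-Content with -NoNewLine.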
If you want to do line-by-line processing, you could use a switch statement with the -File parameter
to read and process the lines of interest. This is particularly useful if the file is very big and doesn't fit in memory.
& {
    $capture = $false
    switch -Regex -File path\to\Myfile.txt {
        '^-- #Start' { $capture = $true }
        '^-- #End' { $capture = $false }
        Default { if($capture) { $_ } }
    }
} | Set-Content path\to\newFile.txt
If there is only one occurrence of the opening and closing tags, you could even break
out of the switch as soon as it encounters the closing tag to stop processing:
'^-- #End' { break }
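Put together, the early-exit variant could look roughly like this. It is a sketch under the assumption that the tags appear only once. The temp file names and the extra trailing line are hypothetical and only there to show that nothing after the closing tag is emitted.

```powershell
# Hypothetical temp files, created only for this demonstration.
$tmp     = [System.IO.Path]::GetTempPath()
$inFile  = Join-Path $tmp 'switch-demo-in.txt'
$outFile = Join-Path $tmp 'switch-demo-out.txt'

Set-Content -Path $inFile -Value @(
    'This is my file with content'
    '-- #Start'
    'This is more content'
    'across different lines'
    'etc etc'
    '-- #End'
    'Trailing content that is never emitted'
)

& {
    $capture = $false
    switch -Regex -File $inFile {
        '^-- #Start' { $capture = $true }          # start emitting after this line
        '^-- #End'   { break }                     # stop processing the file here
        Default      { if ($capture) { $_ } }      # emit lines between the tags
    }
} | Set-Content -Path $outFile

# Read the result back as an array of lines.
$result = Get-Content -Path $outFile
$result
```

Because each captured line is emitted as its own string, Set-Content writes one line per item and the original line structure is kept.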