How to split a text file into two in PowerShell?

Time:02-25

I have one text file with Script that I want to split into two

Below is the dummy script

--serverone

this is first part of my script


--servertwo


this is second part of my script

I want to create two text files that would look like

file1

--serverone

this is first part of my script

file2

--servertwo


this is second part of my script

So far, I have added a special character ("}") to the script that I know doesn't occur anywhere else in it:

$script = get-content -Path "C:\Users\shamvil\Desktop\test.txt"
$newscript = $script.Replace("--servertwo","}--servertwo")
$newscript.split("}")

but I don't know how to save the split into two separate places.

This might not be the best approach, so I am also open to a different solution.

Please help, thanks!

CodePudding user response:

Use a regex-based -split operation:

$i = 0
(Get-Content -Raw test.txt) -split '(?m)^(?=--)' -ne '' |
  ForEach-Object { $fileName = 'file' + (++$i); Set-Content $fileName $_ }
  • This assumes that each block of lines that starts with a line that starts with -- is to be saved to a separate file.

  • Get-Content -Raw reads the entire file into a single, multi-line string.

  • As for the separator regex passed to -split:

    • The (?m) inline regex option makes the anchors ^ and $ match at the start and end of each line.
    • ^(?=--) therefore matches at the start of every line that begins with --. The look-ahead assertion ((?=...)), which is non-capturing by definition, ensures that the -- isn't removed from the resulting blocks (by default, what matches the separator regex is not included).
  • -ne '' filters out the extra empty element that results from the separator expression matching at the very start of the string.

  • Note that Set-Content knows nothing about the character encoding of the input file and uses its default encoding; use -Encoding as needed.
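
To see what the separator regex does without touching any files, here is a minimal sketch that runs the same -split expression on an in-memory string. The $text value is a made-up stand-in for what Get-Content -Raw would return, and $all and $blocks are illustrative names only:

```powershell
# Hypothetical stand-in for the string that Get-Content -Raw test.txt would return.
$text = "--serverone`nthis is first part of my script`n--servertwo`nthis is second part of my script`n"

# Split in front of every line that starts with '--'. The look-ahead consumes
# nothing, so each block keeps its '--' header line.
$all = $text -split '(?m)^(?=--)'

# The match at the very start of the string produces a leading empty element;
# -ne '' filters it out. @() keeps the result an array even for a single block.
$blocks = @($all -ne '')

"Elements before filtering: $($all.Count)"
"Elements after filtering:  $($blocks.Count)"
$blocks | ForEach-Object { '---- block ----'; $_ }
```

Piping $blocks to Set-Content instead of printing them writes one file per block; add -Encoding utf8 (or whichever encoding the input file uses) if the default is not what you want.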


zett42 points out that the file-writing part can be streamlined with the help of a delay-bind script-block parameter:

$i = 0
(Get-Content -Raw test.txt) -split '(?m)^(?=--)' -ne '' |
  Set-Content -LiteralPath { (Get-Variable i -Scope 1).Value++; "file$i" }
  • The Get-Variable call to access and increment the $i variable in the parent scope is necessary, because delay-bind script blocks (as well as script blocks for calculated properties) run in a child scope - perhaps surprisingly, as discussed in GitHub issue #7157.

    • A shorter - but even more obscure - option is to use ([ref] $i).Value instead; see this answer for details.
  • zett42 also points to a proposed future enhancement that would obviate the need to maintain the sequence numbers manually, via the introduction of an automatic $PSIndex variable that reflects the sequence number of the current pipeline object: see GitHub issue #13772.
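
The scoping behavior behind these workarounds can be tried out in isolation. The sketch below uses & { ... } as a stand-in for a delay-bind script block, on the assumption that only the child scope matters here; the $after* variable names are illustrative:

```powershell
$i = 0

# A plain increment inside a child scope creates a local copy of $i
# and leaves the caller's variable untouched.
& { $i++ }
$afterLocal = $i

# Get-Variable -Scope 1 explicitly targets the variable in the parent scope.
& { (Get-Variable i -Scope 1).Value++ }
$afterGetVariable = $i

# [ref] $i also resolves to the parent's variable, so .Value++ updates it.
& { ([ref] $i).Value++ }
$afterRef = $i

"plain increment: $afterLocal; Get-Variable: $afterGetVariable; [ref]: $afterRef"
```

The first call is the one that silently fails to advance the counter, which is why the Set-Content command above cannot simply use $i++ inside the script block.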
