I have a working script in PowerShell:
$file = Get-Content -Path HKEY_USERS.txt -Raw
foreach($line in [System.IO.File]::ReadLines("EXCLUDE_HKEY_USERS.txt"))
{
$escapedLine = [Regex]::Escape($line)
$pattern = $("(?sm)^$escapedLine.*?(?=^\[HKEY)")
$file -replace $pattern, ' ' | Set-Content HKEY_USERS-filtered.txt
$file = Get-Content -Path HKEY_USERS-filtered.txt -Raw
}
For each line in EXCLUDE_HKEY_USERS.txt, it performs some changes on the file HKEY_USERS.txt. So with every loop iteration it writes to this file and re-reads the same file to pull in the changes. However, Get-Content is notorious for memory leaks, so I wanted to refactor it to use StreamReader and StreamWriter, but I'm having a hard time making it work.
As soon as I do:
$filePath = 'HKEY_USERS-filtered.txt';
$sr = New-Object IO.StreamReader($filePath);
$sw = New-Object IO.StreamWriter($filePath);
I get:
New-Object : Exception calling ".ctor" with "1" argument(s): "The process cannot access the file
'HKEY_USERS-filtered.txt' because it is being used by another process."
So it looks like I cannot use StreamReader and StreamWriter on same file simultaneously. Or can I?
CodePudding user response:
tl;dr
- Get-Content -Raw reads a file as a whole and is fast and consumes little unwanted memory.
- [System.IO.File]::ReadLines() is a faster and more memory-efficient alternative to line-by-line reading with Get-Content (without -Raw), but you need to ensure that the input file is passed as a full path, because .NET's working directory usually differs from PowerShell's.
  - Convert-Path resolves a given relative path to a full, file-system-native one.
  - A PowerShell-native alternative to using [System.IO.File]::ReadLines() is the switch statement with the -File parameter, which performs similarly well while avoiding the working-directory discrepancy pitfall, and offers additional features.
- There is no need to save the modified file content to disk after each iteration: just update the variable holding the content ($file in your code, $fileContent in the code below) and, after exiting the loop, save its value to the output file.
$fileContent = Get-Content -Path HKEY_USERS.txt -Raw
# Be sure to specify a *full* path.
$excludeFile = Convert-Path -LiteralPath 'EXCLUDE_HKEY_USERS.txt'
foreach($line in [System.IO.File]::ReadLines($excludeFile)) {
$escapedLine = [Regex]::Escape($line)
$pattern = "(?sm)^$escapedLine.*?(?=^\[HKEY)"
# Modify the content and save the result back to variable $fileContent
$fileContent = $fileContent -replace $pattern, ' '
}
# After all modifications have been performed, save to the output file
$fileContent | Set-Content HKEY_USERS-filtered.txt
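The switch -File alternative mentioned in the tl;dr can be sketched as follows. This is a minimal, self-contained illustration rather than a drop-in replacement: the temporary directory and the sample registry keys are made up for the demo, and -NoNewline is used only so that Set-Content does not append an extra trailing newline to the already newline-terminated string.

```powershell
# Hypothetical sample data in a throwaway temp directory (not from the original post).
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.IO.Path]::GetRandomFileName())
$null = New-Item -ItemType Directory -Path $dir
$inputFile   = Join-Path $dir 'HKEY_USERS.txt'
$excludeFile = Join-Path $dir 'EXCLUDE_HKEY_USERS.txt'
$outputFile  = Join-Path $dir 'HKEY_USERS-filtered.txt'

Set-Content -LiteralPath $inputFile -Value @(
  '[HKEY_USERS\A]'
  '"x"=1'
  '[HKEY_USERS\B]'
  '"y"=2'
  '[HKEY_USERS\C]'
  '"z"=3'
)
Set-Content -LiteralPath $excludeFile -Value '[HKEY_USERS\B]'

# Read the whole input file into a single string.
$fileContent = Get-Content -LiteralPath $inputFile -Raw

# switch -File reads the exclusion file line by line; $_ is the current line.
# The action block runs in the caller's scope, so $fileContent is updated in place.
switch -File $excludeFile {
  default {
    $escapedLine = [Regex]::Escape($_)
    $pattern = "(?sm)^$escapedLine.*?(?=^\[HKEY)"
    $fileContent = $fileContent -replace $pattern, ' '
  }
}

# Save once, after all replacements have been applied in memory.
Set-Content -LiteralPath $outputFile -Value $fileContent -NoNewline
```

Because switch -File resolves relative paths against PowerShell's current location, the Convert-Path step is not needed with this approach.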
Building on Santiago Squarzon's helpful comments:
- Get-Content does not cause memory leaks, but it can consume a lot of memory that isn't garbage-collected until an unpredictable later point in time.
  - The reason is that, unless the -Raw switch is used, it decorates each line read with PowerShell ETS (Extended Type System) properties containing metadata about the file of origin, such as its path (.PSPath) and the line number (.ReadCount).
  - This both consumes extra memory and slows the command down. GitHub issue #7537 asks for a way to opt out of this wasteful decoration, as it typically isn't needed.
  - However, reading with -Raw is efficient, because the entire file content is read into a single, multi-line string, which means that the decoration is only performed once.
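The decoration described above can be observed directly. The following is a small sketch that uses a throwaway temp file with made-up content; it compares lines read by Get-Content with lines read through .NET.

```powershell
# Create a small throwaway file (hypothetical content, for illustration only).
$tempFile = [System.IO.Path]::GetTempFileName()
Set-Content -LiteralPath $tempFile -Value 'first line', 'second line'

# Line-by-line reading: every line carries ETS note properties.
$lines = Get-Content -LiteralPath $tempFile
$lineNumber = $lines[1].ReadCount   # 2, the line number of the second line
$originPath = $lines[1].PSPath      # provider-qualified path of the source file

# With -Raw, a single string is returned, so the decoration happens only once.
$raw = Get-Content -LiteralPath $tempFile -Raw
$rawIsString = $raw -is [string]

# Strings read via .NET carry no such decoration.
$plain = [System.IO.File]::ReadAllLines($tempFile)
$plainIsDecorated = $null -ne $plain[0].PSObject.Properties['PSPath']

Remove-Item -LiteralPath $tempFile
```

For a file with many lines, the per-line properties attached by Get-Content are the source of the extra memory use and slowdown.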
So it looks like I cannot use StreamReader and StreamWriter on same file simultaneously. Or can I?
No, you cannot. You cannot simultaneously read from a file and overwrite it.
To update / replace an existing file you have two options (note that, for a fully robust solution, all attributes of the original file (except the last write time and size) should be retained, which requires extra work):
- Read the old content into memory in full, perform the desired modification in memory, then write the modified content back to the original file, as shown in the top section.
  - There is a slight risk of data loss, however, namely if the process of writing back to the file gets interrupted.
- More safely, write the modified content to a temporary file and, upon successful completion, replace the original file with the temporary one.
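The second, safer option could look roughly like the sketch below. The file names, the sample content, and the 'old' to 'new' replacement are placeholders. Note that Move-Item -Force replaces the original file with a new one, so this sketch does not retain the original file's attributes; the fully robust variant mentioned above would need extra work.

```powershell
# Hypothetical target file in a throwaway temp directory (for illustration only).
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.IO.Path]::GetRandomFileName())
$null = New-Item -ItemType Directory -Path $dir
$targetFile = Join-Path $dir 'HKEY_USERS-filtered.txt'
Set-Content -LiteralPath $targetFile -Value 'old content'

# Read the whole file and modify the content in memory (placeholder modification).
$newContent = (Get-Content -LiteralPath $targetFile -Raw) -replace 'old', 'new'

# Write to a temp file in the same directory, so the final move stays on one volume.
$tempFile = Join-Path $dir ([System.IO.Path]::GetRandomFileName())
try {
  Set-Content -LiteralPath $tempFile -Value $newContent -NoNewline -ErrorAction Stop
  # Only after the write succeeded: replace the original with the temp file.
  Move-Item -LiteralPath $tempFile -Destination $targetFile -Force -ErrorAction Stop
}
finally {
  # If anything failed before the move, clean up the leftover temp file.
  if (Test-Path -LiteralPath $tempFile) { Remove-Item -LiteralPath $tempFile }
}
```

If writing the temp file fails, the original file is left untouched, which is the point of this approach.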