Why can't I echo into a regex-matched output file

Time:02-10

I have a file that looks like this:

filename.txt

first line
prefix=C:\Users\me\tmp\out_file.txt
second line
python args

What I'm trying to do is run the python script and then write its output to the file C:\Users\me\tmp\out_file.txt. Running the script is easy, but redirecting the output is proving difficult. filename.txt is in C:\Users\me\tmp, so I open a PowerShell terminal there and execute the following:

Get-ChildItem | Get-Content -Raw | Select-String -Pattern "(?smi)(?:prefix=([^\n]*?))$.*?(python .*?)$" | %{echo "hello" > $_.Matches.Groups[1].Value}

This matches the output file path (group 1) and the python command (group 2). In general I will be running this for every file in the directory, hence the pipes. But it results in the error:

Out-File: The filename, directory name, or volume label syntax is incorrect. : 'C:\Users\me\tmp\out_file.txt'

Now, if I copy and paste that exact filename:

echo "hello" > 'C:\Users\me\tmp\out_file.txt'

The result works perfectly (it also works without the quotes). So my question is, why does this fail in the ForEach-Object loop?

In the end, I will call

ForEach-Object -Parallel {Invoke-Expression $_.Matches.Groups[2].Value > $_.Matches.Groups[1].Value}

This will (hopefully) redirect the script's output into the output file, but if there is a better way of achieving this, that would be helpful as well.

CodePudding user response:

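The issue is that you are capturing the CR (carriage return) character at the end of the line. The file evidently has CRLF line endings: [^\n] also matches \r, and in multiline mode $ only matches just before \n, so the \r ends up inside the first capture group. If you look at $_.Matches.Groups[1] you'll see: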

Success  : True
Name     : 1
Captures : {1}
Index    : 19
Length   : 29
Value    : C:\Users\me\tmp\out_file.txt

Note the length of 29, but the file path is only 28 characters. If you do:

Get-ChildItem | Get-Content -Raw | Select-String -Pattern "(?smi)(?:prefix=([^\n]*?))$.*?(python .*?)$" | %{$_.Matches.Groups[1].Value.ToCharArray()|%{">$_<"}}

You should see that the last few entries appear as:

>e<
>.<
>t<
>x<
>t<
><

This indicates that there's an invisible character at the end. You can fix that by altering your regex just a little:

"(?smi)(?:prefix=(.*?))[\r\n]+.*?(python .*?)$"
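
To check the diagnosis without touching any files, here is a small sketch that runs both patterns against an in-memory string. The sample text is an assumption: it stands in for filename.txt with CRLF line endings. The variable names are made up for illustration, and the adjusted pattern is written with [\r\n]+ so the first group stops at the first CR or LF.

```powershell
# Hypothetical stand-in for the contents of filename.txt, with CRLF line endings.
$text = "first line`r`nprefix=C:\Users\me\tmp\out_file.txt`r`nsecond line`r`npython args`r`n"

# Pattern from the question: [^\n] still matches \r, and multiline $ only matches before \n.
$original = '(?smi)(?:prefix=([^\n]*?))$.*?(python .*?)$'
# Adjusted pattern: the first group now stops at the first CR or LF.
$adjusted = '(?smi)(?:prefix=(.*?))[\r\n]+.*?(python .*?)$'

$before = [regex]::Match($text, $original).Groups[1].Value
$match  = [regex]::Match($text, $adjusted)
$after  = $match.Groups[1].Value

$before.Length           # 29: the path plus a trailing CR
[int]($before[-1])       # 13: the code point of CR
$after.Length            # 28: just the path

# The second group still ends right before \n, so it keeps its own CR.
# Trimming both values is a cheap safeguard before using them.
$outPath = $after.Trim()
$command = $match.Groups[2].Value.Trim()
```

With the values trimmed, the body of the ForEach-Object block could use $outPath as the redirection target. Calling .Trim() on the captured value also works with the original pattern if you would rather not change the regex.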