Powershell: Search for a match in position 68-70 then extract field position 2-9

Time:02-24

I need to split a text file that contains 10K lines, some of which contain information about files. Each line of interest contains the string "VER" in position (column) 68-70 and the name of the file - which is the information I'm trying to extract - is found in position 2-9.

It looks like this... the file name is ACCRLINK

 ACCRLINK                         VER     1   176      D                                    03/09/98 02/21/84 

I have a script that splits a file, but it is rudimentary and I'm unsure how to change it to fit my new needs. The script below looks for a match on NEWTEXT=, takes the string that follows the "=", and uses that as the file name.

However, this delimiter-based approach does not work for the new format, where the match and the file name are defined by column position.

Can anyone help me alter the script to select by position and capture the file name in another position?

Thank you,

-Ron

$InputFile = "C:\RECORDS_cpy.txt"
$Reader = New-Object System.IO.StreamReader($InputFile)
#$a = 1
$OPName = @()
While (($Line = $Reader.ReadLine()) -ne $null) {
    If ($Line -match "NEWTEXT=") {
        $OPName = $Line.Split("=")
        $FileName = $OPName[1].Trim()
        Write-Host "Found ... $FileName" -foregroundcolor green
        $OutputFile = "$FileName.txt"
        #$a  
    }    
    Add-Content $OutputFile $Line
}

CodePudding user response:

I suggest using a switch statement, which offers both convenient and fast line-by-line reading of files via -File and regex-matching via -Regex:

& {
  switch -CaseSensitive -Regex -File "C:\RECORDS_cpy.txt" {
    '^.(.{8}).{58}VER' { $Matches[1] + '.txt' }
  }
} | Set-Content $OutputFile

Note that a single Set-Content call is used to write all output produced by the switch statement to a file, which is more efficient than multiple Add-Content calls. If you really meant to append to preexisting $OutputFile content, replace Set-Content with Add-Content above.

  • Per your later feedback, you're looking for all lines that contain the string VER at the (1-based) column position 68 on each line, and, for matching lines only, extract the filename from column positions 2-9 (8 chars. starting in column 2).

  • Note the use of capture group (.{8}) in the regex, which captures the 8 characters assumed to be the file name, and makes the captured text available in index [1] of the automatic $Matches variable.
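For comparison, the same position-based test can probably also be written with plain string indexing instead of a regex. The following is only a minimal sketch: the function name Get-VerFileName and the sample line are hypothetical, and unlike the regex above it trims trailing padding from the 8-character field, which assumes names shorter than 8 characters are space-padded.

```powershell
# Hypothetical helper (not part of the original answer): emits the text in
# columns 2-9 of every line that has 'VER' in columns 68-70 (1-based).
function Get-VerFileName {
    param(
        [Parameter(ValueFromPipeline)]
        [string] $Line
    )
    process {
        # Column 68 (1-based) is index 67 (0-based); the line must reach column 70.
        # -ceq makes the comparison case-sensitive, like -CaseSensitive above.
        if ($Line.Length -ge 70 -and $Line.Substring(67, 3) -ceq 'VER') {
            # Columns 2-9 are the 8 characters starting at index 1.
            $Line.Substring(1, 8).Trim()
        }
    }
}

# Made-up sample line: 1 leading space, the 8-char name, 58 spaces, then VER.
$sample = ' ACCRLINK' + (' ' * 58) + 'VER     1   176      D'

# A line that is too short is skipped rather than causing an error.
$names = @($sample, ' short line') | Get-VerFileName
$names
```

The explicit length check matters because Substring throws if the requested range runs past the end of the line, whereas the regex simply fails to match.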

CodePudding user response:

This code ignores the position issues and looks entirely at the pattern. You need the lines that have "NEWTEXT" with an "=" followed by the desired text and then followed by VER and maybe some other random text.

function GetFileNames([string]$FileName) {
    switch -Regex -File $FileName {
        '^\s*NEWTEXT\s*=\s*(?<File>.*?)\s*VER\s*.*$' {$Matches.File}
        default {continue}
    }
}
$InputFile = "C:\RECORDS_cpy.txt"
$OutFile = "C:\RECORDS_Results.txt"
GetFileNames $InputFile | Out-File $OutFile

When run, the file C:\RECORDS_Results.txt contains this:

ACCRLINK