Home > Software design >  Parse multiple text files line by line but only the lines before a specific match
Parse multiple text files line by line but only the lines before a specific match

Time:04-19

I'm working on a large number of log files and I would like to parse all the lines before a specific match in each file.

Files look like this :

04/14/2022 02:07:19 SLK 1234 12345 86177500 ERROR - UPDSERVS: SERVICE :SOFTWARESERVICE - IS NOT INSTALLED
04/14/2022 02:07:58 SLK 1234 12345 86216625 ERROR - Bin File Creation and Dump Raw Output Disabled
04/14/2022 02:08:01 SLK 1234 4321 86219734 BADERROR(3328:1416) Default PROCESS TIMING THRESHOLD 10
04/14/2022 02:08:08 SLK 1234 4321 86226078 ETHERNET(4264:5628) USER 1 Cluster 1 ID 00EQ3G038651
04/14/2022 02:08:08 SLK 1234 4321 86226078 ETHERNET(4264:5628) USER 2 CLuster 2 ID 00EQ3G026434
00000000-0000-0000-0000-000000000664 Error 2022-04-13T02:09:07 02:00 LegacyErrorLog SYSERR(4320:4404) ERRORLOG.CPP(606) sscmd(5812:1) KEYINPUT.CPP(209) Command: MANUAL ROLLOVER
00000000-0000-0000-0000-000000000649 Error 2022-04-13T02:09:07 02:00 LegacyErrorLog SYSERR(4320:4404) ERRORLOG.CPP(606) Closed the current active Error Log due to: Manual rollover.

And I want all the n lines (from all the files) before the pattern "Command: MANUAL ROLLOVER" excluding that line, then parse the data in an Excel spreadsheet like this :

Client ID ||     Date     ||   Time    || Error code|| Detail
SLK-1234      04/14/2022     02:07:19     12345        86177500 ERROR - UPDSERVS: SERVICE :SOFTWARESERVICE - IS NOT INSTALLED
SLK-1234      04/14/2022     02:07:19     12345        86216625 ERROR - Bin File Creation and Dump Raw Output Disabled
SLK-1234      04/14/2022     02:07:19      4321        86219734 BADERROR(3328:1416) Default PROCESS TIMING THRESHOLD 10

I "think" I got the first part correct with this code :

$move = "X:\New\test\Output"
$root = "X:\New\test"
$files = Get-ChildItem -Path $root -Filter *.*

$Results = foreach( $File in $Files ){
$location = $root "\" $file

    $s = Select-String -Path "$location" -Pattern "Command: ROLLOVER ERRORLOG" -Context ?,0 |
        Foreach-Object { $_.Line,$_.Context.PreContext[0].Trim()}

But I've rarely extracted to Excel so I don't know How I could achieve the rest :/.

CodePudding user response:

I would use switch to handle the files line-by-line and use a (quite long) regex to parse out the data into objects.
Once you have gone though the log file and gathered the data as objects, you can use Export-Csv to write it all out in structured CSV files you can simply open in Excel.

$move  = "X:\New\test\Output"
$root  = "X:\New\test"
$files = Get-ChildItem -Path $root -Filter '*.log' -File

$Results = foreach($file in $files) {
    $data = switch -Regex -File $file.FullName {
        # exit the switch if we reach a line with 'Command: MANUAL ROLLOVER'
        'Command: MANUAL ROLLOVER' { break }
        # parse the string into named matches
        '^(?<date>\d{2}/\d{2}/\d{4})\s (?<time>\d{2}:\d{2}:\d{2})\s (?<idleft>[^\s] )\s (?<idright>\d )\s (?<error>\d )\s (?<details>(. ))$' {
            # output an object with the wanted properties
            [PsCustomObject]@{
                'Client ID'  = '{0}-{1}' -f $matches['idleft'], $matches['idright']
                'Date'       = $matches['date']
                'Time'       = $matches['time']
                'Error code' = $matches['error']
                'Detail'     = $matches['details']
            }
        }
    }
    $target = Join-Path -Path $move -ChildPath $file.Name
    # export the gathered data to a csv you can double-click to open in Excel
    $data | Export-Csv -Path $target -UseCulture -NoTypeInformation
} 

Using your example log, this would open in Excel as:

enter image description here


From your comment I understand you want to parse the data from all the files and save it in one large csv file in the same folder. For that do:

$root  = "X:\New\test"
$files = Get-ChildItem -Path $root -Filter '*.log' -File

$Results = foreach($file in $files) {
    switch -Regex -File $file.FullName {
        # exit the switch if we reach a line with 'Command: MANUAL ROLLOVER'
        'Command: MANUAL ROLLOVER' { break }
        # parse the string into named matches
        '^(?<date>\d{2}/\d{2}/\d{4})\s (?<time>\d{2}:\d{2}:\d{2})\s (?<idleft>[^\s] )\s (?<idright>\d )\s (?<error>\d )\s (?<details>(. ))$' {
            # output an object with the wanted properties
            [PsCustomObject]@{
                'Client ID'  = '{0}-{1}' -f $matches['idleft'], $matches['idright']
                'Date'       = $matches['date']
                'Time'       = $matches['time']
                'Error code' = $matches['error']
                'Detail'     = $matches['details']
            }
        }
    }
} 
$target = Join-Path -Path $root -ChildPath 'LogResults.csv'
# export the gathered data to a csv you can double-click to open in Excel
$Results | Export-Csv -Path $target -UseCulture -NoTypeInformation
  • Related