Regex search and generate output in csv

Time:10-18

I am working on the requirement below. I have the input file shown below and am trying to extract specific information from it with a regex and write all matches to an output CSV file.

Sample file

[2022-05-31T16:56:25.551558-04:00] [XFM] [TRACE:1] [EPMHFM-00000] [XFM] [ecid: XDS.0000.0000.0000.0001] [File: c:\jenkins\workspace\hfm_11.2.8_rue_build\hfm\source\xfmdatasourceroot\xfmdatasource\xfmdatasource.cpp] [Line: 601] [userId: ] [Msg arguments: ] [appName: COMMAPS4] [pid: 8364] [tid: 14160] [host: win2019standard] [nwaddr: [fe80::3d7e:f4de:2f83:8e30%4]:0;10.65.51.209:0;] [errorCode: 0] [srcException: 0] [errType: 1] [dbUpdate: 2] [11.2.8.1.000.9]  [[XDS: XFMDataSource process starting...
 
[2022-08-28T20:04:39.037507-04:00] [XFM] [TRACE:1] [EPMHFM-00000] [XFM] [ecid: ] [File: c:\jenkins\workspace\hfm_11.2.6_build\hfm\source\xfmdatasourceroot\xfmdatasource\xfmdatasource.cpp] [Line: 489] [userId: ] [Msg arguments: ] [appName: COMMAPS4] [pid: 11496] [tid: 9268] [host: Win2019standard] [nwaddr: [fe80::dca:9652:390a:7031]:0;10.199.36.35:0;] [errorCode: 0] [srcException: 0] [errType: 1] [dbUpdate: 2] [11.2.6.0.000.38]  [[XDS: XFMDataSource process exiting now ...

Below is the PowerShell command, with the regex, that I used to extract the desired values.

Get-Content -Path "C:\Users\test.txt"  |select-String -pattern "(\d{4}\-\d{2}\-\d{2})|`(\d\d:\d\d:\d\d|host :\s([a-zA-Z \d] ))|appName :\s([a-zA-Z \d] )|`(11.\d.\d.\d.\d\d\d.\d\d)|pid :\s([a-zA-Z \d] )|XDS :\s([a-zA-Z \d] )"-AllMatches|ForEach-Object {$_.Matches.value} |ForEach-Object {$_.Groups[1].Value}

Sample output:

2022-05-31

16:56:25

appName: COMMAPS4

pid: 8364

host: win2019standard

XDS: XFMDataSource process starting

========

The goal is to create a CSV file in the format below, containing all the results.

Date, Time, Appname, PID, Host, Message
2022-05-31, 16:56:25, COMMAPS4, 8364, win2019standard, XFMDataSource starting

I tried Out-File and Export-Csv, but neither gives the desired output:

$content =Get-Content -Path "C:\Users\test.txt"
$regex = '(\d{4}\-\d{2}\-\d{2})|(\d\d:\d\d:\d\d|host :\s([a-zA-Z \d] ))|appName :\s([a-zA-Z \d] )|(11.\d.\d.\d.\d\d\d.\d\d)|pid :\s([a-zA-Z \d] )|XDS :\s([a-zA-Z \d] )'
[regex]::Matches($content, $regex) | % {
    [PSCustomObject]@{
        Date        = $_.Groups.Value[1]
        Time        = $_.Groups.Value[2]
        Server      = $_.Groups.Value[3]
        Application = $_.Groups.Value[4]
        Version     = $_.Groups.Value[5]
        PID         = $_.Groups.Value[6]
        Message     = $_.Groups.Value[7]
    }
} | Export-Csv -Path C:\Users\test.csv

CodePudding user response:

It looks like you could accomplish this with the regex below. Using the sample from the question, the resulting objects would look like this:

Date       Time     AppName  PID   Host            Message
----       ----     -------  ---   ----            -------
2022-05-31 16:56:25 COMMAPS4 8364  win2019standard XFMDataSource process starting
2022-08-28 20:04:39 COMMAPS4 11496 Win2019standard XFMDataSource process exiting now

Code:

$re = [regex] @'
(?xi)
  (?<Date>\d{4}(?:-\d{2}){2}).*?
  (?<Time>\d{2}(?::\d{2}){2}).*?appName[\s:]*
  (?<AppName>[\w ]+).*?pid[\s:]*
  (?<PID>\d+).*?host[\s:]*
  (?<Host>[\w ]+).*?XDS[\s:]*
  (?<Message>[\w ]+)
'@

$log = Get-Content path\to\log.txt -Raw

$re.Matches($log) | ForEach-Object {
    $out = [ordered]@{}
    foreach($group in $_.Groups) {
        if($group.Name -eq 0) { continue }
        $out[$group.Name] = $group.Value
    }
    [pscustomobject] $out
} | Export-Csv path\to\export.csv -NoTypeInformation
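
To try the approach without a log file on disk, here is a minimal, self-contained sketch. It uses a copy of the pattern with a + quantifier on each of the last four capture groups and runs it against abbreviated copies of the two sample entries. Dropping several bracketed fields is an assumption made only to keep the lines short. The names $sample and $rows, the Trim() call on the message, and the use of ConvertTo-Csv (which prints CSV text rather than writing a file) are illustrative choices and not part of the code above.

```powershell
# Abbreviated copies of the two sample entries from the question.
# Fields the pattern does not need are dropped (an assumption for brevity).
$sample = @'
[2022-05-31T16:56:25.551558-04:00] [XFM] [ecid: XDS.0000.0000.0000.0001] [appName: COMMAPS4] [pid: 8364] [tid: 14160] [host: win2019standard] [errorCode: 0]  [[XDS: XFMDataSource process starting...
[2022-08-28T20:04:39.037507-04:00] [XFM] [ecid: ] [appName: COMMAPS4] [pid: 11496] [tid: 9268] [host: Win2019standard] [errorCode: 0]  [[XDS: XFMDataSource process exiting now ...
'@

# (?x) ignores whitespace in the pattern, (?i) makes it case-insensitive.
$re = [regex] @'
(?xi)
  (?<Date>\d{4}(?:-\d{2}){2}).*?
  (?<Time>\d{2}(?::\d{2}){2}).*?appName[\s:]*
  (?<AppName>[\w ]+).*?pid[\s:]*
  (?<PID>\d+).*?host[\s:]*
  (?<Host>[\w ]+).*?XDS[\s:]*
  (?<Message>[\w ]+)
'@

# One match per log entry; read each named group into one object (one CSV row).
$rows = foreach ($m in $re.Matches($sample)) {
    [pscustomobject]@{
        Date    = $m.Groups['Date'].Value
        Time    = $m.Groups['Time'].Value
        AppName = $m.Groups['AppName'].Value
        PID     = $m.Groups['PID'].Value
        Host    = $m.Groups['Host'].Value
        # Trim the space that can precede the trailing "..." in the log text.
        Message = $m.Groups['Message'].Value.Trim()
    }
}

# Emit CSV text to the pipeline; swap in Export-Csv to write a file instead.
$rows | ConvertTo-Csv -NoTypeInformation
```

Because the whole entry is matched by a single pattern with named groups, each match becomes exactly one row. An alternation of independent sub-patterns (a|b|c) instead yields one match per field, which is why the values come out on separate lines.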