Home > Back-end >  Export the matched output to a CSV as Column 1 and Column 2 using Powershell
Export the matched output to a CSV as Column 1 and Column 2 using Powershell

Time:06-12

I have below code to match the pattern and save it in CSV file. I need to save regex1 and regex2 as col1 and col2 in csv instead of saving all in 1st col.

 $inputfile = ( Get-Content D:\Users\naham1224\Desktop\jil.txt )
 $FilePath = "$env:USERPROFILE\Desktop\jil2.csv"
 $regex1 = "(insert_job: [A-Za-z]*_*\S*)"
 $regex2 = "(machine: [A-Z]*\S*)" 
 $inputfile | 
         Select-String -Pattern $regex2,$regex1 -AllMatches | 
         ForEach-Object {$_.matches.groups[1].value} | 
         Add-Content $FilePath`

Input file contains : input.txt

/* ----------------- AUTOSYS_DBMAINT ----------------- */ 

insert_job: AUTOSYS_DBMAINT   job_type: CMD 
command: %AUTOSYS%\bin\DBMaint.bat
machine: PWISTASASYS01
owner: svc.autosys@cbs
permission: 
date_conditions: 1
days_of_week: su,mo,tu,we,th,fr,sa
start_times: "03:30"
description: "Runs DBmaint process on AE Database - if fails - MTS - will run next scheduled time"
std_out_file: ">$$LOGS\dbmaint.txt"
std_err_file: ">$$LOGS\dbmaint.txt"
alarm_if_fail: 0
alarm_if_terminated: 0
send_notification: 0
notification_msg: "Check DBMaint output in autouser.PD1\\out directory"
notification_emailaddress: [email protected]

/* ----------------- TEST_ENV ----------------- */ 

insert_job: TEST_ENV   job_type: CMD 
command: set
machine: PWISTASASYS01
owner: svc.autosys@cbs
permission: 
date_conditions: 1
days_of_week: su,mo,tu,we,th,fr,sa
start_times: "03:30"
description: "output env"
std_out_file: ">C:\Users\svc.autosys\Documents\env.txt"
std_err_file: ">C:\Users\svc.autosys\Documents\env.txt"
alarm_if_fail: 1
alarm_if_terminated: 1

Current output :

Current output

Expected output :

Expected output

I am trying various ways to do so but no luck. any suggestions and help is greatly appreciated.

CodePudding user response:

Here is how I would do this:

$inputPath = 'input.txt'
$outputPath = 'output.csv'

# RegEx patterns to extract data.
$patterns = @(
    '(insert_job): ([A-Za-z]*_*\S*)'
    '(machine): ([A-Z]*\S*)'
)

# Create an ordered Hashtable to collect columns for one row.
$row = [ordered] @{}

# Loop over all occurences of the patterns in input file
Select-String -Path $inputPath -Pattern $patterns -AllMatches | ForEach-Object {

    # Extract key and value from current match
    $key   = $_.matches.Groups[ 1 ].Value
    $value = $_.matches.Value

    # Save one column of current row.
    $row[ $key ] = $value

    # If we have all columns of current row, output it as PSCustomObject.
    if( $row.Count -eq $patterns.Count ) {

        # Convert hashtable to PSCustomObject and output (implicitly)
        [PSCustomObject] $row
    
        # Clear Hashtable in preparation for next row.
        $row.Clear()
    }
} | Export-Csv $outputPath -NoTypeInformation

Output CSV:

"insert_job","machine"
"insert_job: AUTOSYS_DBMAINT","machine: PWISTASASYS01"
"insert_job: TEST_ENV","machine: PWISTASASYS01"

Remarks:

  • Using Select-String with parameter -Path we don't have to read the input file beforehand.

  • An ordered Hashtable (a dictionary) is used to collect all columns, until we have an entire row to output. This is the crucial step to produce multiple columns instead of outputting all data in a single column.

  • Converting the Hashtable to a PSCustomObject is necessary because Export-Csv expects objects, not dictionaries.

  • While the CSV looks like your "expected output" and you possibly have good reason to expect it like that, in a CSV file the values normally shouldn't repeat the column names. To remove the column names from the values, simply replace $value = $_.matches.Value by $_.matches.Groups[ 2 ].Value, which results in an output like this:

    "insert_job","machine"
    "AUTOSYS_DBMAINT","PWISTASASYS01"
    "TEST_ENV","PWISTASASYS01"
    

As for what you have tried:

Add-Content writes only plain text files from string input. While you could use it to create CSV files, you would have to add separators and escape strings all by yourself, which is easy to get wrong and more hassle than necessary. Export-CSV otoh takes objects as inputs and cares about all of the CSV format details automatically.

CodePudding user response:

As zett42 mentioned Add-Content is not the best fit for this. Since you are looking for multiple values separated by commas Export-Csv is something you can use. Export-Csv will take objects from the pipeline, convert them to lines of comma-separated properties, add a header line, and save to file

I took a little bit of a different approach here with my solution. I've combined the different regex patterns into one which will give us one match that contains both the job and machine names.

$outputPath = "$PSScriptRoot\output.csv"

# one regex to match both job and machine in separate matching groups
$regex = '(?s)insert_job: (\w ). ?machine: (\w )'

# Filter for input files
$inputfiles = Get-ChildItem -Path $PSScriptRoot -Filter input*.txt

# Loop through each file
$inputfiles |
    ForEach-Object {
        $path = $_.FullName
        Get-Content -Raw -Path $path | Select-String -Pattern $regex -AllMatches |
            ForEach-Object {
                # Loop through each match found in the file.
                # Should be 2, one for AUTOSYS_DBMAINT and another for TEST_ENV
                $_.Matches | ForEach-Object {
                    # Create objects with the values we want that we can output to csv file
                    [PSCustomObject]@{
                        # remove next line if not needed in output
                        InputFile = $path
                        Job       = $_.Groups[1].Value  # 1st matching group contains job name
                        Machine   = $_.Groups[2].Value  # 2nd matching group contains machine name
                    }
                }
            }
    } | Export-Csv $outputPath  # Pipe our objects to Export-Csv

Contents of output.csv

"InputFile","Job","Machine"
"C:\temp\powershell\input1.txt","AUTOSYS_DBMAINT","PWISTASASYS01"
"C:\temp\powershell\input1.txt","TEST_ENV","PWISTATEST2"
"C:\temp\powershell\input2.txt","AUTOSYS_DBMAINT","PWISTASAPROD1"
"C:\temp\powershell\input2.txt","TEST_ENV","PWISTATTEST1"
  • Related