Home > Software engineering >  How to get number of delimiters in line 1 and adding the number of delimiters to the end of the line
How to get number of delimiters in line 1 and adding the number of delimiters to the end of the line

Time:11-05

Good day

I want to count all the “,” in a line (row) 1 in a file and then compare it to each line thereafter, if there are less “,” that in line one, add the difference “,” at the end of that line into a new file, by making use of Powershell.

How do I create a new file with the missing delimiters?

Any ideas and help will be appreciated.

Data File:

1,2,3,4,5,6,7,8,9,10
1,2,3,4,5,6,7,8,9
1,2,3,4,5,6,7,8
1,2,3,4,5,6,7
1,2,3,4,5,6
1,2,3,4,5,6,7,8,9,10
1,2,3,4,5
1,2,3,4
1,2,3
1,2

New Data File Needed:

1,2,3,4,5,6,7,8,9,10
1,2,3,4,5,6,7,8,9,
1,2,3,4,5,6,7,8,,
1,2,3,4,5,6,7,,,
1,2,3,4,5,6,,,,
1,2,3,4,5,6,7,8,9,10
1,2,3,4,5,,,,,
1,2,3,4,,,,,,
1,2,3,,,,,,,
1,2,,,,,,,,

Here is my script:

$file = 'C:\Test\File1.csv'    
$i = 1
$reader = [System.IO.File]::OpenText($file)
$line = $reader.ReadLine()
$reader.Close()
$delimiter = $line.Split(",").Count
$reader = [System.IO.File]::OpenText($file)
     try {
        for() {
            $line = $reader.ReadLine()
             if ($line -eq $null) { break }
         $c = $line.Split(",").Count
         if($c -ne $delimiter -and $i -ne ${lines})  {
            ($d = $delimiter - $c) 
            if($c -ne $delimiter) {
            $varline = $line   ",".PadRight($d, ",") 
            $line -replace $line, $varline

            
                
            Write-Host "Line number: $i"
            Write-Host "Delimiters in line: $c"
            Write-Host "Delimiters should be: $delimiter" 
            Write-Host "Diffrence: $d"
            Write-Host "New Line: $varline"
            } 
                 }
                 $i  
         } 
     } 
     finally {
        $reader.Close()
     }
Set-Content "C:\Test\TestMod.csv" -Value $varline

Write-Host Output: (Only for testing purposes, not needed)

1
1,2,3,4,5,6,7,8,9,
Line number: 2
Delimiters in line: 9
Delimiters should be: 10
Diffrence: 1
New Line: 1,2,3,4,5,6,7,8,9,
2
1,2,3,4,5,6,7,8,,
Line number: 3
Delimiters in line: 8
Delimiters should be: 10
Diffrence: 2
New Line: 1,2,3,4,5,6,7,8,,
3
1,2,3,4,5,6,7,,,
Line number: 4
Delimiters in line: 7
Delimiters should be: 10
Diffrence: 3
New Line: 1,2,3,4,5,6,7,,,
4
1,2,3,4,5,6,,,,
Line number: 5
Delimiters in line: 6
Delimiters should be: 10
Diffrence: 4
New Line: 1,2,3,4,5,6,,,,
5
1,2,3,4,5,,,,,
Line number: 7
Delimiters in line: 5
Delimiters should be: 10
Diffrence: 5
New Line: 1,2,3,4,5,,,,,
6
1,2,3,4,,,,,,
Line number: 8
Delimiters in line: 4
Delimiters should be: 10
Diffrence: 6
New Line: 1,2,3,4,,,,,,
8
1,2,,,,,,,,
Line number: 10
Delimiters in line: 2
Delimiters should be: 10
Diffrence: 8
New Line: 1,2,,,,,,,,

CodePudding user response:

You can build an arbitrary processing pipeline for modifying each line in a text file and outputting the results to a new text file, like this:

Get-Content $outputPath |ForEach-Object { <# process file content here #> } |Set-Content $outputPath

Since the processing of every single line except for that first one consists of the same operations every time:

  • Count number of ,'s
  • Calculate difference from number of ,'s to the count from the first line
  • Append trailling ,'s

... all we need is two variables - one to keep track of whether we've inspected the first line, and one to hold the comma-count:

$inFile = 'C:\Test\File1.csv'
$outFile = 'C:\Test\output.csv'

$initialized = $false
$commaCount  = 0

Get-Content -LiteralPath $inFile |ForEach-Object {
  if(-not $initialized){
    # first line, let's just count the commas
    $commaCount = $_.Split(',').Length - 1
    $initialized = $true
  } else {
    # subsequent lines, calculate the difference in number of commas 
    $commaCountDiff = $commaCount - ($_.Split(',').Length - 1)
    if($commaCountDiff -gt 0){
      # and add missing ones 
      $_ = "${_}$(',' * $commaCountDiff)"
    }
  }

  # output the line (modified or not)
  $_
} |Set-Content -LiteralPath $outFile

CodePudding user response:

You could also do this by looping the data lines twice:

$data = Get-Content -Path 'C:\Test\File1.csv'
# first loop to get the maximum number of fields in the data
$maxFields = ($data | ForEach-Object {($_ -split ',').Count} | Measure-Object -Maximum).Maximum
# second loop to output the lines with extra commas (empty fields) appended
$data | ForEach-Object {
    $_   ',' * ($maxFields - ($_ -split ',').Count)
} | Set-Content -Path 'C:\Test\TestMod.csv'

Output:

1,2,3,4,5,6,7,8,9,10
1,2,3,4,5,6,7,8,9,
1,2,3,4,5,6,7,8,,
1,2,3,4,5,6,7,,,
1,2,3,4,5,6,,,,
1,2,3,4,5,6,7,8,9,10
1,2,3,4,5,,,,,
1,2,3,4,,,,,,
1,2,3,,,,,,,
1,2,,,,,,,,

CodePudding user response:

A big thanks to Mathias R. Jessen

Here is the code that worked for me:

$inFile = 'C:\Test\File1.csv'
$outFile = 'C:\Test\output.csv'

$initialized = $false
$commaCount  = 0

Get-Content -LiteralPath $inFile |ForEach-Object {
  if(-not $initialized){
    # first line, let's just count the commas
    $commaCount = $_.Split(',').Length - 1
    $initialized = $true
  } else {
    # subsequent lines, calculate the difference in number of commas 
    $commaCountDiff = $commaCount - $_.Split(',').Length  1
    if($commaCountDiff -gt 0){
      # and add missing ones 
      $_ = "${_}$(',' * $commaCountDiff)"
    }
  }

  # output the line (modified or not)
  $_
} |Set-Content -LiteralPath $outFile
  • Related