Good day
I want to count all the “,” in a line (row) 1 in a file and then compare it to each line thereafter, if there are less “,” that in line one, add the difference “,” at the end of that line into a new file, by making use of Powershell.
How do I create a new file with the missing delimiters?
Any ideas and help will be appreciated.
Data File:
1,2,3,4,5,6,7,8,9,10
1,2,3,4,5,6,7,8,9
1,2,3,4,5,6,7,8
1,2,3,4,5,6,7
1,2,3,4,5,6
1,2,3,4,5,6,7,8,9,10
1,2,3,4,5
1,2,3,4
1,2,3
1,2
New Data File Needed:
1,2,3,4,5,6,7,8,9,10
1,2,3,4,5,6,7,8,9,
1,2,3,4,5,6,7,8,,
1,2,3,4,5,6,7,,,
1,2,3,4,5,6,,,,
1,2,3,4,5,6,7,8,9,10
1,2,3,4,5,,,,,
1,2,3,4,,,,,,
1,2,3,,,,,,,
1,2,,,,,,,,
Here is my script:
$file = 'C:\Test\File1.csv'
$i = 1
$reader = [System.IO.File]::OpenText($file)
$line = $reader.ReadLine()
$reader.Close()
$delimiter = $line.Split(",").Count
$reader = [System.IO.File]::OpenText($file)
try {
for() {
$line = $reader.ReadLine()
if ($line -eq $null) { break }
$c = $line.Split(",").Count
if($c -ne $delimiter -and $i -ne ${lines}) {
($d = $delimiter - $c)
if($c -ne $delimiter) {
$varline = $line ",".PadRight($d, ",")
$line -replace $line, $varline
Write-Host "Line number: $i"
Write-Host "Delimiters in line: $c"
Write-Host "Delimiters should be: $delimiter"
Write-Host "Diffrence: $d"
Write-Host "New Line: $varline"
}
}
$i
}
}
finally {
$reader.Close()
}
Set-Content "C:\Test\TestMod.csv" -Value $varline
Write-Host Output: (Only for testing purposes, not needed)
1
1,2,3,4,5,6,7,8,9,
Line number: 2
Delimiters in line: 9
Delimiters should be: 10
Diffrence: 1
New Line: 1,2,3,4,5,6,7,8,9,
2
1,2,3,4,5,6,7,8,,
Line number: 3
Delimiters in line: 8
Delimiters should be: 10
Diffrence: 2
New Line: 1,2,3,4,5,6,7,8,,
3
1,2,3,4,5,6,7,,,
Line number: 4
Delimiters in line: 7
Delimiters should be: 10
Diffrence: 3
New Line: 1,2,3,4,5,6,7,,,
4
1,2,3,4,5,6,,,,
Line number: 5
Delimiters in line: 6
Delimiters should be: 10
Diffrence: 4
New Line: 1,2,3,4,5,6,,,,
5
1,2,3,4,5,,,,,
Line number: 7
Delimiters in line: 5
Delimiters should be: 10
Diffrence: 5
New Line: 1,2,3,4,5,,,,,
6
1,2,3,4,,,,,,
Line number: 8
Delimiters in line: 4
Delimiters should be: 10
Diffrence: 6
New Line: 1,2,3,4,,,,,,
8
1,2,,,,,,,,
Line number: 10
Delimiters in line: 2
Delimiters should be: 10
Diffrence: 8
New Line: 1,2,,,,,,,,
CodePudding user response:
You can build an arbitrary processing pipeline for modifying each line in a text file and outputting the results to a new text file, like this:
Get-Content $outputPath |ForEach-Object { <# process file content here #> } |Set-Content $outputPath
Since the processing of every single line except for that first one consists of the same operations every time:
- Count number of
,
's - Calculate difference from number of
,
's to the count from the first line - Append trailling
,
's
... all we need is two variables - one to keep track of whether we've inspected the first line, and one to hold the comma-count:
$inFile = 'C:\Test\File1.csv'
$outFile = 'C:\Test\output.csv'
$initialized = $false
$commaCount = 0
Get-Content -LiteralPath $inFile |ForEach-Object {
if(-not $initialized){
# first line, let's just count the commas
$commaCount = $_.Split(',').Length - 1
$initialized = $true
} else {
# subsequent lines, calculate the difference in number of commas
$commaCountDiff = $commaCount - ($_.Split(',').Length - 1)
if($commaCountDiff -gt 0){
# and add missing ones
$_ = "${_}$(',' * $commaCountDiff)"
}
}
# output the line (modified or not)
$_
} |Set-Content -LiteralPath $outFile
CodePudding user response:
You could also do this by looping the data lines twice:
$data = Get-Content -Path 'C:\Test\File1.csv'
# first loop to get the maximum number of fields in the data
$maxFields = ($data | ForEach-Object {($_ -split ',').Count} | Measure-Object -Maximum).Maximum
# second loop to output the lines with extra commas (empty fields) appended
$data | ForEach-Object {
$_ ',' * ($maxFields - ($_ -split ',').Count)
} | Set-Content -Path 'C:\Test\TestMod.csv'
Output:
1,2,3,4,5,6,7,8,9,10
1,2,3,4,5,6,7,8,9,
1,2,3,4,5,6,7,8,,
1,2,3,4,5,6,7,,,
1,2,3,4,5,6,,,,
1,2,3,4,5,6,7,8,9,10
1,2,3,4,5,,,,,
1,2,3,4,,,,,,
1,2,3,,,,,,,
1,2,,,,,,,,
CodePudding user response:
A big thanks to Mathias R. Jessen
Here is the code that worked for me:
$inFile = 'C:\Test\File1.csv'
$outFile = 'C:\Test\output.csv'
$initialized = $false
$commaCount = 0
Get-Content -LiteralPath $inFile |ForEach-Object {
if(-not $initialized){
# first line, let's just count the commas
$commaCount = $_.Split(',').Length - 1
$initialized = $true
} else {
# subsequent lines, calculate the difference in number of commas
$commaCountDiff = $commaCount - $_.Split(',').Length 1
if($commaCountDiff -gt 0){
# and add missing ones
$_ = "${_}$(',' * $commaCountDiff)"
}
}
# output the line (modified or not)
$_
} |Set-Content -LiteralPath $outFile