Home > Enterprise >  powershell txt file delimited by " " to csv
powershell txt file delimited by " " to csv

Time:10-15

I have a text file that is essentially two columns with n rows.

the columns are separated by a double space - powershells import-csv cmdlet only allows for a single character delimiter.

Whats my best way around this as my aim is to get my first column into a variable.

CodePudding user response:

A simple approach if your current delimiter is only present as a delimiter would be to pick a new delimiting character that is not present anywhere in the file. If we choose ;, you may do the following:

$content = (Get-Content file.csv) -replace '  ',';' | 
    ConvertFrom-Csv -Delimiter ';'
# outputs column1 values
$content.column1

If you cannot guarantee your delimiter won't be within the file, you will need to qualify the current text (usually with double quotes) to prevent the character from being wrongly accused of being a delimiter. This only needs to be done if your data is not qualified already.

$content = (Get-Content file.csv) -replace '(?=")|^|$|(?<=  )|(?=  )','"' -replace '  ',';' |
    ConvertFrom-Csv -Delimiter ';'

If your data is already qualified, then you will only want to replace the double space where it appears as a delimiter. But you will need to be mindful of qualifying characters appearing as data as well.

$content = (Get-Content file.csv) -replace '(?<!^|"  )"(?!$|  ")','""' -replace '(?<=")  (?=")',';' |
    ConvertFrom-Csv -Delimiter ';'

I am sure exceptions can be created for any of the approaches above. You will need to know your data and come up with verification checks to validate your data schema.

CodePudding user response:

AdminOfThings' helpful answer provides good general pointers and generic solutions for parsing all columns.

Re:

to get my first column into a variable

# Create a sample file.
@'
foo  ...
one two  ...
more  ...
'@ > sample.txt

# Extract the first column.
$col1Values = (Get-Content sample.txt) -replace '  .*$'

Outputting $col1Values yields:

foo
one two
more

Note: If the first column's values are double-quoted (and don't contain embedded, escaped double quotes), append -replace '"' (or (...).Trim('"'))

  • Related