Home > Back-end >  Powershell export csv lost line breaks
Powershell export csv lost line breaks

Time:11-01

I use the following ps command to batch read the contents of each txt file in the folder, and then export it to csv. It reads normally, but the file contents become one line, and all the newline characters are lost

ls |Select-Object -Property {Get-Content $_.FullName -Encoding utf8} | Export-Csv "a.csv" -Encoding UTF8

I want it to keep the format when I export it (mostly Spaces and newlines)

input and output : enter image description here

CodePudding user response:

  • Either: Use -Raw with Get-Content to ensure that each file is read as a single, multi-line string.

    # Note the use of -Raw
    Get-ChildItem |
      Select-Object -Property { Get-Content -Raw $_.FullName -Encoding utf8 } | 
      Export-Csv "a.csv" -Encoding UTF8
    
  • Or, if you want to control the newline format and whether each file's lines should have a trailing newline, use Get-Content without -Raw and join the resulting array of lines with the newline character/sequence of choice using -join:

    # Note the -join "`n" (joining the lines with Unix-format LF-only newlines).     
    # Each resulting string will have *no* trailing newline, but you
    # you can add it with ((Get-Content ...) -join "`n")   "`n"
    # Use "`r`n" for Windows-format CRLF newlines.
    Get-ChildItem |
      Select-Object -Property { (Get-Content $_.FullName -Encoding utf8) -join "`n" } | 
      Export-Csv "a.csv" -Encoding UTF8
    

Note that your CSV file:

  • will have one column, whose fields are "..."-enclosed multi-line strings.

    • Use something like Select-Object Name, { ... } to also include each file's name.
  • that column's name will be the verbatim content of the script block ({ ... } you're passing to Select-Object's -Property parameter.

    • To give the column a better name, use a hashtable as the calculated property, whose Name entry allows you to specify the name, and whose Expression entry must be your script block; e.g.:

      ... |
      Select-Object @{ Name='FileContents'; Expression={ Get-Content -Raw $_.FullName -Encoding utf8 } } | ...
      

As for what you tried:

  • When used without -Raw, Get-Content streams a file's content line by line, and when that output is captured, it becomes an array.

  • What the script block of a calculated property evaluates to is used as a whole to determine that property's value, so an array of lines is indeed the result.

  • When Export-Csv (or ConvertTo-Csv) encounter a property value that is an array (or a collection of similar type), it stringifies it, i.e. it must create a representation of the collection as a single string:

    • Usually, this results in useless stringification simply by the collection's type name, e.g., in the case of an array, verbatim System.Object[]

    • However, if - as in your case - the collection happens to be wrapped in a usually invisible [psobject] instance, a more helpful, albeit single-line stringification occurs: the (stringified) collection elements are joined with spaces in effect, the newlines from your file are therefore replaced with spaces.

    • The reason a [psobject] wrapper is present in your case is that PowerShell wraps output objects from a pipeline that way - which includes output from a script block ({ ... }).

To illustrate the above:

# NO [psobject] wrapper -> CSV serialization by *type name*:
PS> [pscustomobject] @{ ArrayProp = 'one', 'two' } | ConvertTo-Csv

"ArrayProp"
"System.Object[]"  # !!
# WITH [psobject] wrapper -> CSV serialization by 
# *space-concatenated elements*:
PS> [pscustomobject] @{ ArrayProp = [psobject] ('one', 'two') } | ConvertTo-Csv

"ArrayProp"
"one two"  # !!
  • Related