I use the following ps command to batch read the contents of each txt file in the folder, and then export it to csv. It reads normally, but the file contents become one line, and all the newline characters are lost
ls |Select-Object -Property {Get-Content $_.FullName -Encoding utf8} | Export-Csv "a.csv" -Encoding UTF8
I want it to keep the format when I export it (mostly Spaces and newlines)
input and output : enter image description here
CodePudding user response:
Either: Use
-Raw
withGet-Content
to ensure that each file is read as a single, multi-line string.# Note the use of -Raw Get-ChildItem | Select-Object -Property { Get-Content -Raw $_.FullName -Encoding utf8 } | Export-Csv "a.csv" -Encoding UTF8
Or, if you want to control the newline format and whether each file's lines should have a trailing newline, use
Get-Content
without-Raw
and join the resulting array of lines with the newline character/sequence of choice using-join
:# Note the -join "`n" (joining the lines with Unix-format LF-only newlines). # Each resulting string will have *no* trailing newline, but you # you can add it with ((Get-Content ...) -join "`n") "`n" # Use "`r`n" for Windows-format CRLF newlines. Get-ChildItem | Select-Object -Property { (Get-Content $_.FullName -Encoding utf8) -join "`n" } | Export-Csv "a.csv" -Encoding UTF8
Note that your CSV file:
will have one column, whose fields are
"..."
-enclosed multi-line strings.- Use something like
Select-Object Name, { ... }
to also include each file's name.
- Use something like
that column's name will be the verbatim content of the script block (
{ ... }
you're passing toSelect-Object
's-Property
parameter.To give the column a better name, use a hashtable as the calculated property, whose
Name
entry allows you to specify the name, and whoseExpression
entry must be your script block; e.g.:... | Select-Object @{ Name='FileContents'; Expression={ Get-Content -Raw $_.FullName -Encoding utf8 } } | ...
As for what you tried:
When used without
-Raw
,Get-Content
streams a file's content line by line, and when that output is captured, it becomes an array.What the script block of a calculated property evaluates to is used as a whole to determine that property's value, so an array of lines is indeed the result.
When
Export-Csv
(orConvertTo-Csv
) encounter a property value that is an array (or a collection of similar type), it stringifies it, i.e. it must create a representation of the collection as a single string:Usually, this results in useless stringification simply by the collection's type name, e.g., in the case of an array, verbatim
System.Object[]
However, if - as in your case - the collection happens to be wrapped in a usually invisible
[psobject]
instance, a more helpful, albeit single-line stringification occurs: the (stringified) collection elements are joined with spaces in effect, the newlines from your file are therefore replaced with spaces.The reason a
[psobject]
wrapper is present in your case is that PowerShell wraps output objects from a pipeline that way - which includes output from a script block ({ ... }
).
To illustrate the above:
# NO [psobject] wrapper -> CSV serialization by *type name*:
PS> [pscustomobject] @{ ArrayProp = 'one', 'two' } | ConvertTo-Csv
"ArrayProp"
"System.Object[]" # !!
# WITH [psobject] wrapper -> CSV serialization by
# *space-concatenated elements*:
PS> [pscustomobject] @{ ArrayProp = [psobject] ('one', 'two') } | ConvertTo-Csv
"ArrayProp"
"one two" # !!