I'm trying to write my first PowerShell script and I'm struggling to get my head around the object orientation and piping approach. So I've got a directory structure like this (Google Takeout Nest directories):
Path\DeviceID-A\
- Year\
- Month\
- year-month-sensors-a.csv
Path\DeviceID-B\
- Year\
- Month\
- year-month-sensors-b.csv
...
I want to aggregate all the CSVs together and I've written a simple script that does this:
Get-ChildItem -Recurse -Filter *.csv | Select-Object -ExpandProperty FullName | Import-Csv | Export-Csv merged.csv -NoTypeInformation -Append
However, I lose the context of the DeviceID when everything is aggregated together. So I'd like to add an additional column that contains those initial root directories, with something similar to:
Import-Csv merged.csv | Select-Object *,@{Name='nest_id';Expression={$nestDeviceID}} | Export-Csv merged-v2.csv -NoTypeInformation
So each row in year-month-sensors-a.csv would have an additional column called nest_id with the value DeviceID-A, and the same would apply to any CSV under that root directory.
...but I can't quite reconcile the two scripts or work out how to iterate through the files programmatically, and whether I need to use ForEach-Object or piping.
Any advice would be appreciated!
CodePudding user response:
Before posting the code, it's worth mentioning that this should work as long as all the CSVs always have the same headers / columns. If that were not the case, something like this could happen:
@'
Col1,Col2
Val1,Val2
'@ |
ConvertFrom-Csv |
Export-Csv ./test.csv -NoTypeInformation
@'
Col1,NewCol
Val1,NewVal
'@ |
ConvertFrom-Csv |
Export-Csv ./test.csv -NoTypeInformation -Append
Doing this would throw the following Exception:
Cannot append CSV content to the following file: ./test.csv. The appended object does not have a property that corresponds to the following column: Col2. To continue with mismatched properties, add the -Force parameter, and then retry the command.
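If the headers do differ, one possible workaround is to project every row onto the union of all column names before exporting, so that missing values simply end up empty. The snippet below is only a minimal sketch of that idea: the sample rows and the test-union.csv file name are made up for illustration, and it assumes all rows fit in memory.

```powershell
# Hypothetical sample rows with mismatched columns (illustration only)
$rows = @(
    [pscustomobject]@{ Col1 = 'Val1'; Col2 = 'Val2' }
    [pscustomobject]@{ Col1 = 'Val1'; NewCol = 'NewVal' }
)

# Collect the union of all property names, keeping first-seen order
$columns = $rows |
    ForEach-Object { $_.PSObject.Properties.Name } |
    Select-Object -Unique

# Select every column on every row; properties a row lacks come out empty
$rows |
    Select-Object -Property $columns |
    Export-Csv ./test-union.csv -NoTypeInformation
```

Because every object now exposes the same set of properties, Export-Csv has a single consistent header to write.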
To answer your question: you're quite close. However, I don't think this is possible as a one-liner, and if it were, I would certainly not recommend it (others reading your code will thank you for that).
By one-liner I mean piping Get-ChildItem to Import-Csv without using a loop (ForEach-Object or foreach). That is the point where you lose the reference to the DeviceID folder name.
Points to clarify:
- The code first stores $mergedCsv in memory. Appending (-Append) to a CSV on every loop iteration when it isn't needed causes a lot of unnecessary disk I/O, so it is inefficient. It might not matter for you in this case, but it's good to know for your future scripts.
- $file.Directory.Parent.Parent.Name should give you the DeviceID folder's name; please correct me if I'm wrong.
- Each loop iteration imports one CSV and appends the new Nest_ID column, which contains the DeviceID folder name.
$mergedCsv = foreach ($file in Get-ChildItem -Recurse -Filter *.csv)
{
    # The file sits in DeviceID\Year\Month, so go up two levels from its folder
    $nestDeviceID = $file.Directory.Parent.Parent.Name
    $csv = Import-Csv $file.FullName
    # Emit every row with the extra Nest_ID column
    $csv | Select-Object *, @{
        Name       = 'Nest_ID'
        Expression = { $nestDeviceID }
    }
}
# Inspect the Merged CSV
$mergedCsv | Out-GridView
# Export it
$mergedCsv | Export-Csv merged.csv -NoTypeInformation
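Since the question asks about ForEach-Object versus piping, here is a rough sketch of the same idea written as a single pipeline. It streams rows straight into Export-Csv instead of collecting them in a variable first, and still writes the file only once. The function name Merge-NestCsv and its parameters are hypothetical, and it assumes the same DeviceID\Year\Month layout as above.

```powershell
# Hypothetical helper; the name and parameters are illustrative only
function Merge-NestCsv {
    param(
        [Parameter(Mandatory)] [string] $Root,    # folder holding the DeviceID directories
        [Parameter(Mandatory)] [string] $OutFile  # destination CSV
    )

    $outName = Split-Path -Path $OutFile -Leaf

    Get-ChildItem -Path $Root -Recurse -Filter *.csv -File |
        # Skip the output file in case it lives under $Root
        Where-Object Name -ne $outName |
        ForEach-Object {
            # DeviceID\Year\Month\file.csv: two levels above the file's folder
            $nestDeviceID = $_.Directory.Parent.Parent.Name
            Import-Csv -LiteralPath $_.FullName |
                Select-Object *, @{ Name = 'Nest_ID'; Expression = { $nestDeviceID } }
        } |
        Export-Csv -Path $OutFile -NoTypeInformation
}
```

It could be called as Merge-NestCsv -Root . -OutFile .\merged.csv. The Where-Object step is there because an existing merged.csv under the root folder would otherwise be picked up by Get-ChildItem and merged into itself.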