I am currently struggling with expanding the JSON that is embedded in one particular column of a CSV table using PowerShell. I have summarized the problem below, maybe someone of you can help.
Problem:
I have written a script to export log files from a data source into CSV. The CSV contains a set of columns. In certain situations I want to reduce the amount of columns of the CSV and expand any embedded JSON within the CSV so I can perform additional processing steps on the file. Selecting just the relevant columns I can do with:
$csvFile = Import-Csv -Path $filePath -Delimiter ","
$csvFileCompact = $csvFile | Select-Object -Property CreationDate, UserIds, Operations, AuditData
However, the column "AuditData" itself contains JSON formatted entries which I would like to expand and then add as new columns to the Table resulting into the table changing to "CreationDate, UserIDs, Operations, AuditData_1, AuditData_2, ..." with AuditData_1, AuditData_2 etc. being the expanded JSON entries.
I tried to play around with ConvertFrom-JSON
and to access the objects data on several ways ($obj | Select -ExpandProperty "SomeProp")
or $obj."SomeProp"
to apply it on "AuditData" but seem to miss something here. I started to play with PowerShell just yesterday due to a sudden "demand", so I am a bit under time pressure and probably missing something super obvious here.
Edit: Adding some example data just to give some additional insights. Will test the suggestions in some hours but looks already very promising!
the following JSON is embedded in the linked CSV:
{"CreationTime":"2022-04-06T18:56:16","Id":"abc","Operation":"Update","OrganizationId":"abcd","RecordType":2,"ResultStatus":"Succeeded","UserKey":"1000000","UserType":0,"Version":1,"Workload":"Exchange","ClientIP":"2000:0000:000:000:abcd::d","UserId":"[email protected]","AppId":"abcde","ClientAppId":"abcdef","ClientIPAddress":"2000:0000:000:000:aaaa::d","ClientInfoString":"Client=REST;Client=RESTSystem;;","ClientRequestId":"abcd","ExternalAccess":false,"InternalLogonType":0,"LogonType":0,"LogonUserSid":"S-1","MailboxGuid":"abcdefg","MailboxOwnerSid":"S-2","MailboxOwnerUPN":"[email protected]","OrganizationName":"company.onmicrosoft.com","OriginatingServer":"V (10.00.0000.000)\r\n","Item":{"Attachments":" (10000b)","Id":"0000","InternetMessageId":"<000000.eurprd09.prod.outlook.com>","IsRecord":false,"ParentFolder":{"Id":"000001234","Path":"\\Calendar\\Person"},"SizeInBytes":123456,"Subject":"Hi"},"ModifiedProperties":["AttachmentCollection","HtmlBody","CreationTime"]}
CodePudding user response:
You might use a Select-Object
calculated property for this (see Example 11: Create calculated properties for each InputObject):
In the form of an expression:
$Object = $csvFile | Select-Object -Property CreationDate, UserIds, Operations,
{ $_.AuditData |ConvertFrom-Json }
Or in the form of a hash table:
$Object = $csvFile | Select-Object -Property CreationDate, UserIds, Operations,
@{ name = AuditData; Expression = { $_.AuditData |ConvertFrom-Json } }
From here you can further drill down into the json object properties, like:
$Object.AuditData.SomeProp
You might also consider to create a new calculated property out of the json (sub) properties of the $Object
:
$Object |Select-Object -Property CreationDate, UserIds, Operations, AuditData,
@{ name = SomeProp; Expression = { $_.AuditData.SomeProp } }