Home > Software design >  Expand JSON embedded in CSV column using PowerShell
Expand JSON embedded in CSV column using PowerShell

Time:05-04

I am currently struggling with expanding the JSON that is embedded in one particular column of a CSV table using PowerShell. I have summarized the problem below, maybe someone of you can help.

Problem:

I have written a script to export log files from a data source into CSV. The CSV contains a set of columns. In certain situations I want to reduce the amount of columns of the CSV and expand any embedded JSON within the CSV so I can perform additional processing steps on the file. Selecting just the relevant columns I can do with:

$csvFile = Import-Csv -Path $filePath -Delimiter ","
$csvFileCompact = $csvFile | Select-Object -Property CreationDate, UserIds, Operations, AuditData

However, the column "AuditData" itself contains JSON formatted entries which I would like to expand and then add as new columns to the Table resulting into the table changing to "CreationDate, UserIDs, Operations, AuditData_1, AuditData_2, ..." with AuditData_1, AuditData_2 etc. being the expanded JSON entries.

I tried to play around with ConvertFrom-JSON and to access the objects data on several ways ($obj | Select -ExpandProperty "SomeProp") or $obj."SomeProp" to apply it on "AuditData" but seem to miss something here. I started to play with PowerShell just yesterday due to a sudden "demand", so I am a bit under time pressure and probably missing something super obvious here.

Edit: Adding some example data just to give some additional insights. Will test the suggestions in some hours but looks already very promising!

https://easyupload.io/376oki

the following JSON is embedded in the linked CSV:

{"CreationTime":"2022-04-06T18:56:16","Id":"abc","Operation":"Update","OrganizationId":"abcd","RecordType":2,"ResultStatus":"Succeeded","UserKey":"1000000","UserType":0,"Version":1,"Workload":"Exchange","ClientIP":"2000:0000:000:000:abcd::d","UserId":"[email protected]","AppId":"abcde","ClientAppId":"abcdef","ClientIPAddress":"2000:0000:000:000:aaaa::d","ClientInfoString":"Client=REST;Client=RESTSystem;;","ClientRequestId":"abcd","ExternalAccess":false,"InternalLogonType":0,"LogonType":0,"LogonUserSid":"S-1","MailboxGuid":"abcdefg","MailboxOwnerSid":"S-2","MailboxOwnerUPN":"[email protected]","OrganizationName":"company.onmicrosoft.com","OriginatingServer":"V (10.00.0000.000)\r\n","Item":{"Attachments":" (10000b)","Id":"0000","InternetMessageId":"<000000.eurprd09.prod.outlook.com>","IsRecord":false,"ParentFolder":{"Id":"000001234","Path":"\\Calendar\\Person"},"SizeInBytes":123456,"Subject":"Hi"},"ModifiedProperties":["AttachmentCollection","HtmlBody","CreationTime"]}

CodePudding user response:

You might use a Select-Object calculated property for this (see Example 11: Create calculated properties for each InputObject):

In the form of an expression:

$Object = $csvFile | Select-Object -Property CreationDate, UserIds, Operations,
    { $_.AuditData |ConvertFrom-Json }

Or in the form of a hash table:

$Object = $csvFile | Select-Object -Property CreationDate, UserIds, Operations,
    @{ name = AuditData; Expression = { $_.AuditData |ConvertFrom-Json } }

From here you can further drill down into the json object properties, like:

$Object.AuditData.SomeProp

You might also consider to create a new calculated property out of the json (sub) properties of the $Object:

$Object |Select-Object -Property CreationDate, UserIds, Operations, AuditData,
    @{ name = SomeProp; Expression = { $_.AuditData.SomeProp } }
  • Related