Home > Mobile >  Filtering JSON by timestamp in powershell
Filtering JSON by timestamp in powershell

Time:02-23

I have retrieved json log data from a rest API as follows

[
  {
    "id": "6523276",
    "type": "logs",
    "attributes": {
        "created-at": "2022-02-22T10:50:26Z",
        "action": "delete",
        "resource-name": "DocumentABC.docx",
        "user-name": "Joe Smith"
        }
  },
  {
    "id": "6523275",
    "type": "logs",
    "attributes": {
        "created-at": "2022-02-22T10:03:22Z",
        "action": "create",
        "resource-name": "Document123.docx",
        "user-name": "Joe Smith"
    }
  },
  {
    "id": "6523274",
    "type": "logs",
    "attributes": {
        "created-at": "2022-02-22T06:42:21Z",
        "action": "open",
        "resource-name": "123Document.docx",
        "user-name": "Joe Smith"
    }
  }
]

I need to Post the json to another web app but I only want the last hour of logs. In the json example above, the current time was 2022-02-22T10:55:22Z, therefore I'm only interested in the first two log entries.

For example

[
  {
    "id": "6523276",
    "type": "logs",
    "attributes": {
        "created-at": "2022-02-22T10:50:26Z",
        "action": "delete",
        "resource-name": "DocumentABC.docx",
        "user-name": "Joe Smith"
        }
  },
  {
    "id": "6523275",
    "type": "logs",
    "attributes": {
        "created-at": "2022-02-22T10:03:22Z",
        "action": "create",
        "resource-name": "Document123.docx",
        "user-name": "Joe Smith"
    }
  }
]

Here is my powershell v7 script

$json = $json | ConvertFrom-Json
$filterTime = (Get-date).AddHours(-1)
$RFCfilterTime = [Xml.XmlConvert]::ToString($filterTime,[Xml.XmlDateTimeSerializationMode]::Utc)
$Filteredjson = $json | Where-Object $json.attributes[0] -ge $RFCfilterTimefilterDate  
$jsonToPost = ConvertTo-Json -InputObject @($Filteredjson) -Depth 5

The problem is ConvertFrom-Json changes the 'created-at' from RFC3339 format to 'datetime' format. Therefore the Where-Object filter doesn't work...

id                 type attributes
--                 ---- ----------
6523276            logs @{created-at=22/02/2022 10:50:26 AM; action…
6523275            logs @{created-at=22/02/2022 10:03:22 AM; action…
6523274            logs @{created-at=22/02/2022 6:42:21 AM; action=…

How do I change all of the 'created-at' objects back to RCF3339 format?

Is the $json | Where-Object $json.attributes[0] -ge $RFCfilterTimefilterDate
statement being used correctly?

Is there any easier way altogether?

CodePudding user response:

Your approach should work in principle, but there was a problem with your Where-Object statement - see the bottom section.
Mathias' answer shows how to work with the [datetime] instances that result from ConvertTo-Json's parsing directly, but a bit more work is required:

  • Indeed, in PowerShell (Core) v6 ConvertFrom-Json (which with JSON web services is used implicitly by Invoke-RestMethod) automatically deserializes ISO 8601-format date-time strings such as "2022-02-22T10:03:22Z" into [datetime] System.DateTime instances, and, conversely, on (re)serialization with ConvertTo-Json, [datetime] instances are (re)converted to ISO 8601 strings.

  • While this enables convenient chronological comparisons with other [datetime] instances, such as returned by Get-Date, there is a major pitfall: Only [datetime] instances that have the same .Kind property value compare meaningfully (possible values are Local, Utc, and Unspecified, the latter being treated like Local in comparisons).

  • Unfortunately, as of PowerShell 7.2.1, you don't get to control what .Kind of [datetime] instances Convert-FromJson constructs - it is implied by the specific date-time string format of each string recognized as an ISO 8601 date.

    • Similarly, on (re)serialization with ConvertTo-Json, the .Kind value determines the string format.
    • See this answer for details.

In your case, because your date-time strings have the Z suffix denoting UTC, [datetime] instances with .Kind Utc are constructed.

Therefore, you need to ensure that your comparison timestamp is a Utc [datetime] too, which calling .ToUniversalTime() on the Local instance that Get-Date outputs ensures:

# Note the need for .ToUniversalTime()
$filterTime = (Get-Date).ToUniversalTime().AddHours(-1)
# Note: Only works as intended if all date-time strings are "Z"-suffixed
$filteredData = $data | Where-Object { $_.attributes.'created-at' -ge $filterTime }

However, at least hypothetically a given JSON document may contain differing date-time string formats that result in different .Kind values.

The way to handle this case - as well as the case where the string format is consistent, but not necessarily known ahead of time - you can use the generally preferable [datetimeoffset] (System.DateTimeOffset) type, which automatically recognizes timestamps as equivalent even if their expression (local vs. UTC) is different:

# Note the use of [datetimeoffset]
[datetimeoffset] $filterTime = (Get-Date).AddHours(-1)
# With this approach the specific format of the date-time strings is irrelevant,
# as long as they're recognized as ISO 8601 strings.
$filteredData = $data | 
  Where-Object { [datetimeoffset] $_.attributes.'created-at' -ge $filterTime }

Note: Strictly speaking, it is sufficient for the LHS of the comparison to be of type [datetimeoffset] - a [datetime] RHS is then also handled correctly.

Potential future improvements:

  • GitHub issue #13598 proposes adding a -DateTimeKind parameter to ConvertFrom-Json, so as to allow explicitly requesting the kind of interest, and to alternatively construct [datetimeoffset] instances.

As for what you tried:

Is the $json | Where-Object $json.attributes[0] -ge $RFCfilterTimefilterDate statement being used correctly?

No:

  • You're using simplified syntax in which the LHS of the comparison (the -Property parameter) must be the name of a single (non-nested) property directly available on each input object.

  • Because nested property access is required in your case, the regular script-block-based syntax ({ ... }) must be used, in which case the input object at hand must be referenced explicitly via the automatic $_ variable.

  • .attributes[0] suggests you were trying to access the created-at property by index, which, however, isn't supported in PowerShell; you need to:

    • either: spell out the property's name, if known: $_.attributes.'created-at' - note the need to quote in this case, due to use of the nonstandard - char. in the name.

    • or: use the intrinsic .psobject member that provides reflection information about any given object: $_.attributes.psobject.Properties.Value[0]

Thus, with spelling out the property name, your statement should have been:

$json | Where-Object { $_.attributes.'created-at' -ge $RFCfilterTimefilterDate }

CodePudding user response:

The fact that newer version of ConvertFrom-Json implicitly parses timestamps as [datetime] is actually to your advantage - [datetime] values are comparable, so this simply means you can skip the step where you convert the threshold value to a string:

$data = $json | ConvertFrom-Json
$filterTime = (Get-Date).AddHours(-1)
$filteredData = $data | Where-Object {$_.attributes.'created-at' -ge $filterTime}
$jsonToPost = ConvertTo-Json -InputObject @($filteredData) -Depth 5
  • Related