Powershell Issues with JSON format (log4jscanner & utf16)

Time:01-22

I'm trying to retrieve some JSON data from log4jscanner.exe (a Qualys tool that detects vulnerable Log4j files or components on a PC or server), but after spending many hours on it, I think I have an issue with PowerShell.

If I store the result of the following command in a variable in PowerShell 5.1:

    $a = .\Log4jScanner.exe /scan /report_pretty

the result is displayed like this:

PS C:\temp> $a |where {$_ -ne ""}
    {
        "scanSummary": {
            "scanEngine": "2.0.2.7",
            "scanHostname": "XXXXXXXXX",
            "scanDate": "2022-01-20T18:02:26 0100",
            "scanDurationSeconds": 28,
            "scanErrorCount": 54,
            "scanStatus": "Partially Successful",
            "scannedFiles": 649020,
            "scannedDirectories": 209514,
            "scannedJARs": 31,
            "scannedWARs": 0,
            "scannedEARs": 0,
            "scannedPARs": 0,
            "scannedTARs": 5,
            "scannedCompressed": 43,
            "vulnerabilitiesFound": 1
        },
        "scanDetails": [
            {
                "file": "XXXXXX.jar",
                "manifestVendor": "Unknown",
                "manifestVersion": "Unknown",
                "detectedLog4j": true,
                "detectedLog4j1x": true,
                "detectedLog4j2x": false,
                "detectedJNDILookupClass": false,
                "detectedLog4jManifest": false,
                "log4jVendor": "log4j",
                "log4jVersion": "1.2.17",
                "cve20214104Mitigated": false,
                "cve202144228Mitigated": true,
                "cve202144832Mitigated": true,
                "cve202145046Mitigated": true,
                "cve202145105Mitigated": true,
                "cveStatus": "Potentially Vulnerable ( CVE-2021-4104: Found )"
            }
        ]
    }

After that, I want to convert the data so that I can work with a specific value. First I try to convert it from JSON, but the text turns red and the following error occurs:

    PS C:\temp> $a | convertfrom-json
    convertfrom-json : Invalid object passed in, ':' or '}' expected. (2): {
    
        "scanSummary": {
    
            "scanEngine": "2.0.2.7",
    
            "scanHostname": "FRBOURWXT013379.vcn.ds.volvo.net",
    
            "scanDate": "2022-01-20T18:02:26 0100",
 .... .... ....

However, if I copy and paste the content of $a into another variable, like this:

$b = '
{
            "scanSummary": {
                "scanEngine": "2.0.2.7",
                "scanHostname": "XXXXXXXXX",
                "scanDate": "2022-01-20T18:02:26 0100",
                "scanDurationSeconds": 28,

... ... ... 
'

then I am able to access the converted data:

PS C:\temp> $b | convertfrom-json

scanSummary
-----------
@{scanEngine=2.0.2.7; scanHostname=XXXXXXXXX; scanDate=2022-01-20T18:02:26 0100; scanDurationSeconds=28; scanErrorCount=54; scanStatus=Partially Successful; scann...

At this point, the type of $a is Object[] and the type of $b is String.

So I tried to convert $a to a string:

PS C:\temp> $a = [string] $a
PS C:\temp> $a
{      "scanSummary": {          "scanEngine": "2.0.2.7",          "scanHostname": "XXXXXXXXX",          "scanDate": "2022-01-20T18:02:26 0100",          "scanDurationSeconds": 28,          "scanErrorCount": 54,          "scanStatus": "Partially Successful",          "scannedFiles": 649020,          "scannedDirectories": 209514,          "scannedJARs": 31,          "scannedWARs": 0,          "scannedEARs": 0,          "scannedPARs": 0,          "scannedTARs": 5,          "scannedCompressed": 43,          "vulnerabilitiesFound": 1      },      "scanDetails": [          {              "file": "XXXXX.jar",              "manifestVendor": "Unknown",              "manifestVersion": "Unknown",              "detectedLog4j": true,              "detectedLog4j1x": true,              "detectedLog4j2x": false,              "detectedJNDILookupClass": false,              "detectedLog4jManifest": false,              "log4jVendor": "log4j",              "log4jVersion": "1.2.17",              "cve20214104Mitigated": false,              "cve202144228Mitigated": true,              "cve202144832Mitigated": true,              "cve202145046Mitigated": true,              "cve202145105Mitigated": true,              "cveStatus": "Potentially Vulnerable ( CVE-2021-4104: Found )"          }      ]  }

and then convert it from JSON, but the result is a total mess:

PS C:\temp> $a | convertfrom-json
convertfrom-json : Invalid object passed in, ':' or '}' expected. (2): {      "scanSummary": {          "scanEngine": "2.0.2.7",
       "scanHostname": "XXXXXX",          "scanDate":
"2022-01-20T18:02:26 0100",          "scanDurationSeconds": 28,          "scanErrorCount":
54,          "scanStatus": "Partially Successful",          "scannedFiles": 649020,
"scannedDirectories": 209514,          "scannedJARs": 31,          "scannedWARs": 0,

Finally, if I export any data to a .json file, I can't open it properly with Notepad or Codium (every character is followed by NUL characters), whereas I can read it with Get-Content within PowerShell.

It seems there are some hidden characters or something similar, but I can't figure out how to easily convert and access the JSON data in my case.

Is there anything I'm missing?

Thanks a lot for your support, guys!

EDIT 1 - If I save the output, I can't open the .json file correctly, but PowerShell seems to understand it well.

CodePudding user response:

Log4jScanner.exe writes its output as UTF-16 (what Windows calls "Unicode").

There is a bug in PowerShell that causes the output from programs that send Unicode bytes to their STDOUT/STDERR streams to be mangled.

It's easy to confirm - when you run the command

Log4jScanner.exe /scan_directory C:\something /report_pretty > output.json

in cmd.exe, then output.json will be neat UTF-16:

0d 00 0a 00 7b 00 0d 00 0a 00 20 00 20 00 20 00  .␀.␀{␀.␀.␀ ␀ ␀ ␀
20 00 22 00 73 00 63 00 61 00 6e 00 53 00 75 00   ␀"␀s␀c␀a␀n␀S␀u␀
6d 00 6d 00 61 00 72 00 79 00 22 00 3a 00 20 00  m␀m␀a␀r␀y␀"␀:␀ ␀

But PowerShell will blindly assume a single-byte encoding for the program's output stream, and encode that as UTF-16 again, including the NUL bytes which actually belong to UTF-16 characters:

ff fe 0d 00 0a 00 00 00 0d 00 0a 00 00 00 7b 00  ÿþ.␀.␀␀␀.␀.␀␀␀{␀
00 00 0d 00 0a 00 00 00 0d 00 0a 00 00 00 20 00  ␀␀.␀.␀␀␀.␀.␀␀␀ ␀
00 00 20 00 00 00 20 00 00 00 20 00 00 00 22 00  ␀␀ ␀␀␀ ␀␀␀ ␀␀␀"␀

Here we see the UTF-16 BOM (ff fe), followed by a real NUL character (00 00) inserted at every spot where there was a NUL byte in the original output. The exception is line breaks, which is why we still see the regular \r\n (0d 00 0a 00). For example, a space (20 00 in UTF-16) becomes 20 00 00 00 and appears as a space plus a NUL in a text editor, as you have seen in Notepad.
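The double encoding described above can be reproduced without the scanner. The following is a minimal sketch, not the actual PowerShell internals: it assumes Latin-1 (code page 28591) as a stand-in for whatever single-byte console code page is active, and the sample text `{ "a"` is made up for illustration.

```powershell
# Bytes as the program writes them: UTF-16LE, two bytes per character.
$original = [System.Text.Encoding]::Unicode.GetBytes('{ "a"')

# Misread those bytes one at a time with a single-byte encoding.
# Latin-1 is an assumption standing in for the console's code page.
$singleByte = [System.Text.Encoding]::GetEncoding(28591)
$misread = $singleByte.GetString($original)

# Every 00 byte is now a real NUL character inside the string,
# so the string holds 10 characters instead of 5.
$misread.Length

# Writing the misread string back out as UTF-16LE doubles the NULs.
$reencoded = [System.Text.Encoding]::Unicode.GetBytes($misread)
$originalHex  = ($original  | ForEach-Object { '{0:x2}' -f $_ }) -join ' '
$reencodedHex = ($reencoded | ForEach-Object { '{0:x2}' -f $_ }) -join ' '
$originalHex     # 7b 00 20 00 22 00 61 00 22 00
$reencodedHex    # 7b 00 00 00 20 00 00 00 22 00 00 00 61 00 00 00 22 00 00 00
```

The second hex line shows the same `xx 00 00 00` pattern as the dump of the file written from PowerShell.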

This is of course horrible.

Your options are:

  • Run Log4jScanner.exe from cmd.exe
  • Remove the excess NUL characters from the output before parsing it
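The first option could look roughly like the sketch below. It assumes that Log4jScanner.exe sits in the current directory and that the file written by cmd.exe is UTF-16LE without a BOM, as the hex dump above suggests; output.json is just an example file name, and this has not been tested against the scanner itself.

```powershell
# Let cmd.exe handle the redirection so the program's UTF-16 bytes
# reach the file untouched by PowerShell.
cmd /c "Log4jScanner.exe /scan /report_pretty > output.json"

# The file has no BOM, so state the encoding explicitly.
# In PowerShell, "Unicode" means UTF-16LE.
$data = Get-Content -Path .\output.json -Encoding Unicode -Raw | ConvertFrom-Json

# Example access to a value from the parsed report.
$data.scanSummary.scanStatus
```

Reading the file with -Raw hands ConvertFrom-Json a single string rather than an array of lines.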

Removing the NUL characters would go like this:

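# Capture the mangled output (an array of lines that contain stray NUL characters)
$json = Log4jScanner.exe /scan_directory C:\something /report_pretty
# Strip every NUL character from each line, then parse what is left as JSON
$data = $json.Replace(([char]0).ToString(), "") | ConvertFrom-Json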

.NET strings can legally contain NUL characters (C strings, for example, cannot), but there is no legal NUL character in the JSON output we expect from the program, which is why throwing them all out works. It is certainly not pretty, though, and it only works for program output that contains nothing outside the ASCII range (which happens to be the case here: all the characters in the JSON are ASCII).
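A further possibility, not among the options listed above, is to tell PowerShell how to decode the program's output instead of repairing it afterwards. PowerShell decodes the output of external programs using [Console]::OutputEncoding, so temporarily switching that to UTF-16LE should avoid the stray NULs in the first place. This is a minimal sketch under the assumption that the scanner writes UTF-16LE to stdout; it has not been tested against Log4jScanner.exe itself.

```powershell
# Remember the current console encoding so it can be restored afterwards.
$previous = [Console]::OutputEncoding
try {
    # Decode the program's stdout as UTF-16LE ("Unicode" in .NET terms).
    [Console]::OutputEncoding = [System.Text.Encoding]::Unicode
    $lines = .\Log4jScanner.exe /scan /report_pretty
}
finally {
    [Console]::OutputEncoding = $previous
}

# Join the captured lines into one string before parsing.
$data = ($lines -join "`n") | ConvertFrom-Json
$data.scanSummary.vulnerabilitiesFound
```

Unlike stripping NULs, this approach would also keep non-ASCII characters intact if the report ever contained any.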
