PHP json_decode not working as expected if there is \x14

Time:10-04

I wanted to json_decode the variable $string, but it returned nothing. Here is the code:

$string = '[{"name":"hello", "weird":"\x14"}]'; // this does not work: \x14 causes the problem
$json = json_decode($string, true);
echo "<pre>";
print_r($json);

but

$string = '[{"name":"hello", "weird":"something normal"}]'; // this is working

So I tried to echo \x14 in an online PHP playground, and it turned out to be a square box:

<?php
$ReceivedData = "\x14";
echo $ReceivedData; // prints a square box (the control character 0x14)

How can I json_decode this string?

CodePudding user response:

As @svgta says, the format isn't correct. JSON has no \x escape; the character is normally written as "\u0014", a Unicode escape. You can verify this by encoding the value with PHP's json_encode() function and looking at what it produces.

The test case in PHP:

<?php

$var = [
    (object)[
        'name' => 'hello',
        'weird' => "\x14", // PHP syntax https://www.php.net/manual/en/language.types.string.php#:~:text=g. "\400" === "\000")-,\x,-[0-9A-Fa
    ]
];

print '$var = ' . var_export($var, true) . "\n";

$json = json_encode($var);

print '$json = ' . $json . "\n";

$decoded_json = json_decode($json);

print '$decoded_json = ' . var_export($decoded_json, true) . "\n";

Output (the squares don't seem to display here but they do in the console):

$var = [
  0 => (object)[
     'name' => 'hello',
     'weird' => '',
  ],
]
$json = [{"name":"hello","weird":"\u0014"}]
$decoded_json = [
  0 => (object)[
     'name' => 'hello',
     'weird' => '',
  ],
]

You can run it online here to see for yourself: https://onlinephp.io/c/e2f69
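
To see why the original call returned nothing, it can help to ask PHP for the decoding error. The following is a minimal sketch, not taken from the original post; the variable names are made up for illustration. It assumes the input is either the four literal characters \x14 (single-quoted PHP string) or the raw control byte (double-quoted PHP string). Both are rejected by json_decode(), while the \u0014 form is accepted.

```php
<?php

// Single quotes: PHP keeps the backslash, so the JSON text holds the four characters \x14.
$literal = '[{"name":"hello", "weird":"\x14"}]';
var_dump(json_decode($literal, true)); // NULL: \x is not a JSON escape
print json_last_error_msg() . "\n";    // human-readable reason for the failure

// Double quotes: PHP turns \x14 into one raw control byte before JSON sees it.
$rawByte = "[{\"name\":\"hello\", \"weird\":\"\x14\"}]";
var_dump(json_decode($rawByte, true)); // NULL: raw control characters are not allowed in JSON strings
print json_last_error_msg() . "\n";

// The JSON escape \u0014 decodes to that same single byte.
$valid = '[{"name":"hello", "weird":"\u0014"}]';
$decoded = json_decode($valid, true);
var_dump($decoded[0]['weird'] === "\x14"); // bool(true)
```

Checking json_last_error_msg() right after a null result is usually the quickest way to tell a malformed payload apart from a payload that legitimately decodes to null.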

Solving your problem

You'll probably have to convert each \x14 to \u0014 before calling json_decode(). This can be done with the regular expression \\x([0-9A-Fa-f]{2}): https://regex101.com/r/4Rxh4u/1

In PHP code:

<?php

// The invalid JSON.
$json = '[{"name":"hello", "weird":"\\x14", "other_weird":"\\xab"}]';

// The regular expression to find all \x## values: \\x([0-9A-Fa-f]{2})
// A \x# with only one hex digit is not handled, but this could be improved.
$regexp = '/\\\\x([0-9A-Fa-f]{2})/';

// The replacement is \u00## (padded to 4 hex digits).
$subst = '\\\\u00$1';

$corrected_json = preg_replace($regexp, $subst, $json);

print "\$json = $json\n";
print "\$corrected_json = $corrected_json\n";

$var = json_decode($corrected_json);

print '$var = ' . var_export($var, true) . "\n";

Output:

$json = [{"name":"hello", "weird":"\x14", "other_weird":"\xab"}]
$corrected_json = [{"name":"hello", "weird":"\u0014", "other_weird":"\u00ab"}]
$var = [
  0 => (object)[
     'name' => 'hello',
     'weird' => '',
     'other_weird' => '«',
  ],
]

Run it here: https://onlinephp.io/c/35b8a
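
One caveat with the simple pattern: it also rewrites \x## when the backslash itself is escaped, as in "C:\\x14", which is valid JSON for a literal backslash followed by x14. The sketch below is one possible refinement, not part of the original answer. The names HEX_ESCAPE_PATTERN and fixHexEscapes are hypothetical, and it assumes the only defect in the input is the \x## escape.

```php
<?php

// Hypothetical pattern, written as a nowdoc so no backslash needs doubling for PHP:
//   (?<!\\)              the match does not start right after a backslash
//   ((?:\\\\)*)          any number of escaped backslashes (pairs), kept as they are
//   \\x([0-9A-Fa-f]{2})  the invalid \x## escape itself
const HEX_ESCAPE_PATTERN = <<<'REGEX'
/(?<!\\)((?:\\\\)*)\\x([0-9A-Fa-f]{2})/
REGEX;

// Hypothetical helper: rewrites \x## to \u00## unless the backslash is itself escaped.
function fixHexEscapes(string $json): string
{
    $result = preg_replace_callback(
        HEX_ESCAPE_PATTERN,
        function (array $m): string {
            // Keep the escaped backslashes, then emit the JSON form \u00##.
            return $m[1] . '\u00' . $m[2];
        },
        $json
    );

    // preg_replace_callback() returns null on a PCRE error; fall back to the input.
    return $result ?? $json;
}

// Nowdoc keeps backslashes literal: "weird" holds \x14, "path" holds \\x14.
$json = <<<'JSON'
[{"weird":"\x14", "path":"C:\\x14"}]
JSON;

$fixed = fixHexEscapes($json);
print $fixed . "\n";

$var = json_decode($fixed, true);
var_dump($var[0]['weird'] === "\x14");  // bool(true): one control byte
var_dump($var[0]['path'] === 'C:\x14'); // bool(true): literal backslash kept
```

The callback form avoids the extra layer of backslash escaping that a preg_replace() replacement string needs, which is why the replacement here is simply '\u00'.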
