I am trying to decode a UTF-8 encoded JSON string with Cpanel::JSON::XS:
use strict;
use warnings;
use open ':std', ':encoding(utf-8)';
use utf8;
use Cpanel::JSON::XS;
use Data::Dumper qw(Dumper);
my $str = '{ "title": "Outlining — How to outline" }';
my $hash = decode_json $str;
#my $hash = Cpanel::JSON::XS->new->utf8->decode_json( $str );
print Dumper($hash);
but this throws an exception at decode_json:
Wide character in subroutine entry
I also tried Cpanel::JSON::XS->new->utf8->decode_json( $str ) (see the commented-out line), but this gives another error:
malformed JSON string, neither tag, array, object, number, string or atom, at character offset 0 (before "(end of string)")
What am I missing here?
CodePudding user response:
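decode_json expects UTF-8 encoded bytes, but you are providing decoded text (a string of Unicode Code Points). With use utf8 in effect, the string literal is decoded when the source is parsed, so the em dash becomes the single character U+2014, which does not fit in a byte; hence the "Wide character" error.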
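To make the difference between decoded text and UTF-8 bytes concrete, here is a small sketch that uses only the core Encode module. It assumes the script file is saved as UTF-8; the variable names are illustrative and not taken from the question.

```perl
use strict;
use warnings;
use utf8;                           # string literals below are decoded to characters
use Encode qw( encode_utf8 );

my $chars = 'Outlining — How';      # character string; the em dash is U+2014
my $bytes = encode_utf8($chars);    # byte string; the em dash becomes three bytes

# Largest ordinal in each string: above 255 means "wide character".
my ($max_char) = sort { $b <=> $a } map { ord } split //, $chars;
my ($max_byte) = sort { $b <=> $a } map { ord } split //, $bytes;

printf "chars: length %d, max ordinal U+%04X\n", length $chars, $max_char;
printf "bytes: length %d, max ordinal 0x%02X\n", length $bytes, $max_byte;
```

The character string contains an ordinal above 255, while every element of the encoded string fits in a byte, which is the form decode_json expects.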
Use
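use utf8;
use Cpanel::JSON::XS;
use Encode qw( encode_utf8 );
# The literal is decoded text (because of "use utf8"), so encode it to UTF-8 bytes first.
my $json_utf8 = encode_utf8( '{ "title": "Outlining — How to outline" }' );
my $data = decode_json( $json_utf8 );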
or
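use utf8;
use Cpanel::JSON::XS;
# "no utf8" keeps the literal as the raw bytes of the source file, which must be saved as UTF-8.
my $json_utf8 = do { no utf8; '{ "title": "Outlining — How to outline" }' };
my $data = decode_json( $json_utf8 );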
or
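use utf8;
use Cpanel::JSON::XS;
# Without ->utf8, decode expects decoded text (Unicode Code Points), not bytes.
my $json_ucp = '{ "title": "Outlining — How to outline" }';
my $data = Cpanel::JSON::XS->new->decode( $json_ucp ); # Implied: ->utf8(0)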
(The middle one seems hackish to me. The first one is useful if you get data from multiple sources and the other sources provide it already encoded.)
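Putting the options side by side, the following is a sketch of the question's script with the decoding step corrected. It assumes Cpanel::JSON::XS is installed and the file is saved as UTF-8; the variable names are illustrative. Note that the object method is decode, not decode_json. Calling decode_json as a method passes the object itself as the first argument, which is probably why the second attempt in the question reported malformed JSON.

```perl
use strict;
use warnings;
use utf8;
use open ':std', ':encoding(UTF-8)';
use Encode qw( encode_utf8 );
use Cpanel::JSON::XS;

my $json_text = '{ "title": "Outlining — How to outline" }';   # decoded characters

# Functional interface: expects UTF-8 bytes, so encode first.
my $from_bytes = decode_json( encode_utf8($json_text) );

# OO interface without ->utf8: expects characters, so pass the text as is.
my $from_chars = Cpanel::JSON::XS->new->decode($json_text);

# OO interface with ->utf8: expects UTF-8 bytes, like decode_json.
my $from_oo_bytes = Cpanel::JSON::XS->new->utf8->decode( encode_utf8($json_text) );

# All three hold the same decoded title.
print "$_->{title}\n" for $from_bytes, $from_chars, $from_oo_bytes;
```

In all three cases the resulting hash holds decoded text, so the output layer set up by use open is what encodes it again when printing.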