php - Parsing Chinese text to json

Time:10-24

The text file I have looks like this:

紐約建築藝術 陳偉銘 藝術 2016/02/15 在館內
人體百科全書 蘇煥文 科學 2017/09/30 已借出
塞納河畔 葉國威 文學 2017/09/25 已預約
性別與教育 陳文輝 社會學 2016/10/12 已借出
台灣當代社會變革 林東興 社會學 2014/04/17 已借出

and I want to output a JSON file that looks like this:

{
"books":[
 {"title":"紐約建築藝術", "author":"陳偉銘", "type":"藝術",
"publishDate":"2016/02/15", "status":"在館內"},
 {"title":"人體百科全書", "author":"蘇煥文", "type":"科學",
"publishDate":"2017/09/30", "status":"已借出"},
 {"title":"塞納河畔", "author":"葉國威", "type":"文學",
"publishDate":"2017/09/25", "status":"已預約"},
 {"title":" ... ", "author":" ... ", "type":" ... ",
"publishDate":"...", "status":"..."}
]}

and this is my code :

<?php
// Open the file to read data.
$fh = fopen('Book.txt', 'r');
// Define an empty array.
$data = array();
// read data
while ($line = fgets($fh)) {
    if (trim($line) != '') {
        $line_data = explode('  ', $line);
        $data[] = array('title' => trim($line_data[0]), 'author' => trim($line_data[1]), 'type' => trim($line_data[2]), 'publishDate' => trim($line_data[3]), 'status' => trim($line_data[4]));
    }
}
fclose($fh);
echo $json_data = json_encode($data);
?>

It then outputs the following (I'm not sure how to turn the Unicode escapes back into Chinese):

[{"title":"\ufeff\u7d10\u7d04\u5efa\u7bc9\u85dd\u8853","author":"\u9673\u5049\u9298","type":"\u85dd\u8853","publishDate":"2016\/02\/15","status":"\u5728\u9928\u5167"},{"title":"\u6a5f\u5668\u5b78\u7fd2-\u4f7f\u7528Python\u8a9e\u8a00","author":"\u5f35\u82b3\u6797","type":"\u5de5\u7a0b","publishDate":"2018\/03\/29","status":"\u5728\u9928\u5167"},{"title":"\u53f0\u7063\u7576\u4ee3\u793e\u6703\u8b8a\u9769","author":"\u6797\u6771\u8208","type":"\u793e\u6703\u5b78","publishDate":"2014\/04\/17","status":"\u5df2\u501f\u51fa"}]

CodePudding user response:

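By default, json_encode escapes every non-ASCII character as a \uXXXX sequence, and that includes Chinese characters. The escaped output is still valid JSON and decodes to the same text. To write the characters literally, pass the JSON_UNESCAPED_UNICODE flag to json_encode: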

json_encode($data, JSON_UNESCAPED_UNICODE);
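
As a quick check that only the text representation changes, here is a minimal sketch using one sample record from the question. It assumes the script itself is saved as UTF-8; the variable names are illustrative only.

```php
<?php
// Sample record taken from the question's data.
$data = array('title' => '塞納河畔', 'status' => '已預約');

// Default behaviour: non-ASCII characters become \uXXXX escapes.
$escaped = json_encode($data);

// With the flag: characters are written as literal UTF-8.
$unescaped = json_encode($data, JSON_UNESCAPED_UNICODE);

echo $escaped, PHP_EOL;
echo $unescaped, PHP_EOL;   // {"title":"塞納河畔","status":"已預約"}

// Both strings decode back to the same array.
var_dump(json_decode($escaped, true) === json_decode($unescaped, true));   // bool(true)
```

Any JSON parser reads both forms identically, so the flag matters mainly when people need to read the file directly.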

To pretty-print the output across multiple lines, combine that flag with JSON_PRETTY_PRINT using the bitwise OR operator:

json_encode($data, JSON_UNESCAPED_UNICODE | JSON_PRETTY_PRINT);
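
Putting the pieces together, the following is a rough sketch of one way to get closer to the layout the question asks for. It is not the asker's code: parseBookLines is a hypothetical helper, and it assumes the five fields are separated by whitespace and contain no spaces themselves. It also strips the UTF-8 byte order mark that shows up as \ufeff in the question's output, wraps the records under a "books" key, and adds JSON_UNESCAPED_SLASHES so dates print as 2016/02/15 rather than 2016\/02\/15.

```php
<?php
// Hypothetical helper: turns "title author type date status" lines into records.
// Assumes whitespace-separated fields that contain no spaces themselves.
function parseBookLines(array $lines): array
{
    $books = array();
    foreach ($lines as $line) {
        // Drop a UTF-8 byte order mark if present, then surrounding whitespace.
        $line = trim(preg_replace('/^\xEF\xBB\xBF/', '', $line));
        if ($line === '') {
            continue;
        }
        $fields = preg_split('/\s+/u', $line);
        if ($fields === false || count($fields) !== 5) {
            continue; // skip lines that do not have exactly five fields
        }
        $books[] = array_combine(
            array('title', 'author', 'type', 'publishDate', 'status'),
            $fields
        );
    }
    return $books;
}

// Two sample lines from the question; in practice use file('Book.txt').
$lines = array(
    "\xEF\xBB\xBF紐約建築藝術 陳偉銘 藝術 2016/02/15 在館內\n",
    "人體百科全書 蘇煥文 科學 2017/09/30 已借出\n",
);

$json = json_encode(
    array('books' => parseBookLines($lines)),
    JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES | JSON_PRETTY_PRINT
);
echo $json, PHP_EOL;
```

If a title can contain spaces, splitting on whitespace will not work and the file would need a stricter delimiter such as a tab.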