PHP read file with Japanese contents

Time:06-26

I'm writing a PHP script in which I need to read data from a CSV file, and some of the contents are written in Japanese. However, I can't get the data to read or display correctly at all.

The file I'm reading is encoded in the iso-8859-1 charset. I also tried using iconv to convert it to a UTF-8 encoded file; however, doing that seemed to break the data in the file entirely, and the text wouldn't display correctly in any application afterwards.

Here's the script I'm using right now:

<?php 
    header("Content-Type: text/html; charset=ISO-8859-1"); 
    setlocale(LC_ALL, 'ja_JP.EUC-JP'); 
?>

<!DOCTYPE html>
<html lang="en">
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Document</title>
</head>
<body>
    <?php

        $row = 1;

        if (($handle = fopen("/srv/http/Japanese/testFile.csv", "r")) !== FALSE) {
            while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
                $row++;
                for ($i = 0; $i < 4; $i++) {
                    echo $data[$i] . "<br />";
                }
                echo "<br />";
                if ($row > 1000) break;
            }
            fclose($handle);
        } else echo print_r(error_get_last(),true);
    ?>
</body>
</html>

The first two lines of PHP were added to try to fix the issue, but they haven't helped.

The output for a string in the file reading 引き込む, 762, 762, 7122 comes out looking like this:

°ú¤­¹þ¤à
762
762
7122

Also, it doesn't seem to be an issue solely with the display of the data. I also tried testing the data with if ($data[$i] == "引き込む") and it evaluates to false even when I know that's the string being read.

I've also tried using other means of reading files, however no matter which PHP method I'm using to read the file I seem to get the exact same issue.

Any help would be greatly appreciated.

CodePudding user response:

I wanted to comment but I don't have enough points, so please forgive me if my answer is incorrect.

From what I can find on Google and Stack Overflow, this seems to be a solution; you just have to fit it into your code.

This code

setlocale(LC_ALL, 'ja_JP');
$data = array_map('str_getcsv', file('japanese.csv'));
var_dump($data);

works with the following CSV file (japanese.csv, saved in UTF-8) on my local machine.

日本語,テスト,ファイル
2行目,CSV形式,エンコードUTF-8

The results are

array(2) {
  [0]=>
  array(3) {
    [0]=>
    string(9) "日本語"
    [1]=>
    string(9) "テスト"
    [2]=>
    string(12) "ファイル"
  }
  [1]=>
  array(3) {
    [0]=>
    string(7) "2行目"
    [1]=>
    string(9) "CSV形式"
    [2]=>
    string(20) "エンコードUTF-8"
  }
}

This might help you understand more: Link to other post
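
Note that the snippet above works because japanese.csv is saved in UTF-8. One way to check whether a file really is UTF-8 is mb_check_encoding(). The sketch below is only an illustration: it assumes the mbstring extension is loaded and that the script itself is saved as UTF-8, and isUtf8() is a made-up helper name. It encodes the sample string from the question as EUC-JP to stand in for a non-UTF-8 file, which is an assumption based on the bytes visible in the garbled output.

```php
<?php
// Hypothetical helper: true if the raw bytes form valid UTF-8.
function isUtf8(string $bytes): bool
{
    return mb_check_encoding($bytes, 'UTF-8');
}

// Sample string from the question (this script is assumed to be saved as UTF-8).
$utf8Bytes = "引き込む";

// Assumption: the CSV is really EUC-JP, so build the same text in that encoding.
$eucBytes = mb_convert_encoding($utf8Bytes, 'EUC-JP', 'UTF-8');

var_dump(isUtf8($utf8Bytes)); // bool(true)
var_dump(isUtf8($eucBytes));  // bool(false)

// The same characters are stored as different bytes, so == between the
// EUC-JP data and a UTF-8 string literal is false.
var_dump($eucBytes === $utf8Bytes); // bool(false)
echo bin2hex($eucBytes), "\n";      // b0faa4adb9fea4e0
```

Read as ISO-8859-1, the bytes b0 fa a4 ad b9 fe a4 e0 are exactly the characters °ú¤­¹þ¤à shown in the question, which is why the file looks like a single-byte file even though it probably isn't one.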

CodePudding user response:

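You need to either convert the CSV file with iconv to EUC-JP (and set the charset value in the header and meta tag to EUC-JP too), or convert the CSV to UTF-8 and set the charset to UTF-8. Note that ja_JP.EUC-JP and ja_JP.UTF-8 are locale names for setlocale(); the charset values themselves are EUC-JP and UTF-8.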
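
A minimal sketch of the second option, assuming the file is actually EUC-JP (the garbled output in the question suggests so, but check your own file) and that the mbstring extension is available. readCsvAsUtf8() is a hypothetical helper, and the temporary file only stands in for testFile.csv so the example runs on its own.

```php
<?php
// Hypothetical helper: read a CSV stored in $sourceEncoding and
// return its rows with every field converted to UTF-8.
function readCsvAsUtf8(string $path, string $sourceEncoding = 'EUC-JP'): array
{
    $raw = file_get_contents($path);
    if ($raw === false) {
        throw new RuntimeException("Cannot read $path");
    }

    // Convert the whole file once, then parse the UTF-8 text.
    $utf8 = mb_convert_encoding($raw, 'UTF-8', $sourceEncoding);

    $rows = [];
    // Simple line split; fields containing embedded newlines are not handled.
    foreach (preg_split('/\r\n|\n|\r/', $utf8) as $line) {
        if ($line === '') {
            continue; // skip blank lines
        }
        $rows[] = str_getcsv($line, ',', '"', '\\');
    }
    return $rows;
}

// Demo: write the row from the question as EUC-JP to a temporary file.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, mb_convert_encoding("引き込む,762,762,7122\n", 'EUC-JP', 'UTF-8'));

$rows = readCsvAsUtf8($path);
unlink($path);

var_dump($rows[0][0] === "引き込む"); // bool(true)
```

In the page itself you would then send header("Content-Type: text/html; charset=UTF-8") and use the same charset in the meta tag, so the browser decodes the output the same way PHP produced it.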
