PHP: How to configure windows shell codepage so that STDOUT of proc_open() does not garble?

Time:03-18

The following PHP code

$descriptorspec = array(1=>array('pipe', 'w'));
$cmd = escapeshellcmd('echo こんにちは');

// Change this line in the following snippets
$proc = proc_open($cmd, $descriptorspec, $pipes);

$res = null;

if (is_resource($proc))
{
    $res = stream_get_contents($pipes[1]);
    proc_close($proc);
}

echo $res;

outputs the gibberish $res = ����ɂ���.

I tried this solution, which passes environment variables through the $env_vars parameter of proc_open(). That is, I replaced the proc_open() line of the above snippet with

$encoding = array('LANG'=>'ja_JP.utf-8');
$proc = proc_open($cmd, $descriptorspec, $pipes, null, $encoding);

This still outputs the garbled string $res = ����ɂ���.

Next, I tried to use setlocale() and putenv() as per this solution. The proc_open() line of the first snippet becomes

$encoding = 'ja_JP.UTF-8';
setlocale(LC_ALL, $encoding);
putenv('LC_ALL='. $encoding);

$proc = proc_open($cmd, $descriptorspec, $pipes);

Still outputs the gibberish $res = ����ɂ���...

Do you know what's wrong with the shell encoding config?

As a side note, I'm currently debugging my code on Visual Studio 2022 with IIS Express (Win10/11), but will eventually deploy my website on an Apache server.

Additional info:

  • I use IIS Express launched from Visual Studio 2022 and an external WAMPServer as debugging servers. Both output garbled results.
  • The version of PHP is 8.1.
  • The OS is Windows 11 Enterprise.
  • My PHP file is correctly saved in UTF-8 (without BOM).
  • Important note 1: the original code works in the PHP interactive shell opened from PowerShell 7.2 (64-bit).
  • Important note 2: the original code works on another computer using Windows 10 Home.

CodePudding user response:

TL;DR

Use the PHP function sapi_windows_cp_conv as follows.

$res = stream_get_contents($pipes[1]);
$res = sapi_windows_cp_conv(sapi_windows_cp_get('ansi'), 65001, $res);

Long Answer

This solution is based on this SO answer. PHP runs the command through the default Windows command shell (cmd.exe, pwsh.exe, ...), whose code page might be set to an ANSI code page instead of UTF-8. The bytes read from the pipe are then encoded in that code page, so they look garbled when PHP outputs them as UTF-8.

First, try to modify the default code page of cmd.exe following these steps. If encoding issues persist, you might need to look at the next alternative.
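A related, per-command variant (not the linked procedure) is to switch the code page inside the command itself with chcp. The sketch below is only an assumption-laden illustration: with_utf8_codepage() is a hypothetical helper name, it assumes the command ends up being run by cmd.exe, and chcp may not take effect when no console is attached (for example under some web servers).

```php
<?php
// Hypothetical helper: prefix a command so that cmd.exe switches its code
// page to UTF-8 (65001) before running it. Assumes the shell is cmd.exe.
function with_utf8_codepage(string $cmd): string
{
    // ">NUL" keeps chcp's own status message out of the captured STDOUT.
    return 'chcp 65001 >NUL & ' . $cmd;
}

// Escape the original command first; escaping the combined string would
// also escape the ">" and "&" that the prefix relies on.
$cmd = with_utf8_codepage(escapeshellcmd('echo こんにちは'));
```

The resulting $cmd can be passed to proc_open() in place of the original one. If the output is still garbled, converting the bytes in PHP is the more robust route.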

To convert from one code page to another directly in PHP, use sapi_windows_cp_conv(sapi_windows_cp_get($kind), 65001, $res), where 65001 is the code page identifier for UTF-8 (see chcp). Please refer to the sapi_windows_cp_conv documentation here. Note that $kind needs to be specified as 'ansi' or 'oem' as per the documentation, depending on which code page the shell output actually uses.

EDIT: To set the ANSI/OEM code page of cmd/PowerShell to UTF-8 at the system level, check out these steps.
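The conversion can be wrapped in a small function so the same code also runs on the Apache server you deploy to. This is a minimal sketch: shell_output_to_utf8() is a hypothetical name, and the fallback behaviour (returning the bytes unchanged when the Windows-only functions are missing or the conversion fails) is my assumption, not something the documentation prescribes.

```php
<?php
/**
 * Hypothetical helper: convert raw shell output to UTF-8, using the Windows
 * code page of the given kind ('ansi' or 'oem') as the source encoding.
 */
function shell_output_to_utf8(string $raw, string $kind = 'ansi'): string
{
    // The sapi_windows_cp_* functions only exist in Windows builds of PHP.
    if (!function_exists('sapi_windows_cp_conv')) {
        return $raw;
    }

    $source = sapi_windows_cp_get($kind);
    if ($source === 65001) {
        return $raw; // the code page is already UTF-8, nothing to convert
    }

    // sapi_windows_cp_conv() returns null on failure; keep the raw bytes then.
    return sapi_windows_cp_conv($source, 65001, $raw) ?? $raw;
}
```

In the original snippet it would be called as $res = shell_output_to_utf8(stream_get_contents($pipes[1]));. If 'ansi' still gives garbled text, try 'oem' as the second argument.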
