I'm trying to convert a path from a wstring to a string, but when I do, it outputs garbage. Is there a way to get these Japanese characters into a string?
wstring foo = L"C:\\projects\\サービス\\b2";
char bar[256] = { 0 };
string baz( bar );
int len = WideCharToMultiByte( CP_UTF8, 0, foo.c_str(), -1, NULL, 0, 0, 0 );
WideCharToMultiByte( CP_UTF8, 0, foo.c_str(), -1, &bar[0], len, 0, 0 );
When debugging on Windows (VS2019), baz contains: "C:\projects\サービス\b2"
CodePudding user response:
As @Codo explained, it was actually the correct data. And as @Richard Critten tried to explain, I needed to check the hex values, not just the decimals.
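One way to check the hex without relying on how the debugger renders the string is to dump the raw bytes. The following is only a minimal sketch: to_hex is a hypothetical helper name, not something from the original code, and it assumes the converted data is held in a std::string.

```cpp
#include <string>

// Hypothetical helper (not part of the original post): formats every byte
// of s as two uppercase hex digits, separated by single spaces.
std::string to_hex( const std::string &s )
{
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    for ( std::size_t i = 0; i < s.size(); ++i )
    {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        unsigned char c = static_cast<unsigned char>( s[i] );
        if ( i != 0 ) { out += ' '; }
        out += digits[c >> 4];   // high nibble
        out += digits[c & 0x0F]; // low nibble
    }
    return out;
}
```

For example, to_hex( "\xE3\x82\xB5" ) yields "E3 82 B5", which is the UTF-8 encoding of サ (U+30B5). If the dump of the converted path shows three-byte sequences like this for each Japanese character, the conversion worked and only the display is misleading.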
For others encountering similar problems: I wanted to do this so that I could later test whether the directory exists. Here's how I did that:
Since I was using stat, it didn't like the "unicode" (UTF-8) string, so I had to convert back to a wide string using MultiByteToWideChar and call the wide-character version, _wstat.
So, assuming path is a null-terminated UTF-8 char string:
int wchars_num = MultiByteToWideChar( CP_UTF8, 0, path, -1, NULL, 0 ); // size in wchar_t, including the terminating null
if ( wchars_num == 0 ) { return false; } // conversion failed
wstring wpath( wchars_num, L'\0' ); // owns the buffer, so there is nothing to delete
MultiByteToWideChar( CP_UTF8, 0, path, -1, &wpath[0], wchars_num );
struct _stat64i32 wInfo;
int statRC = _wstat( wpath.c_str(), &wInfo );
if ( statRC != 0 )
{
    if ( errno == ENOENT ) { return false; } // something along the path does not exist
    if ( errno == ENOTDIR ) { return false; } // something in path prefix is not a dir
    return false;
}
else if ( wInfo.st_mode & S_IFDIR )
{
    return true;
}
return false; // the path exists but is not a directory
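If C++17 is available, the same check can be sketched without a manual round trip through MultiByteToWideChar. This is an alternative technique, not what the code above does: it uses std::filesystem, and is_existing_directory is a hypothetical helper name. Note that std::filesystem::u8path is deprecated as of C++20, so treat this as an illustration under the assumption of a C++17 toolchain.

```cpp
#include <filesystem>
#include <string>
#include <system_error>

// Hypothetical helper: returns true if utf8_path names an existing directory.
// Assumes utf8_path holds UTF-8 encoded bytes.
bool is_existing_directory( const std::string &utf8_path )
{
    // u8path interprets the bytes as UTF-8 and converts them to the
    // platform's native path encoding (wide characters on Windows).
    const std::filesystem::path p = std::filesystem::u8path( utf8_path );

    // The error_code overload does not throw; it returns false on any
    // error, including a path that does not exist.
    std::error_code ec;
    return std::filesystem::is_directory( p, ec );
}
```

Calling is_existing_directory( baz ) with the UTF-8 string from the question would then answer the "does this directory exist" question directly.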
CodePudding user response:
You are converting properly, but not viewing the string correctly in the debugger. The default when viewing a char* in Visual Studio is to assume an ANSI encoding specific to the localized version of Windows used. In Western European/US Windows, that would be Windows-1252.
At least as of Visual Studio 2015, use the s8 format code in the Watch Window to view UTF-8-encoded strings (for example, bar,s8).
See Format specifiers in the debugger
CodePudding user response:
You are not allocating any memory for the target string. You are creating the string from an empty char[] first, THEN you are converting into that same char[], which will NOT update the string automatically.
You don't need the char[] at all. Try this instead:
wstring foo = L"C:\\projects\\サービス\\b2";
string baz;
int len = WideCharToMultiByte( CP_UTF8, 0, foo.c_str(), foo.size(), NULL, 0, 0, 0 );
baz.resize(len);
WideCharToMultiByte( CP_UTF8, 0, foo.c_str(), foo.size(), &baz[0]/*or: baz.data()*/, len, 0, 0 );