Problems converting wide string to multibyte in c

Time:11-25

I'm trying to convert a path from a wstring to a string, but when I do, it outputs garbage. Is there a way to get these Japanese characters into a string?

wstring foo = L"C:\\projects\\サービス\\b2";
char bar[256] = { 0 };
string baz( bar );

int len = WideCharToMultiByte( CP_UTF8, 0, foo.c_str(), -1, NULL, 0, 0, 0 );
WideCharToMultiByte( CP_UTF8, 0, foo.c_str(), -1, &bar[0], len, 0, 0 );

When debugging on Windows (VS2019), baz contains: "C:\projects\サービス\b2"

[Screenshot: debugging view]

CodePudding user response:

As @Codo explained, the data was actually correct. And as @Richard Critten tried to explain, I needed to check the hex values, not just the decimal ones.
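
Checking the hex can also be done outside the debugger. The following is a minimal sketch, not code from the original post: the helper name `to_hex` is made up for illustration. It dumps every byte of a string as hexadecimal so the result can be compared against the expected UTF-8 bytes, regardless of how a viewer decodes them.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: returns the bytes of s as space-separated,
// two-digit uppercase hex values.
std::string to_hex(const std::string& s)
{
    std::string out;
    char buf[4];
    for (std::size_t i = 0; i < s.size(); ++i)
    {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        const unsigned byte = static_cast<unsigned char>(s[i]);
        std::snprintf(buf, sizeof buf, "%02X", byte);
        if (i != 0)
        {
            out += ' ';
        }
        out += buf;
    }
    return out;
}
```

If the conversion produced correct UTF-8, the サービス part of the path shows up as E3 82 B5 E3 83 BC E3 83 93 E3 82 B9.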

For others encountering similar problems: I wanted the conversion so that I could later test whether the directory exists. Here's how I did that:

I was using stat, which didn't accept the UTF-8 ("unicode") string, so I had to convert back to a wide string using MultiByteToWideChar and call the wide-string version, _wstat.

So, assuming path is a UTF-8 encoded char string:

// Query the required length in wchar_t units (includes the terminating null).
int wchars_num = MultiByteToWideChar( CP_UTF8, 0, path, -1, NULL, 0 );
std::wstring wpath( wchars_num, L'\0' ); // owns the buffer, so nothing leaks
MultiByteToWideChar( CP_UTF8, 0, path, -1, &wpath[0], wchars_num );

struct _stat64i32 wInfo;
int statRC = _wstat( wpath.c_str(), &wInfo );
if ( statRC != 0 )
{
    if ( errno == ENOENT ) { return false; } // something along the path does not exist
    if ( errno == ENOTDIR ) { return false; } // something in path prefix is not a dir
    return false;
}
else if ( wInfo.st_mode & S_IFDIR )
{
    return true;
}
return false; // the path exists but is not a directory
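
As a possible alternative to the stat-based check above, here is a minimal sketch that uses the C++17 `std::filesystem` library instead of `_wstat`. This is a swapped-in technique, not what the answer used, and the helper name `is_existing_directory` is hypothetical. It assumes the path is held as UTF-8 in a `std::string`.

```cpp
#include <filesystem>
#include <string>
#include <system_error>

// Hypothetical helper: true if the UTF-8 encoded path names an existing directory.
bool is_existing_directory(const std::string& utf8_path)
{
    namespace fs = std::filesystem;
    std::error_code ec; // non-throwing overload: a failed status query yields false
    // u8path interprets the bytes as UTF-8. It is deprecated since C++20,
    // where a std::u8string can be passed to fs::path directly.
    return fs::is_directory(fs::u8path(utf8_path), ec);
}
```

With this approach the library performs the conversion to the native wide path on Windows, so no explicit MultiByteToWideChar call is needed at the call site.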

CodePudding user response:

You are converting properly, but not viewing the string correctly in the debugger. The default when viewing the char* in Visual Studio is to assume an ANSI encoding specific to the localized version of Windows used. In Western European/US Windows, that would be Windows-1252.

At least as of Visual Studio 2015, use the s8 format code in the Watch Window to view UTF-8-encoded strings:

[Screenshot: main() with the OP's code, viewing variables with and without the s8 format code in the Watch window]

See Format specifiers in the debugger

CodePudding user response:

You are not allocating any memory for the target string. You are creating the string from an empty char[] first, THEN you are converting into that same char[], which will NOT update the string automatically.
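
The ordering problem can be shown in isolation. This is a small sketch in which `std::strcpy` stands in for the conversion call and the function name `build_before_and_after` is made up for illustration: a `std::string` copies the buffer at construction time, so later writes to the buffer do not reach it.

```cpp
#include <cstring>
#include <string>
#include <utility>

// Returns the string built before the buffer was filled (first)
// and the string built after it was filled (second).
std::pair<std::string, std::string> build_before_and_after()
{
    char bar[16] = { 0 };
    std::string before(bar);  // copies the buffer now, while it is still empty
    std::strcpy(bar, "abc");  // stand-in for WideCharToMultiByte writing into bar
    std::string after(bar);   // copies the filled buffer
    return { before, after };
}
```

Here `before` stays empty while `after` holds the converted text, which is why the string in the question has to be built after the conversion, or written into directly.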

You don't need the char[] at all. Try this instead:

wstring foo = L"C:\\projects\\サービス\\b2";
string baz;

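// Pass the explicit length so the result carries no embedded null terminator.
int len = WideCharToMultiByte( CP_UTF8, 0, foo.c_str(), static_cast<int>( foo.size() ), NULL, 0, NULL, NULL );
baz.resize( len );
WideCharToMultiByte( CP_UTF8, 0, foo.c_str(), static_cast<int>( foo.size() ), &baz[0] /* or baz.data() in C++17 and later */, len, NULL, NULL );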