Strange interaction between setlocale and mbstowcs

Time:10-31

I am seeing strange behaviour when I combine setlocale and mbstowcs.

Here is some sample code:

#include <cstdlib>
#include <iostream>
#include <clocale>
#include <string>

int main()
{
    std::setlocale(LC_CTYPE, "");
    char * cur_ctype_locale = std::setlocale(LC_CTYPE, NULL); // line 1
    std::cout << cur_ctype_locale << std::endl; // line 2
    std::string src = "éèùç";
    size_t result_size = std::mbstowcs(NULL, &src[0], 0);
    if (result_size == (size_t)-1)
    {
        std::cout << "failed" << std::endl;
        return 1;
    }
    std::wstring result;
    result.resize(result_size + 1);
    result_size = std::mbstowcs(&result[0], &src[0], result_size + 1);
    std::wcout << result << std::endl;
    return 0;
}

When executed (on Linux), the output is garbage.

When I remove the lines commented as "line 1" and "line 2", the output is correct (I see the string as defined in the source).

According to the documentation of setlocale:

If locale is NULL, the current locale is only queried, not modified.

The lines commented as "line 1" and "line 2" should only return the current locale for LC_CTYPE and not modify the locale.

Am I missing something here?

Thank you for your attention.

CodePudding user response:

I'm running Arch Linux, and this is the output of locale:

LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

And running this code:

#include <string.h>
#include <errno.h>
#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <clocale>
#include <string>

int main()
{
    std::setlocale(LC_CTYPE, "");
    char * cur_ctype_locale = std::setlocale(LC_CTYPE, NULL); // line 1
    std::cout << cur_ctype_locale << std::endl; // line 2
    std::string src = "éèùç";
    size_t result_size = std::mbstowcs(NULL, &src[0], 0);
    if (result_size == (size_t)-1)
    {
        std::cout << "failed" << std::endl;
        return 1;
    }
    std::wstring result;
    result.resize(result_size + 1);
    result_size = std::mbstowcs(&result[0], &src[0], result_size + 1);
    // size_t is unsigned, so a failure must be detected by comparing with (size_t)-1
    if (result_size == (size_t)-1) {
        std::cout << strerror(errno) << std::endl;
        printf("You error'd\n");
    } else {
        printf("You were fine...\n");
    }

    std::wcout << result << std::endl;
    return 0;
}

And the output:

en_US.UTF-8
You were fine...

Could you provide some more information, such as the output you get from running locale?

Other than that, as you say, calling setlocale with NULL only queries the current locale; it does not change it.
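One way to check that claim directly is to query twice and compare the results. The following is only a sketch: the helper name query_leaves_locale_unchanged is hypothetical, and it assumes nothing beyond the standard <clocale> and <string> headers.

```cpp
#include <clocale>
#include <string>

// Hypothetical helper: returns true if two consecutive queries of LC_CTYPE
// report the same locale name, i.e. the query itself changed nothing.
bool query_leaves_locale_unchanged()
{
    // Adopt the environment's locale, as in the question. If the environment
    // names a locale that is not installed, this fails and "C" stays active.
    std::setlocale(LC_CTYPE, "");

    const char *first = std::setlocale(LC_CTYPE, NULL);
    if (first == NULL)
        return false;
    // Copy immediately: a later setlocale call may overwrite the buffer.
    const std::string before(first);

    const char *second = std::setlocale(LC_CTYPE, NULL);
    return second != NULL && before == second;
}
```

If this returns true on the affected machine, the query is not what breaks the output, and the cause has to be somewhere else in the program.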

CodePudding user response:

std::cout <<
std::wcout <<

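It's either one or the other; you can't use both on the same stdout. The first output operation fixes the orientation of stdout (narrow or wide), and on glibc a later write of the other kind typically fails. Here "line 2" writes through std::cout first, so the later std::wcout output is broken. Without "line 1" and "line 2", std::wcout is the only stream used, which is why the output is then correct.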
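A possible way to restructure the program along those lines is to do the conversion in a helper and send all output through std::wcout. This is a sketch, not a definitive fix: the helper name widen is hypothetical, and it relies only on std::mbstowcs as already used in the question.

```cpp
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical helper: converts a multibyte string to a wide string using
// the current LC_CTYPE locale. Throws if the input is not valid in that locale.
std::wstring widen(const std::string &src)
{
    // First pass: count the wide characters, excluding the terminator.
    const std::size_t len = std::mbstowcs(NULL, src.c_str(), 0);
    if (len == static_cast<std::size_t>(-1))
        throw std::runtime_error("invalid multibyte sequence");

    std::wstring result(len, L'\0');
    if (len > 0) {
        // Second pass: convert exactly len characters. No terminator is
        // written here; std::wstring manages its own.
        std::mbstowcs(&result[0], src.c_str(), len);
    }
    return result;
}
```

In main, call std::setlocale(LC_CTYPE, "") first, then print both the locale name and the converted string with std::wcout only, for example std::wcout << std::setlocale(LC_CTYPE, NULL) << std::endl; followed by std::wcout << widen(src) << std::endl;. std::wcout accepts a narrow const char * as well, so std::cout is not needed at all.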
