Format overflow warning when trying to store a wide string

Time:09-20

I'm currently learning C and lately, I have been focusing on the topic of character encoding. Note that I'm a Windows programmer. While I currently test my code only on Windows, I want to eventually port it to Linux and macOS, so I'm trying to learn the best practices right now.

In the example below, I store a file path in a wchar_t array so it can be opened later with _wfopen. I need to use _wfopen because my file path may contain characters that are not in my default codepage. Afterwards, the file path and a text literal are stored in a char array named message for further use. My understanding is that you can write a wide string into a multibyte string with the %ls conversion specifier.

char message[8094] = "";
wchar_t file_path[4096] = L"C:\\test\\test.html";
sprintf(message, "Accessing: %ls\n", file_path);

While the code works, GCC/MinGW outputs the following warning and notes:

warning: '%ls' directive writing up to 49146 bytes into a region of size 8083 [-Wformat-overflow=]|
note: assuming directive output of 16382 bytes|
note: 'sprintf' output between 13 and 49159 bytes into a destination of size 8094|

My issue is that I simply do not understand how sprintf could output up to 49159 bytes into the message variable. I output the Accessing: string literal, the file_path variable, the \n char and the \0 char. What else is there to output?

Sure, I could declare message as a wchar_t array and use wsprintf instead of sprintf, but my understanding is that wchar_t does not make for nice portable code. As such, I'm trying to avoid using it unless it's required by a specific API.

So, what am I missing?

CodePudding user response:

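The warning doesn't take the actual contents of file_path into account; it is calculated as if file_path could hold any possible content. The upper bound of 49159 bytes in the note is the 13 bytes of fixed output (11 for "Accessing: ", one for \n and one for \0) plus the up to 49146 bytes that GCC allows for the converted path. There would be an overflow if file_path consisted of 4095 emoji and a null terminator.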

Using %ls in the narrow printf family converts the source to multibyte characters, and each wide character can take several bytes in that form.
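To make the expansion concrete, here is a minimal sketch of how the required buffer size could be estimated. The helper names max_multibyte_size and exact_multibyte_size are made up for this illustration, and the results depend on the current locale (set with setlocale); how well wcstombs behaves on a particular MinGW runtime is an assumption worth testing.

```c
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: conservative upper bound, in bytes, for the
   multibyte form of s in the current locale, including the final '\0'.
   MB_CUR_MAX is the largest number of bytes one character can need. */
static size_t max_multibyte_size(const wchar_t *s)
{
    return wcslen(s) * (size_t)MB_CUR_MAX + 1;
}

/* Hypothetical helper: exact number of bytes needed, including the
   final '\0', or 0 if s contains a character that cannot be converted
   in the current locale. */
static size_t exact_multibyte_size(const wchar_t *s)
{
    /* With a NULL destination, wcstombs only counts the bytes. */
    size_t n = wcstombs(NULL, s, 0);
    return (n == (size_t)-1) ? 0 : n + 1;
}
```

The compiler cannot run either calculation on the real data, so it has to assume the worst case for every element of the array.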

To avoid this warning you could:

  • disable it with -Wno-format-overflow
  • use snprintf instead of sprintf

The latter is always a good idea IMHO: it gives you a second line of defence against mistakes introduced during later code maintenance (e.g. someone comes along and changes the code to grab a path from user input instead of a hardcoded value).
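A minimal sketch of the snprintf approach, assuming a C99-conforming snprintf (one that always null-terminates and returns the length the full output would have had). The helper name format_access_message is made up for this illustration.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: writes "Accessing: <path>\n" into dst.
   Returns 0 on success, or -1 if the conversion failed or the
   message did not fit into dst_size bytes. */
static int format_access_message(char *dst, size_t dst_size,
                                 const wchar_t *path)
{
    /* snprintf never writes more than dst_size bytes. */
    int written = snprintf(dst, dst_size, "Accessing: %ls\n", path);

    /* A negative value signals an encoding error; a value of at
       least dst_size means the output was truncated. */
    if (written < 0 || (size_t)written >= dst_size) {
        return -1;
    }
    return 0;
}
```

With the variables from the question, the call would be format_access_message(message, sizeof message, file_path). Checking the return value matters: a truncated path would otherwise go unnoticed.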


Afterword: be very careful when using wide characters with the printf family in MinGW, which implements the printf family by calling MSVCRT, and MSVCRT does not follow the C Standard.

To get closer to standard behaviour, use a build of MinGW-w64 that attempts to implement the stdio library functions itself instead of deferring to MSVCRT (e.g. the MSYS2 build).
