How to use ReadFile with wchar_t?

Time:09-17

Consider the following two functions. read_file_2 uses the Windows API functions CreateFileW and ReadFile, whereas read_file_1 uses fopen and fgetws. Both read non-English text from a file called data.txt. read_file_2 outputs garbage text, whereas read_file_1 outputs the text from data.txt without any problems.

Notice that the fopen call passes ccs=UTF-8, which tells it what character encoding to use, whereas read_file_2 has nothing similar.

DWORD read_file_2()
{
    wchar_t wstr[512];
    BOOL success = FALSE;
    DWORD dwRead, total =0;
    HANDLE handle = CreateFileW(L"data.txt",
                                GENERIC_READ,
                                0,
                                NULL,
                                3,
                                FILE_ATTRIBUTE_NORMAL,
                                NULL);
    if (handle == INVALID_HANDLE_VALUE)
        return -1;
    do
    {   
        success = ReadFile(handle, wstr, 20, &dwRead, NULL);
        total  = dwRead;
    } while(!success || dwRead == 0);

    wstr[total] = L'\0';
    wprintf(L"%ls\n",wstr);
    return 0;
}

void read_file_1()
{
    wchar_t converted[20];
    FILE * ptr;
    ptr = fopen("data.txt", "rt ,ccs=UTF-8");
    fgetws(converted, 20, ptr);
    wprintf(L"%ls\n", converted);
    fclose(ptr);
}

int main()
{
    _setmode(fileno(stdin), _O_U8TEXT);
    _setmode(fileno(stdout), _O_U8TEXT);
    read_file_1();
    read_file_2();
}

How does one use ReadFile to read wchar_t text from a file and print it to the terminal without it turning into garbage?

Output (first line from read_file_1, second line from read_file_2):

 Шифрование.txt  ال
퀠킨톸톄킀킾킲킰킽킸♥

Actual content of data.txt:

 Шифрование.txt  العربية.txt

CodePudding user response:

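You can use MultiByteToWideChar. Read the file into a char buffer with ReadFile (bytes and total_bytes below), then convert the UTF-8 bytes to wchar_t.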

int total_wchars = MultiByteToWideChar(
   CP_UTF8,       // CodePage
   0,             // dwFlags
   bytes,         // lpMultiByteStr
   total_bytes,   // cbMultiByte
   NULL,          // lpWideCharStr
   0              // cchWideChar     0 = return the required size in wchar_t.
                  // No NUL is counted because total_bytes is an explicit length.
);

if ( total_wchars == 0 ) {
   // Error. Use GetLastError() and such.
   ...
}

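// One extra element for the terminating NUL, which the conversion does not add here.
LPWSTR wchars = malloc( ( total_wchars + 1 ) * sizeof( *wchars ) );
if ( wchars == NULL ) {
   // Out of memory.
   ...
}

MultiByteToWideChar(
   CP_UTF8,       // CodePage
   0,             // dwFlags
   bytes,         // lpMultiByteStr
   total_bytes,   // cbMultiByte
   wchars,        // lpWideCharStr
   total_wchars   // cchWideChar
);

wchars[ total_wchars ] = L'\0';   // Terminate so it can be printed with %ls.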

Note that if the compiler has wchar_t,

  • WCHAR is wchar_t
  • LPWSTR is wchar_t *
  • LPCWSTR is const wchar_t *

CodePudding user response:

The problem is that ReadFile doesn't read strings or even characters. It reads bytes.
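To see why the bytes come out looking like Korean syllables, here is a small sketch. It assumes a little-endian machine and a 16-bit wchar_t, as on Windows. The helper names pair_as_utf16_unit and show_misread are made up for illustration; they pack consecutive bytes the way a wchar_t buffer filled directly by ReadFile would see them.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: combine two consecutive bytes into one 16-bit unit,
   low byte first, which is how a little-endian machine interprets them when
   raw file bytes are stored in a wchar_t (UTF-16) buffer. */
uint16_t pair_as_utf16_unit(const unsigned char *p)
{
    assert(p != NULL);
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Print each byte pair of a buffer next to the UTF-16 unit it is misread as. */
void show_misread(const unsigned char *bytes, size_t n)
{
    for (size_t i = 0; i + 1 < n; i += 2)
        printf("bytes %02X %02X -> U+%04X\n",
               bytes[i], bytes[i + 1], pair_as_utf16_unit(bytes + i));
}
```

data.txt starts with a space (byte 20) followed by Ш, which is D0 A8 in UTF-8. The first pair 20 D0 is therefore read as U+D020 and the next pair A8 D0 as U+D0A8, both of which fall in the Hangul syllables block, consistent with the garbage line in the question.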

Since it doesn't read strings, it also doesn't null-terminate the data like a string.

You need to make sure that it reads enough bytes, and to null-terminate the array if it's a string.


By using a loop you have a good start, but your loop overwrites what was read in the previous iteration, making you lose the data.

You need to pass a pointer to the end of the data already in the buffer on each iteration, and add up the byte counts instead of replacing the total.

And as I already mentioned in a comment, make sure that the loop works properly (and does not go into an infinite loop if there's an error, for example).
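A minimal sketch of such a loop follows. So that it runs anywhere, ReadFile is replaced by a hypothetical stand-in, read_chunk, which reads from memory but has the same shape: a success flag as the return value and a byte count written through a pointer. The names mem_src, read_chunk and read_all are assumptions for illustration, not part of the Windows API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory source used instead of a file handle. */
typedef struct { const char *data; size_t len; size_t pos; } mem_src;

/* Hypothetical stand-in for ReadFile: copies up to `want` bytes, reports the
   count through `got`, and returns nonzero on success. Success with zero
   bytes means end of input, which is also how ReadFile reports end of file
   on a synchronous handle. */
int read_chunk(mem_src *src, void *dst, size_t want, size_t *got)
{
    size_t left = src->len - src->pos;
    size_t n = want < left ? want : left;
    memcpy(dst, src->data + src->pos, n);
    src->pos += n;
    *got = n;
    return 1;
}

/* Read until end of input, an error, or a full buffer. Each call writes at
   buf + total, so earlier chunks are kept. One byte is reserved for the
   terminator. Returns the number of bytes stored. */
size_t read_all(mem_src *src, char *buf, size_t cap)
{
    size_t total = 0, got = 0;
    assert(cap > 0);
    while (total < cap - 1) {
        size_t want = cap - 1 - total;
        if (want > 20)
            want = 20;                 /* small chunks, as in the question */
        if (!read_chunk(src, buf + total, want, &got))
            break;                     /* error: stop instead of retrying forever */
        if (got == 0)
            break;                     /* end of input */
        total += got;                  /* accumulate, do not overwrite */
    }
    buf[total] = '\0';
    return total;
}
```

With the real API, the call inside the loop would be ReadFile(handle, buf + total, (DWORD)want, &dwRead, NULL) on a char buffer, and the resulting bytes would then be converted with MultiByteToWideChar rather than printed as wchar_t directly.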
