Windows API ReadFile() skips one out of every two characters

Time:10-22

My aim is to read all the text located in a file. For some reason whenever I read from the file and print the result (drawText), the buffer seems to be skipping one character every two positions. HELLO will become HLO and SCAVENGER becomes SAEGR.

This is for Windows API. I wonder if CreateFile() and ReadFile() are just fine and whether it's something else causing the issue.

void init(HDC hdc)
{
    HANDLE hFile;
    LPCSTR fileName = "c:\\Users\\kanaa\\Desktop\\code\\HW2_StarterCode\\words.txt";
    hFile = CreateFileA(fileName, GENERIC_READ, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    DWORD dwFileSize = GetFileSize(hFile, NULL);
    DWORD dwBytesRead;
    WCHAR* buffer = new WCHAR[dwFileSize / 2 + 1];
    buffer[dwFileSize / 2] = 0;
    bool read = ReadFile(hFile, buffer, dwFileSize, &dwBytesRead, NULL);
    std::wstring wstr(buffer);
    std::string str(wstr.begin(), wstr.end());
    delete[] buffer;
    CloseHandle(hFile);
    if (read) parse(str, hdc);
}

void parse(std::string word, HDC hdc)
{
    std::string to = word;
    std::wstring wword = std::wstring(to.begin(), to.end());
    const WCHAR* wcword = wword.c_str();
    Graphics graphics(hdc);
    drawText(&graphics, wcword);
}

CodePudding user response:

The problem was the WCHAR buffer. The corrections are below:

    CHAR* buffer = new CHAR[dwFileSize/sizeof(char) + 1];
    bool read = ReadFile(hFile, buffer, dwFileSize, &dwBytesRead, NULL);
    buffer[dwBytesRead] = 0;

CodePudding user response:

You are processing the file data using a wchar_t[] buffer. wchar_t is 2 bytes in size on Windows. So, in the statement:

std::string str(wstr.begin(), wstr.end());

You are iterating through the file data 2 bytes at a time, interpreting each byte pair as a single wchar_t that gets truncated to a 1-byte char, discarding the other byte. That is why your str ends up skipping every other character.
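To make the effect concrete, here is a small portable sketch that mimics what happens to the bytes. It is not the original code: narrowBytePairs is a hypothetical helper, and it assumes the little-endian byte order Windows uses, pairing the bytes by hand instead of going through a real WCHAR buffer.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: pairs raw file bytes into 16-bit units the way a
// little-endian WCHAR buffer would hold them, then narrows each unit to a char.
std::string narrowBytePairs(const std::string& fileBytes)
{
    std::string result;
    for (std::size_t i = 0; i < fileBytes.size(); i += 2) {
        const std::uint16_t low = static_cast<unsigned char>(fileBytes[i]);
        // An odd-length file leaves the last unit with a zero high byte.
        const std::uint16_t high = (i + 1 < fileBytes.size())
            ? static_cast<unsigned char>(fileBytes[i + 1])
            : 0;
        const std::uint16_t unit = static_cast<std::uint16_t>(low | (high << 8));
        // Narrowing keeps only the low byte, so the character stored in the
        // high byte (every second character of the file) is dropped.
        result.push_back(static_cast<char>(unit & 0xFF));
    }
    return result;
}
```

Calling narrowBytePairs("HELLO") yields "HLO", and narrowBytePairs("SCAVENGER") yields "SAEGR", matching the output described in the question.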

Process the file data using a char[] buffer instead. However, there are easier ways to read 7/8-bit file data into a std::string.
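One of those easier ways, as a minimal sketch, is to let the standard library do the reading. This swaps CreateFileA()/ReadFile() for std::ifstream, which is an assumption about what fits your project; readFileBytes is a hypothetical name.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: reads an entire file into a std::string as raw bytes.
// Returns an empty string if the file cannot be opened.
std::string readFileBytes(const std::string& path)
{
    // Binary mode prevents newline translation, so the bytes arrive unchanged.
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        return std::string();
    }
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

The std::string manages its own storage and length, so there is no buffer to size, terminate, or delete[].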

Lastly, in this statement:

std::wstring wword = std::wstring(to.begin(), to.end());

This is not the correct way to convert a std::string to a std::wstring. All it does is iterate through the chars, widening each one as-is into a 2-byte wchar_t. Windows APIs expect wchar_t strings to be encoded in UTF-16, and this code performs no such conversion. Use MultiByteToWideChar(), std::wstring_convert (deprecated since C++17), or an equivalent Unicode library call to perform the conversion. In either case, you first need to know the encoding of the source file in order to convert it to Unicode correctly.
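The difference between the two approaches can be sketched as follows, assuming the file is UTF-8 (the question does not say what encoding it uses). The example uses std::wstring_convert with char16_t so that it behaves the same on any platform; the facility is deprecated since C++17, and on Windows MultiByteToWideChar() with CP_UTF8 is the native alternative. Both function names are hypothetical.

```cpp
#include <codecvt>
#include <locale>
#include <string>

// Naive widening, as in the question: each byte becomes one 16-bit unit,
// so multi-byte UTF-8 sequences are left undecoded.
std::u16string widenBytewise(const std::string& bytes)
{
    std::u16string out;
    for (unsigned char c : bytes) {
        out.push_back(static_cast<char16_t>(c));
    }
    return out;
}

// Decoding conversion, assuming the input is valid UTF-8.
// from_bytes() throws std::range_error if the input is malformed.
std::u16string utf8ToUtf16(const std::string& utf8)
{
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> converter;
    return converter.from_bytes(utf8);
}
```

For plain ASCII text such as HELLO both functions give the same result, which is why the naive version appears to work. They diverge as soon as the file contains a non-ASCII character: the two UTF-8 bytes 0xC3 0xA9 (é) become two separate units under widenBytewise but a single unit, 0x00E9, under utf8ToUtf16.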
