I want to print each character of a string on its own line, but non-ASCII characters show up as '?' in the terminal. How can I avoid this? I searched online but did not find anything that worked. Thanks in advance!
char buff[255] = "хай, man";
int slen1 = strlen(buff);
printf("%s\n", buff);
for (int i = 0; i < slen1; i++) {
    printf("%c\n", buff[i]);
}
The same thing happens whenever I use %c with any non-ASCII character.
CodePudding user response:
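Since we know the string is UTF-8 encoded, we can process it with custom code. Each non-ASCII character is stored as a multi-byte sequence, so printing one byte at a time with %c sends incomplete sequences to the terminal, which shows them as '?'.
Note that this approach is error-prone, and the code below lacks error checking.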
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    char buff[255] = u8"хай, man";
    printf("buff[] <%s>\n", buff);
    size_t len = strlen(buff);
    printf("string length %zu\n", len);

    // Dump every byte in hex, including the null terminator (hence <=).
    for (size_t i = 0; i <= len; i++) {
        printf("%zu %02X %c\n", i, 0xFFu & (unsigned) buff[i],
               isprint((unsigned char) buff[i]) ? buff[i] : '?');
    }
    puts("");

    for (size_t i = 0; i < len; i++) {
        printf("%zu ", i);
        char ch = buff[i];
        // If ASCII character ....
        if ((ch & 0x80) == 0) {
            printf("%c\n", ch);
        } else {
            // Process UTF-8: copy the lead byte plus up to 3 continuation
            // bytes (those of the form 10xxxxxx) into a small string.
            char b[5] = { ch };
            size_t j;
            for (j = 1; (j < 4) && ((buff[i + j] & 0xC0) == 0x80); j++) {
                b[j] = buff[i + j];
            }
            b[j] = 0;
            printf("%s\n", b); // Print 1 UTF-8 character.
            i += j - 1;        // Skip the continuation bytes just printed.
        }
    }
    return 0;
}
Output
buff[] <хай, man>
string length 11
0 D1 ?
1 85 ?
2 D0 ?
3 B0 ?
4 D0 ?
5 B9 ?
6 2C ,
7 20
8 6D m
9 61 a
10 6E n
11 00 ?
0 х
2 а
4 й
6 ,
7
8 m
9 a
10 n
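Since the code above skips error checking, here is a minimal sketch of how the sequence length could be taken from the lead byte and validated before printing. The helper names utf8_seq_len and print_utf8_chars are hypothetical, not standard library functions, and the check is deliberately partial: it does not reject overlong encodings or surrogate code points.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: byte length of the UTF-8 sequence starting at s,
   or 0 if s does not start a well-formed sequence. Only the lead byte and
   the 10xxxxxx pattern of the following bytes are checked. */
size_t utf8_seq_len(const char *s) {
    assert(s != NULL);
    unsigned char lead = (unsigned char) s[0];
    size_t len;
    if (lead < 0x80) len = 1;                 /* 0xxxxxxx: ASCII */
    else if ((lead & 0xE0) == 0xC0) len = 2;  /* 110xxxxx */
    else if ((lead & 0xF0) == 0xE0) len = 3;  /* 1110xxxx */
    else if ((lead & 0xF8) == 0xF0) len = 4;  /* 11110xxx */
    else return 0;                            /* stray continuation or invalid byte */
    for (size_t j = 1; j < len; j++) {
        /* A '\0' fails this test, so we never read past the terminator. */
        if (((unsigned char) s[j] & 0xC0) != 0x80) return 0;
    }
    return len;
}

/* Hypothetical helper: writes each character of s on its own line to out,
   substituting '?' for a byte that does not start a valid sequence.
   Returns the number of lines written. */
size_t print_utf8_chars(const char *s, FILE *out) {
    size_t count = 0;
    while (*s != '\0') {
        size_t len = utf8_seq_len(s);
        if (len == 0) {
            fputs("?\n", out);  /* skip one bad byte and resynchronize */
            s += 1;
        } else {
            fprintf(out, "%.*s\n", (int) len, s);  /* print exactly len bytes */
            s += len;
        }
        count++;
    }
    return count;
}
```

Calling print_utf8_chars(buff, stdout) on the string above should print х, а, й and then each ASCII character on its own line, provided the terminal itself expects UTF-8.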