How do I print a single UTF-8 character together with another symbol?

Time:04-21

I want to print each character of a string on its own line, but for non-ASCII characters the terminal shows '?'. How can I avoid this? I searched online but did not find anything that worked. Thanks in advance!

char buff[255] = "хай, man";
int slen1 = strlen(buff);
printf("%s\n", buff);
for (int i = 0; i < slen1; i++) {
    printf("%c\n", buff[i]);
}

This happens whenever I use %c together with any other symbol, such as the newline here.

CodePudding user response:

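Since we know the string is UTF-8 encoded, we can process it with custom code. %c writes a single byte, and each non-ASCII character here occupies more than one byte, so all the bytes of one character have to be printed together.
Note that this approach is error-prone, and the code below lacks error checking.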

#include <ctype.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

int main(void) {
  char buff[255] = u8"хай, man";
  printf("buff[] <%s>\n", buff);
  size_t len = strlen(buff);
  printf("string length %zu\n", len);
  // Dump every byte, including the terminating null: index, hex value,
  // and the character itself if it is printable ASCII.
  for (size_t i = 0; i <= len; i++) {
    printf("%zu %02X %c\n", i, 0xFFu & (unsigned) buff[i],
        isprint((unsigned char) buff[i]) ? buff[i] : '?');
  }
  puts("");

  for (size_t i = 0; i < len; i++) {
    printf("%zu ", i);
    char ch = buff[i];
    // If ASCII character ....
    if ((ch & 0x80) == 0) {
      printf("%c\n", ch);
    } else {
      // Process UTF-8: collect the lead byte plus up to 3 continuation
      // bytes (those of the form 10xxxxxx) into a small string.
      char b[5] = { ch };
      size_t j;
      for (j = 1; (j < 4) && ((buff[i + j] & 0xC0) == 0x80); j++) {
        b[j] = buff[i + j];
      }
      b[j] = 0;
      printf("%s\n", b);  // Print 1 UTF-8 character.
      i += j - 1;         // Skip the continuation bytes already printed.
    }
  }
  return 0;
}

Output

buff[] <хай, man>
string length 11
0 D1 ?
1 85 ?
2 D0 ?
3 B0 ?
4 D0 ?
5 B9 ?
6 2C ,
7 20  
8 6D m
9 61 a
10 6E n
11 00 ?

0 х
2 а
4 й
6 ,
7  
8 m
9 a
10 n
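
The loop above trusts the input. As a rough sketch of the error checking it leaves out, the length of a sequence can be derived from its lead byte and each continuation byte verified before copying. The names utf8_seq_len and utf8_next are hypothetical helpers, not part of any library, and this sketch still does not reject overlong encodings or surrogate code points.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of bytes in the UTF-8 sequence that starts
   with lead byte c, or 0 if c cannot start a sequence. */
size_t utf8_seq_len(unsigned char c) {
  if (c < 0x80) return 1;            /* 0xxxxxxx: ASCII */
  if ((c & 0xE0) == 0xC0) return 2;  /* 110xxxxx */
  if ((c & 0xF0) == 0xE0) return 3;  /* 1110xxxx */
  if ((c & 0xF8) == 0xF0) return 4;  /* 11110xxx */
  return 0;                          /* continuation byte or invalid lead */
}

/* Hypothetical helper: copy the character starting at s into out as a
   null-terminated string and return the number of bytes consumed.
   A malformed or truncated sequence yields "?" and consumes one byte.
   The caller is expected to stop at the terminating null of s. */
size_t utf8_next(const char *s, char out[5]) {
  size_t n = utf8_seq_len((unsigned char) s[0]);
  for (size_t j = 1; j < n; j++) {
    /* Every byte after the lead must look like 10xxxxxx. A null byte
       fails this test, so the loop never reads past the end of s. */
    if (((unsigned char) s[j] & 0xC0) != 0x80) {
      n = 0;
      break;
    }
  }
  if (n == 0) {
    out[0] = '?';
    out[1] = '\0';
    return 1;
  }
  memcpy(out, s, n);
  out[n] = '\0';
  return n;
}
```

With a char out[5] buffer, a loop such as for (const char *p = buff; *p; p += utf8_next(p, out)) puts(out); should print each character of buff on its own line, with a '?' standing in for any byte that does not form a well-formed sequence.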