How to write a Unicode symbol manually into an array in C

Time:10-30

Why doesn't this code print А?

int main() {
    char str[] = {0x0, 0x4, 0x1, 0x0};
    write(1, str, 4);
}

Instead of A, it just prints nothing and exits. This is strange because the hexadecimal value of A is U+0410.

CodePudding user response:

As this answer explains (https://stackoverflow.com/a/6240184/14926026), the bytes to write are the character's UTF-8 encoding, not the digits of its code point. For the Cyrillic А that is not {0x0, 0x4, 0x1, 0x0}, but { 0xd0, 0x90 }.

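#include <unistd.h>   /* write */

int main(void)
{
   /* UTF-8 encoding of U+0410 CYRILLIC CAPITAL LETTER A */
   const unsigned char str[] = { 0xd0, 0x90 };
   write(1, str, sizeof str);
   return 0;
}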

CodePudding user response:

Your post contains both

U+000041 LATIN CAPITAL LETTER A (A)

and

U+000410 CYRILLIC CAPITAL LETTER A (А)

Either way, you need to encode the character using the encoding expected by the terminal. Assuming a terminal expecting UTF-8,

$ perl -e'use utf8; $_ = "A";         utf8::encode($_); printf "%v02X", $_;'
41

$ perl -e'use utf8; $_ = "\N{U+41}";  utf8::encode($_); printf "%v02X", $_;'
41

$ perl -e'use utf8; $_ = chr(0x41);   utf8::encode($_); printf "%v02X", $_;'
41

$ perl -e'use utf8; $_ = "А";         utf8::encode($_); printf "%v02X", $_;'
D0.90

$ perl -e'use utf8; $_ = "\N{U+410}"; utf8::encode($_); printf "%v02X", $_;'
D0.90

$ perl -e'use utf8; $_ = chr(0x410);  utf8::encode($_); printf "%v02X", $_;'
D0.90

So you want

const char *str = "\x41";      // { 0x41, 0 }
printf("%s\n", str);           // write(1, str, 1);

or

const char *str = "\xD0\x90";  // { 0xD0, 0x90, 0 }
printf("%s\n", str);           // write(1, str, 2);

(No point in using write, but you could.)
