get char out of uint64_t in c

Time:12-12

As the title says, I'm not sure how to get a char (an actual value of type char, not just a byte) out of a wider value, for example a uint64_t.

I guess a type cast won't work?

Thanks a lot!

CodePudding user response:

The thing is, a char in C is only one byte, so in practice it can hold an ASCII character but not an arbitrary Unicode code point. If your character lies outside that range, it simply cannot be converted to a single char. If you do want to be able to store Unicode characters, you should use some other type, such as wchar_t (wide character, compiler-dependent size), char16_t (one UTF-16 code unit, which on its own still cannot represent emoji and other characters that need 4 bytes), or char32_t.

Either way, a simple cast should work, as long as the value fits in the target type: char for ASCII, or one of the wider types for Unicode.

Note: if you assign without an explicit cast, the compiler may warn you (for example with -Wconversion in GCC or Clang) that you can lose data in the process, since uint64_t can hold far more values than there are characters and is wider than any character type.
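
One way to make the cast safe is to check the range first. This is only a minimal sketch, assuming that nothing beyond 7-bit ASCII should be accepted; the helper name u64_to_ascii is made up for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store v in *out only if it is a 7-bit ASCII value.
   Returns true on success, false if v would not fit. */
bool u64_to_ascii(uint64_t v, char *out)
{
    assert(out != NULL);
    if (v > 0x7F) {
        return false;   /* would lose data, so refuse instead of truncating */
    }
    *out = (char)v;     /* explicit cast; the range check makes it lossless */
    return true;
}
```

With this, u64_to_ascii(0x41, &c) stores 'A' in c, while a value such as 0x03A3 is rejected and c is left untouched.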

CodePudding user response:

So if you just want the char value and your uint64_t is at most 255, you can try casting it to uint8_t first, like this:

printf("%c", (uint8_t)var);

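Otherwise, if you need every char in the uint64_t, you can try:

/* mesg must have room for 8 bytes; the most significant byte is stored
   first and no terminating '\0' is added */
void int64ToChar(char mesg[], uint64_t num) {
    for (int i = 0; i < 8; i++) mesg[i] = (char)(num >> ((8 - 1 - i) * 8));
}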
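
If the result is meant to be printed as a string, the buffer also needs a terminating '\0', and zero bytes have to be skipped so they do not cut the string short. Here is a sketch under those assumptions; the name u64_to_str and the 9-byte buffer convention are mine, not part of the function above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy the non-zero bytes of num into out, most
   significant byte first, and terminate the result with '\0'.
   out must have room for 9 bytes. Returns the number of characters written. */
size_t u64_to_str(uint64_t num, char out[9])
{
    assert(out != NULL);
    size_t n = 0;
    for (int i = 7; i >= 0; i--) {
        unsigned char byte = (unsigned char)((num >> (i * 8)) & 0xFF);
        if (byte != 0) {            /* skip zero bytes (unused positions) */
            out[n++] = (char)byte;
        }
    }
    out[n] = '\0';
    return n;
}
```

Calling u64_to_str(0x48656c6c6f, buf) leaves "Hello" in buf and returns 5.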

CodePudding user response:

It depends on how the character — or characters — got into the uint64_t value in the first place.

If you say

uint64_t uu = 0x41;

then uu contains the ASCII value of a single character, and it's trivial to pull it back out. You don't even need a cast:

char c = uu;
printf("%c\n", c);      /* prints "A" */

Of course, since it's 64 bits wide, a uint64_t can theoretically have up to eight 8-bit ASCII characters jammed into it:

uu = 0x48656c6c6f;      /* "Hello" in hex */

If so, you can extract individual characters using some bit manipulation:

c = (uu >> 24) & 0xff;
printf("%c\n", c);      /* prints "e" */
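
The same shift-and-mask step works for any position. A small sketch, where index 0 means the least significant byte; the helper name char_at is hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return byte number idx of uu as a char,
   counting from 0 at the least significant byte (idx must be 0..7). */
char char_at(uint64_t uu, unsigned idx)
{
    assert(idx < 8);
    return (char)((uu >> (idx * 8)) & 0xFF);
}
```

With uu = 0x48656c6c6f, char_at(uu, 3) is 'e' (the same as the expression above), char_at(uu, 0) is 'o', and char_at(uu, 4) is 'H'.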

Finally, since uint64_t is wider than 8 bits, it can also contain Unicode characters. For example, I could write:

uu = 0x03A3;            /* U+03A3 Greek Capital Letter Sigma */

But now there's no way to extract that as a plain char, or print it using %c. I'd have to use a wchar_t, and %lc:

wchar_t wc = uu;
printf("%lc\n", wc);    /* might print "Σ" */

Note that besides needing wchar_t and %lc, this last example works only if the output device is Unicode-capable and the locale is set up properly, typically by calling setlocale first.
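
For completeness, here is roughly what setting up the locale looks like. This is a sketch, assuming the environment provides a UTF-8 (or otherwise Unicode-capable) locale; print_wide is a made-up name, and whether "Σ" actually appears still depends on the terminal:

```c
#include <locale.h>
#include <stdint.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: print uu as one wide character followed by a newline.
   Returns printf's result: bytes written, or a negative value on failure. */
int print_wide(uint64_t uu)
{
    if (uu > (uint64_t)WCHAR_MAX) {
        return -1;                  /* does not fit in this platform's wchar_t */
    }
    /* Adopt the locale from the environment; without this call the program
       stays in the "C" locale, where non-ASCII output usually fails. */
    setlocale(LC_ALL, "");
    return printf("%lc\n", (wint_t)uu);
}
```

Calling print_wide(0x03A3) is then the checked equivalent of the two lines above. A negative return value means the character could not be converted in the current locale.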

Theoretically we could cram up to four UTF-16-encoded Unicode characters, or two UTF-32-encoded characters, into a single uint64_t value, but that's getting increasingly speculative or whimsical if not downright crazy, so I'm not going to bother trying to demonstrate it.
