Weird typecasting behaviour in unistd.h read()

Consider the following C program:

#include <stdio.h>
#include <unistd.h>

int main()
{
    char *buf[100] = {0};
    __int32_t buflen = 0x80000000;
    size_t len = read(0, buf, buflen);
    printf("%d", len);
}

When compiling this with gcc 12.2.0 I get the warning messages:

<snip>
foo.c:8:18: warning: ‘read’ specified size 18446744071562067968 exceeds maximum object size 9223372036854775807 [-Wstringop-overflow=]
    8 |     size_t len = read(0, buf, buflen);
<snip>

and read returns -1 i.e. an error at runtime.

What I don't understand about this: I construct the lowest (?) signed 32-bit integer. This signed int32 then gets passed to read(..., ..., size_t buflen), where size_t is an unsigned integer type. Thus read should "interpret" buflen as a zero-padded size_t, i.e. 0x00_00_00_00_80_00_00_00, which is exactly what happens when I manually cast buflen to size_t.

Why does it exceed the buffer size then, and where does the 18446744071562067968 (something close to the max value of size_t) come from?

A bit of context:

Yes I am quite aware that this will overflow the buffer.
This is supposed to be bad code that might get exploited.

I tried several values for buflen, sometimes with inconsistent behavior. I assume this is some undefined behavior. I expected read to interpret the passed parameter as 0x80000000.

Edit: buflen gets extended to 0xFFFFFFFF80000000. But why?

CodePudding user response：

ssize_t read(int fd, void *buf, size_t count);

Your size_t is 64 bits wide. (int32_t)0x80000000 == -2147483648. When this negative value is converted to the 64-bit type, it is sign-extended: the 64-bit representation of -2147483648 is 0xffffffff80000000.

This shows how important it is to use the correct types. Instead of int32_t, use the correct type, size_t.

Do not use the internal __intxx_t types. Use the standard intxx_t types defined in stdint.h.

Side notes:

char *buf[100] defines an array of 100 pointers to char. I do not think that is what you want.
You pass a buffer that is much smaller than the maximum read size. That invokes undefined behaviour if read writes past its end.

CodePudding user response：

The read function expects a size_t type for the last parameter (i.e. your buflen). However, you give it an object of type int32_t. So there will be a conversion from int32_t to size_t.

For such conversions the C standard says:

When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.

So the question is: Can your value of buflen be represented in an object of type size_t?

You assigned the value 0x80000000 so let's check what that is:

int32_t buflen = 0x80000000;
printf("%" PRId32 "\n", buflen);

Output:

-2147483648

Oh... a negative value. Since size_t can't represent negative values, the rule above doesn't apply in your case. We need another rule from the standard. It goes:

Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

With the note that:

The rules describe arithmetic on the mathematical value, not the value of a given type of expression.

As size_t is unsigned, this rule applies. So let's find the maximum for size_t:

printf("%zu\n", SIZE_MAX);

Output:

18446744073709551615

So according to the standard we need to do:

           -2147483648
+ 18446744073709551615
+                    1
  --------------------
  18446744071562067968
  ====================
The value 18446744071562067968 can be represented in a size_t object, so that will be the value passed to read.

BTW:

Here

printf("%d", len);

you print a size_t object using %d. That is wrong (undefined behavior). Use %zu for size_t.

That said... the return value from read is not a size_t but a ssize_t so the type is wrong from the start.