How does memccpy handle large integer values?

Time:05-19

According to man 3 memccpy, the memccpy function is defined as follows:

SYNOPSIS

   #include <string.h>

   void *memccpy(void *dest, const void *src, int c, size_t n);

DESCRIPTION

The memccpy() function copies no more than n bytes from memory area src to memory area dest, stopping when the character c is found.

If the memory areas overlap, the results are undefined.

What confuses me is that memccpy copies up to n bytes and stops if character c is found, yet the function takes int c as an argument. So what happens if I call memccpy with the following values:

memccpy(&x, &y, 0xffffff76, 100);

Here the value to check is too big for a char, so shouldn't this case fail?

CodePudding user response:

how exactly this case is handled in code

The value of the parameter is simply converted to an unsigned char:

void *memccpy(..., int param_c, ...) {
     unsigned char c = param_c;

In real life: https://github.com/lattera/glibc/blob/master/string/memccpy.c#L33 and https://github.com/lattera/glibc/blob/master/string/memchr.c#L63

On today's systems unsigned char has 8 bits, so (unsigned char)(int)0xffffff76 just becomes 0x76. The upper bits are simply discarded.
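To see this in practice, here is a minimal sketch, assuming a POSIX system with 8-bit bytes. The helper name copied_until is hypothetical (it is not part of any library); it only reports how many bytes memccpy copied before it stopped.

```c
#define _XOPEN_SOURCE 700   /* memccpy is a POSIX function; request its declaration */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies with memccpy and returns the number of bytes
   copied, including the matching byte, or n if no byte matched. */
size_t copied_until(void *dest, const void *src, int c, size_t n)
{
    void *end = memccpy(dest, src, c, n);

    /* memccpy returns a pointer just past the copied match, or NULL. */
    return end ? (size_t)((unsigned char *)end - (unsigned char *)dest) : n;
}
```

With a source buffer of { 0x01, 0x02, 0x76, 0x03 }, passing (int)0xffffff76u and passing 0x76 as c should both report 3 bytes copied, since only the value after conversion to unsigned char is compared. (The cast to int is implementation-defined for this value, but the subsequent conversion to unsigned char still leaves 0x76 on common two's-complement platforms.)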

CodePudding user response:

memccpy() is defined by POSIX.1-2001 (IEEE Std 1003.1-2001), which states:

SYNOPSIS

#include <string.h>

void *memccpy(void *restrict s1, const void *restrict s2,
       int c, size_t n);

DESCRIPTION

The memccpy() function shall copy bytes from memory area s2 into s1, stopping after the first occurrence of byte c (converted to an unsigned char) is copied, or after n bytes are copied, whichever comes first. If copying takes place between objects that overlap, the behavior is undefined.

So there you go, a simple unsigned char conversion takes place:

void *memccpy(void *restrict s1, const void *restrict s2, int c, size_t n) {
    unsigned char actual_c = (unsigned char)c;
    // ...
}

In fact, the most prominent C standard library implementations that I know of do exactly this:

  • GNU libc: passed to memchr which does unsigned char c = (unsigned int)c_in;
  • BSD libc: unsigned char uc = c;
  • Bionic (Android): borrowed from BSD: unsigned char uc = c;
  • musl libc: reassignment with cast c = (unsigned char)c;
  • uClibc: cast at comparison: (((unsigned char)(*r1 = *r2 )) != ((unsigned char) c))
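Putting the pattern from the list above together, a byte-by-byte version might look like the following. This is only an illustrative sketch under the POSIX description quoted earlier, not the code of any of the libraries listed; the name my_memccpy is hypothetical, and real implementations are usually more optimized.

```c
#include <stddef.h>

/* Hypothetical reference version of memccpy, written for clarity only. */
void *my_memccpy(void *restrict dest, const void *restrict src, int c, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;
    /* The conversion reduces c modulo UCHAR_MAX + 1, so upper bits are lost. */
    const unsigned char uc = (unsigned char)c;

    while (n-- > 0) {
        /* Copy one byte, then stop if it was the byte we are looking for. */
        if ((*d++ = *s++) == uc)
            return d;   /* points just past the copied match */
    }
    return NULL;        /* c was not found in the first n bytes */
}
```

Because the comparison only ever sees uc, calling this with 'c' or with a value such as (int)(0xffffff00u | 'c') should stop at the same byte.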

CodePudding user response:

This is an older function which is similar to memset in terms of the argument it accepts:

void *memset(void *s, int c, size_t n);

It is described in the C standard as follows:

The memset function copies the value of c (converted to an unsigned char) into each of the first n characters of the object pointed to by s.

Both functions date back to at least 4.3 BSD, so it would make sense that they handle their arguments in a similar way.

So given your example, the value 0xffffff76 would be converted to the unsigned char value 0x76, and that is the byte the function checks for in order to stop.
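The same conversion can be observed with memset. The snippet below is a small sketch assuming 8-bit bytes; fill_and_peek is a hypothetical helper name used only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: fills buf with c via memset and returns the first
   byte that was stored. Assumes n > 0. */
unsigned char fill_and_peek(unsigned char *buf, size_t n, int c)
{
    memset(buf, c, n);   /* c is converted to unsigned char before storing */
    return buf[0];
}
```

Calling it with (int)0xffffff76u should leave every byte of the buffer set to 0x76.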
