Home > OS >  Portable C Implementation of Relative Pointers
Portable C Implementation of Relative Pointers

Time:10-15

I define relative pointer to mean what Ginger Bill describes as Self-Relative Pointers:

... define the base [to which an offset will be applied] to be the memory address of the offset itself

For example, consider this struct:

struct house {
  int32_t weight;
}
struct person {
  int32_t age;
  struct house* residence;
}
int32_t getPersonsHousesWeight(struct person* p) {
  return p->residence->weight;
}

The relative-pointer implementation of the same thing in C that I think might work is:

struct house { ... } // same as before
struct person {
  int32_t age;
  int64_t residence; // an offset from the person's address in memory
}
int32_t getPersonsHousesWeight(struct person* p) {
  return ((struct residence*)((char*)p   (p->residence)))->weight;
}

Assuming that alignment of everything is good (all 8 bytes), is this free of undefined behavior?

EDIT

@tstanisl has provided an excellent answer (which I've accepted) that thoroughly explains UB in the context of stack allocations. I am curious how allocation into a large slab of contiguous heap would impact this analysis. For example:

int foo(void) {
  char* base = mmap(NULL,4096,PROT_WRITE | PROT_READ,-1,MAP_PRIVATE | MAP_ANONYMOUS);
  // Omitting mmap error checking
  struct person* myPerson = (struct person*)(base   128);
  struct house* myHouse = (struct house*)(base   256);
  int32_t delta = (char*)myHouse - (char*)myPerson;
  // Does the computation of delta invoke UB?
}

CodePudding user response:

Usually it is going to be UB. The first case is when person and house belong to separate object. In such a case it will be UB because the pointer arithmetics is performed outside of the object.

int foo(void) {
  struct person p;
  struct house h;
  p.residence = (char*)&h - (char*)&p; // already UB
  getPersonsHousesWeight(&p); // UB again
}

In practice it means that the compiler is not obligated to notice that objects accessed from a pointers constructed from &p can alias with object h because p and h are separete memory regions (aka objects).

When both objects are placed inside a larger object then the situation is a bit better. Though it still would be technical UB.

int foo(void) {
  struct ph {
    struct person p;
    struct house h;
  } ph;
  ph.p.residence = (char*)&ph.h - (char*)&ph.p; // still UB
  getPersonsHousesWeight(&ph.p); // UB again
}

It UB because pointer arithmetic is done outside the member object. (char*)&ph.h - 1 is a pointer outside of ph.h.

Note, that this code will likely work pretty much everywhere. Otherwise, heavily used container_of-like macros would not work breaking a lot of existing code including the Linux kernel.

To avoid UB the pointer must be constructed in a special way to avoid moving outside of the originating object. Rather using &ph.h one should use (char*)&ph offsetof(struct ph, h). Similarly &ph.p should be replaced with (char*)&ph offsetof(struct ph, p).

Now this code should be portable:

int foo(void) {
  struct ph {
    struct person p;
    struct house h;
  } ph;
  struct person *p_ptr = (struct person*)((char*)&ph   offsetof(struct ph, p));
  struct house  *h_ptr = (struct house*) ((char*)&ph   offsetof(struct ph, h));
  ph.p.residence = (char*)h_ptr - (char*)p_ptr;
  getPersonsHousesWeight(p_ptr);
}

Though it is very obscure. The interesting discussion on this topic can be found at link

  •  Tags:  
  • c
  • Related