Share portion/area of mmap'ed memory

Time:08-11

I have several processes which performed mmap() of a specific size (0x8000). I would like to share only a portion of this memory space between these processes as shown in the following diagram:

      0x0             0x2000-0x3000           0x8000
p1:   [MEM. PRIVATE]  [MEM. SHARING]  [MEM. PRIVATE]
p2:   [MEM. PRIVATE]  [MEM. SHARING]  [MEM. PRIVATE]

In this scenario, the memory allocated by mmap() must be shared only within the range 0x2000-0x3000. The other portions are private (MEM. PRIVATE).

Is there a system call to perform the sharing after the call to mmap()? I tried with shm_open() beforehand but the entire range is shared.

CodePudding user response:

Sounds like you want 3 mappings, with only the middle one being MAP_SHARED.

But instead of 3 separate mmap calls, make two. First a MAP_PRIVATE of the whole length, then a MAP_FIXED|MAP_SHARED of the shared region at offset 0x2000, replacing the middle of the first mapping. The first mapping could be MAP_ANONYMOUS, not backed by the file at all, since you're not sharing it.

If the private part should be zero-initialized, you don't need a disk file or part of an shm region for it. Being private, writes to it won't persist anywhere. If you map it from the file instead, it could start with non-zero contents that you wrote to the file some other way.

With MAP_FIXED, the first arg to mmap isn't just a hint, it's definitely where your mapping will be, even if that means replacing (part of) a previous mapping at that virtual address as if by munmap.

void *setup_mapping(int fd)
{
   const size_t total_len = 0x8000;
   const size_t shared_mem_offset = 0x2000, shared_len = 0x1000;
   const size_t shared_file_offset = 0;  // the file *only* holds the shared part
  // unless you have some non-zero data for the private mapping to start with

   // Private anonymous non-shared mapping, like malloc.
   // Letting the kernel pick an address
   char *base = mmap(NULL, total_len, PROT_READ | PROT_WRITE, 
                      MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
               // or without MAP_ANONYMOUS, using fd, 0 
   if (base == MAP_FAILED)
       return MAP_FAILED;

  // replace the middle of that mapping with our shared file mapping
   void *shared = mmap(base + shared_mem_offset, shared_len, PROT_READ|PROT_WRITE, 
                      MAP_SHARED|MAP_FIXED, fd, shared_file_offset);

   if (shared == MAP_FAILED){
       munmap(base, total_len);   // if munmap returns error, still just return
       return MAP_FAILED;
   }

   return base;
}

I used char *base so math on it would be well-defined. GNU C does define pointer math on void* as working like on char*.

BTW, a single munmap can tear down the whole region when you're done, regardless of it being 2 or 3 mappings. Or of course just exit the process.


Starting with an mmap of the full length of your total region means you don't need to look for a free range of virtual addresses to use as hints; the first mmap will find and claim one, so you can safely MAP_FIXED over part of it. This is safe even if other threads are allocating memory at the same time: they can't grab a conflicting address between the two mmap calls, as they could if you made 3 separate mappings with just hint addresses or with MAP_FIXED_NOREPLACE (in which case a later mapping would land somewhere else or fail, instead of ending up contiguous). Such a race is unlikely to be a problem in practice, but this approach rules it out.

Also, this would only make 2 system calls instead of 3 so it's more efficient. (The internal work of mapping more pages, and of replacing a mapping, should be minor, especially when you haven't touched that memory yet to actually fault it in.)

It's also less code to write, and fewer error return values to check, thus fewer corner cases for partial failure.

CodePudding user response:

You need three mappings:

  1. MAP_PRIVATE
  2. MAP_SHARED
  3. MAP_PRIVATE

The trick is to use a different value for the addr argument to mmap for each one. The first is NULL; for each remaining one, use the previous return value plus the previous segment's size (i.e. for the last two, pass an explicit map address).


Here is the code. It is annotated:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <string.h>
#include <errno.h>
#include <time.h>
#include <sys/mman.h>
#include <sys/wait.h>

#define P1OFF       0x0000
#define P1LEN       0x1000

#define SHROFF      0x2000
#define SHRLEN      0x1000

#define P2OFF       (SHROFF + SHRLEN)
#define P2LEN       (0x8000 - SHROFF)

const char *shrfile = "share.mem";

typedef unsigned char u8;
typedef unsigned int u32;

typedef struct {
    int seg_idx;                        // segment index
    u32 seg_off;                        // segment offset
    u32 seg_len;                        // segment length
    void *seg_base;                     // segment pointer
    int seg_shr;                        // 1=segment is shared
    int seg_fd;                         // segment fd (unused)
    char seg_file[100];                 // file for given segment (unused)
} seg_t;

seg_t seglist[] = {
    [0] = {
        .seg_idx = 0,
        .seg_off = P1OFF,
        .seg_len = P1LEN,
        .seg_file = "p1_%d.mem",
    },
    [1] = {
        .seg_idx = 1,
        .seg_off = SHROFF,
        .seg_len = SHRLEN,
        .seg_shr = 1,
        .seg_file = "share.mem",
    },
    [2] = {
        .seg_idx = 2,
        .seg_off = P2OFF,
        .seg_len = P2LEN,
        .seg_file = "p2_%d.mem",
    },
};

#define SEGALL \
    seg = &seglist[0]; \
    seg < &seglist[sizeof(seglist) / sizeof(seglist[0])]; \
    ++seg

#define prt(_fmt...) \
    do { \
        prtx(); \
        printf(_fmt); \
    } while (0)

#define sysfault(_fmt...) \
    do { \
        prt(_fmt); \
        exit(1); \
    } while (0)

int pidxid;                             // processes's unique sequential ID
int fdshr;                              // file descriptor for buffer

u8 *shrbase;                            // pointer to shared segment
u8 *allbase;                            // pointer to full mapped area
u32 totlen;                             // length of area

double tsczero;

double
tscgetf(void)
{
    struct timespec ts;
    double sec;

    clock_gettime(CLOCK_MONOTONIC,&ts);

    sec = ts.tv_nsec;
    sec /= 1e9;
    sec += ts.tv_sec;

    sec -= tsczero;

    return sec;
}

void
prtx(void)
{

    printf("[%.9f %d] ",tscgetf(),pidxid);
}

void
mkfile(const char *shrfile,u32 len)
{

    prt("mkfile: shrfile='%s' len=%4.4X\n",shrfile,len);

    int fd = open(shrfile,O_RDWR | O_TRUNC | O_CREAT,0644);
    if (fd < 0)
        sysfault("mkfile: open fault shrfile='%s' -- %s\n",
            shrfile,strerror(errno));

    int err = ftruncate(fd,len);
    if (err < 0)
        sysfault("mkfile: ftruncate fault shrfile='%s' -- %s\n",
            shrfile,strerror(errno));

    char buf[1];
    buf[0] = 0;

    for (u32 off = 0;  off < len;  ++off)
        write(fd,buf,1);

    close(fd);
}

void
mapall(void)
{
    u32 curoff;
    seg_t *seg;
    void *oldbase = NULL;

    fdshr = open(shrfile,O_RDWR);

    shrbase = NULL;
    allbase = NULL;

    for (SEGALL) {
        errno = 0;

        seg->seg_base = mmap(oldbase,seg->seg_len,PROT_READ | PROT_WRITE,
            seg->seg_shr ? MAP_SHARED : MAP_PRIVATE,fdshr,seg->seg_off);

        prt("dochild: MAP seg_idx=%d oldbase=%p seg_base=%p seg_len=%4.4X -- %s\n",
            seg->seg_idx,oldbase,seg->seg_base,seg->seg_len,strerror(errno));

        if (seg->seg_base == MAP_FAILED)
            sysfault("dochild: FAIL\n");

        oldbase = (u8 *) seg->seg_base + seg->seg_len;

        // remember the shared address
        if (seg->seg_shr)
            shrbase = seg->seg_base;

        // remember the start address
        if (allbase == NULL)
            allbase = seg->seg_base;
    }

    if (shrbase == NULL)
        sysfault("mapall: null shrbase\n");

    // signal main that our setup is done
    *shrbase = 1;
}

void
dochild(int idx)
{
    seg_t *seg;

    pidxid = idx;

    mapall();

    sleep(2);

    // fill the _entire_ buffer with our ID
    prt("dochild: FILL\n");
    memset(allbase + 1,pidxid + 1,totlen - 1);

    prt("dochild: UNMAP\n");
    for (SEGALL)
        munmap(seg->seg_base,seg->seg_len);

    close(fdshr);
}

int
main(void)
{
    pid_t pid;
    seg_t *seg;

    tsczero = tscgetf();

    setlinebuf(stdout);
    pidxid = 0;

    // get total size
    totlen = 0;
    for (SEGALL)
        totlen += seg->seg_len;

    // create the file
    mkfile(shrfile,totlen);

    // point to shared segment
    seg = &seglist[1];

    for (int idx = 0;  idx <= 3;  ++idx) {
        if (idx == 0) {
            mapall();
            continue;
        }

        *shrbase = 0;

        pid = fork();

        if (pid == 0) {
            dochild(idx);
            exit(0);                    // child is done; it must not keep looping and forking
        }

        // wait for child to complete setup
        while (*shrbase == 0)
            usleep(100);
    }

    while (1) {
        pid = wait(NULL);
        if (pid < 0)
            break;
        prt("main: reaped\n");
    }

    return 0;
}

Here is the program output:

[0.000008308 0] mkfile: shrfile='share.mem' len=8000
[0.027699569 0] dochild: MAP seg_idx=0 oldbase=(nil) seg_base=0x7f8d7d41b000 seg_len=1000 -- Success
[0.027718086 0] dochild: MAP seg_idx=1 oldbase=0x7f8d7d41c000 seg_base=0x7f8d7d41a000 seg_len=1000 -- Success
[0.027725139 0] dochild: MAP seg_idx=2 oldbase=0x7f8d7d41b000 seg_base=0x7f8d7d414000 seg_len=6000 -- Success
[0.027908063 1] dochild: MAP seg_idx=0 oldbase=(nil) seg_base=0x7f8d7d413000 seg_len=1000 -- Success
[0.027958351 1] dochild: MAP seg_idx=1 oldbase=0x7f8d7d414000 seg_base=0x7f8d7d412000 seg_len=1000 -- Success
[0.027965133 1] dochild: MAP seg_idx=2 oldbase=0x7f8d7d413000 seg_base=0x7f8d7d40c000 seg_len=6000 -- Success
[0.028166477 2] dochild: MAP seg_idx=0 oldbase=(nil) seg_base=0x7f8d7d413000 seg_len=1000 -- Success
[0.028225279 2] dochild: MAP seg_idx=1 oldbase=0x7f8d7d414000 seg_base=0x7f8d7d412000 seg_len=1000 -- Success
[0.028235211 2] dochild: MAP seg_idx=2 oldbase=0x7f8d7d413000 seg_base=0x7f8d7d40c000 seg_len=6000 -- Success
[0.028478286 3] dochild: MAP seg_idx=0 oldbase=(nil) seg_base=0x7f8d7d413000 seg_len=1000 -- Success
[0.028522598 3] dochild: MAP seg_idx=1 oldbase=0x7f8d7d414000 seg_base=0x7f8d7d412000 seg_len=1000 -- Success
[0.028529647 3] dochild: MAP seg_idx=2 oldbase=0x7f8d7d413000 seg_base=0x7f8d7d40c000 seg_len=6000 -- Success
[2.028197225 1] dochild: FILL
[2.028475924 2] dochild: FILL
[2.028713112 3] dochild: FILL
[2.355625517 0] main: reaped
[2.355667716 0] main: reaped
[2.356288144 0] main: reaped

Here is a dump of the file. Note that the "private" areas remain 0 even though the child processes write their own non-zero ID there.

00000000: 00000000 00000000 00000000 00000000  ................
*
00002000: 01000000 00000000 00000000 00000000  ................
00002010: 00000000 00000000 00000000 00000000  ................
*
00002fc0: 04040404 04040404 04040404 04040404  ................
00002fd0: 04040404 04040404 04040404 04040404  ................
00002fe0: 04040404 04040404 04040404 04040404  ................
00002ff0: 04040404 04040404 04040404 04040404  ................
00003000: 00000000 00000000 00000000 00000000  ................
*
00007ff0: 00000000 00000000 00000000 00000000  ................