Protect /dev/shm file

I'm working on an application that uses shared memory via shm_open(). It mmap()s a file within /dev/shm and is based on a producer/consumer approach.

Is there any mechanism for my shared memory to be protected and accessible only by this application? I know it is possible to use encryption, but does Linux (or the programming language) provide any service so that the file is accessible only by my application?

CodePudding user response：

If you use fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, 0);, then the shared memory object cannot be opened by any other process (without changing the access mode first). If it succeeds (fd != -1), and you immediately unlink the object via int rc = shm_unlink(name); successfully (rc == 0), only processes that can access the current process itself can access the object.
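As a minimal sketch of this first step (not a definitive implementation): the helper name create_private_shm and the object name used by the caller are hypothetical, and the ftruncate() call is my addition, since a freshly created object has length zero and must be sized before it can be mapped.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a new shared memory object that cannot be opened by name.
 * "name" is a hypothetical name such as "/myapp-shm"; "size" is the
 * desired length in bytes. Returns an open descriptor, or -1 on failure. */
static int create_private_shm(const char *name, size_t size)
{
    /* Mode 0 grants no access bits; O_EXCL fails if the name already exists. */
    int fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, 0);
    if (fd == -1)
        return -1;

    /* Remove the name right away; the object lives on through fd. */
    if (shm_unlink(name) == -1) {
        close(fd);
        return -1;
    }

    /* Assumption: the caller wants the object sized here. */
    if (ftruncate(fd, (off_t)size) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

On older glibc versions, shm_open() and shm_unlink() live in librt, so linking with -lrt may be needed.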

There is a small time window between the two operations when another process with sufficient privileges might have changed the mode and opened the object. To check, use fcntl(fd, F_SETLEASE, F_WRLCK) to obtain a write lease on the object. It will succeed only if this is the only process with access to the object.
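The lease check might look roughly like the following. The helper name is_sole_opener is hypothetical, and releasing the lease again with F_UNLCK is my own choice: the lease is used here only as a one-time probe, not kept for the lifetime of the object.

```c
#define _GNU_SOURCE   /* F_SETLEASE is Linux-specific */
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if this process holds the only open file description for fd,
 * 0 if someone else has the object open (Linux reports EAGAIN then), and
 * -1 on any other error, e.g. a filesystem without lease support. */
static int is_sole_opener(int fd)
{
    if (fcntl(fd, F_SETLEASE, F_WRLCK) == -1)
        return (errno == EAGAIN) ? 0 : -1;

    /* The lease was only a probe; drop it again. */
    if (fcntl(fd, F_SETLEASE, F_UNLCK) == -1)
        return -1;
    return 1;
}
```

Call it right after the shm_unlink() succeeds, and treat a result of 0 as a reason to close the descriptor and start over.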

Have the first instance of the application bind to a previously-agreed Unix domain stream socket, named or abstract, and listen for incoming connections on it. (For security reasons, it is important to use fcntl(sockfd, F_SETFD, FD_CLOEXEC) to avoid leaking the socket to a child process in case it exec()s a new binary.)

If the socket has already been bound, the bind will fail; so connect to that socket instead. When the first instance accepts a new connection, or the second instance connects to it, both must use int rc = getsockopt(connfd, SOL_SOCKET, SO_PEERCRED, &creds, &credslen); with struct ucred creds; socklen_t credslen = sizeof creds;, to obtain the credentials of the other side.
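The bind-or-connect decision can be sketched as follows, assuming a named socket. The helper name bind_or_connect, the is_first output parameter, and the backlog of 1 are all hypothetical choices for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Try to become the first instance by binding to "path" (a previously
 * agreed socket path); if the address is taken, connect instead.
 * On success returns a socket and sets *is_first to 1 (listening socket)
 * or 0 (connected socket). Returns -1 on failure. */
static int bind_or_connect(const char *path, int *is_first)
{
    struct sockaddr_un addr;
    int sockfd;

    if (strlen(path) >= sizeof addr.sun_path) {
        errno = ENAMETOOLONG;
        return -1;
    }
    memset(&addr, 0, sizeof addr);
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);

    sockfd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (sockfd == -1)
        return -1;

    /* Do not leak the socket into a program started with exec(). */
    if (fcntl(sockfd, F_SETFD, FD_CLOEXEC) == -1)
        goto fail;

    if (bind(sockfd, (struct sockaddr *)&addr, sizeof addr) == 0) {
        if (listen(sockfd, 1) == -1)
            goto fail;
        *is_first = 1;
        return sockfd;
    }
    if (errno != EADDRINUSE)
        goto fail;

    /* Someone is already bound there: act as the second instance. */
    if (connect(sockfd, (struct sockaddr *)&addr, sizeof addr) == -1)
        goto fail;
    *is_first = 0;
    return sockfd;

fail:
    close(sockfd);
    return -1;
}
```

With a named socket, a stale socket file left behind by a crashed first instance makes bind() fail with EADDRINUSE and connect() fail with ECONNREFUSED, so a real application needs a policy for that case. Abstract sockets do not have this problem.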

You can then check that the uid of the other side matches getuid() and geteuid(), and verify using e.g. stat() that the path "/proc/PID/exe" (where PID is the pid of the other side) refers to the same inode on the same filesystem as "/proc/self/exe". If they do, both sides are executing the same binary. (Note that you can also use POSIX realtime signals, via sigqueue(), passing one data token (an int, a void pointer, or a uintptr_t/intptr_t, which happen to match unsigned long/long on Linux) between them. This is useful, for example, if one wants to notify the other that it is about to exit, and the other one should bind to and listen for incoming connections on the Unix domain stream socket.)
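Put together, the credential and same-binary checks could look something like this. The helper name peer_is_same_binary and its three-way return value are hypothetical; the calls are the ones named above.

```c
#define _GNU_SOURCE   /* for struct ucred */
#include <stdio.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if the peer on connfd runs as our uid and executes the same
 * binary as this process, 0 if not, and -1 if the check itself failed. */
static int peer_is_same_binary(int connfd)
{
    struct ucred creds;
    socklen_t credslen = sizeof creds;
    struct stat self, peer;
    char path[64];

    if (getsockopt(connfd, SOL_SOCKET, SO_PEERCRED, &creds, &credslen) == -1)
        return -1;
    if (credslen != sizeof creds)
        return -1;

    /* The peer must run as the same user as we do. */
    if (creds.uid != getuid() || creds.uid != geteuid())
        return 0;

    /* stat() follows the /proc/PID/exe link to the executable itself. */
    snprintf(path, sizeof path, "/proc/%ld/exe", (long)creds.pid);
    if (stat(path, &peer) == -1 || stat("/proc/self/exe", &self) == -1)
        return -1;

    /* Same device and inode means the same file on the same filesystem. */
    return peer.st_dev == self.st_dev && peer.st_ino == self.st_ino;
}
```

Both sides would call this on their end of the connection and drop the connection unless the result is 1.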

Then, the initial process can pass a copy of its descriptor fd for the shared object to the second process, using an SCM_RIGHTS ancillary message, with, for example, the actual size of the shared object as the data payload (a size_t is a good choice for this). If you want to pass other information, use a structure.

The first (often, but not necessarily the only) message the second process receives will contain the ancillary data with a new file descriptor referring to the shared object. Note that because this is a Unix domain stream socket, message boundaries are not preserved, so if the full data payload did not arrive at once, you need a loop to read the rest of the data.
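A sketch of both directions of the descriptor transfer follows, with the object size as a size_t payload. The helper names send_shm_fd and recv_shm_fd are hypothetical, and the use of MSG_NOSIGNAL and MSG_CMSG_CLOEXEC is my own (Linux-specific) choice rather than something this answer requires.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send descriptor fd, with the object size as payload. 0 on success. */
static int send_shm_fd(int sock, int fd, size_t size)
{
    union {
        struct cmsghdr align;                /* forces correct alignment */
        char buf[CMSG_SPACE(sizeof(int))];
    } ctl;
    struct iovec iov = { .iov_base = &size, .iov_len = sizeof size };
    struct msghdr msg;
    struct cmsghdr *cmsg;
    ssize_t n;

    memset(&ctl, 0, sizeof ctl);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    do {
        n = sendmsg(sock, &msg, MSG_NOSIGNAL);
    } while (n == -1 && errno == EINTR);
    return (n == (ssize_t)sizeof size) ? 0 : -1;
}

/* Receive a descriptor and the size payload. Returns the new descriptor,
 * or -1 on failure or if no descriptor was attached. */
static int recv_shm_fd(int sock, size_t *size)
{
    union {
        struct cmsghdr align;
        char buf[CMSG_SPACE(sizeof(int))];
    } ctl;
    char *p = (char *)size;
    size_t have;
    int fd = -1;
    struct iovec iov = { .iov_base = p, .iov_len = sizeof *size };
    struct msghdr msg;
    struct cmsghdr *cmsg;
    ssize_t n;

    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    do {
        n = recvmsg(sock, &msg, MSG_CMSG_CLOEXEC);
    } while (n == -1 && errno == EINTR);
    if (n <= 0)
        return -1;
    have = (size_t)n;

    /* The descriptor arrives with the first byte of the payload. */
    for (cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg))
        if (cmsg->cmsg_level == SOL_SOCKET && cmsg->cmsg_type == SCM_RIGHTS &&
            cmsg->cmsg_len == CMSG_LEN(sizeof(int)))
            memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));

    /* Stream socket: keep reading until the whole payload has arrived. */
    while (have < sizeof *size) {
        n = read(sock, p + have, sizeof *size - have);
        if (n == -1 && errno == EINTR)
            continue;
        if (n <= 0) {
            if (fd != -1)
                close(fd);
            return -1;
        }
        have += (size_t)n;
    }
    return fd;
}
```

The received descriptor has a different number than on the sending side, but refers to the same open file description.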

Both sides can then close the Unix domain socket. The second side can then mmap() the shared object.

If there is never more than this exact pair of processes sharing data, then both sides can close the descriptor, making it impossible for anyone except superuser or the kernel to access the shared descriptor. The kernel will keep an internal reference as long as the mapping exists; it is equivalent to the process having the descriptor still open, except that the process itself cannot access or share the descriptor anymore, only the shared memory itself.
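Mapping and then closing the descriptor can be sketched as below. The helper name map_shared_and_close is hypothetical, and the fstat() size check is my addition: touching pages beyond the end of the object raises SIGBUS, so it seems prudent to verify the size the peer claimed before mapping.

```c
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Map "size" bytes of the shared object behind fd, then close fd.
 * Returns the mapping, or NULL on failure. fd is closed in both cases. */
static void *map_shared_and_close(int fd, size_t size)
{
    struct stat st;
    void *mem = MAP_FAILED;

    if (fstat(fd, &st) == -1) {
        perror("fstat");
    } else if (size == 0 || st.st_size < 0 || (size_t)st.st_size < size) {
        fprintf(stderr, "shared object is smaller than expected\n");
    } else {
        mem = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (mem == MAP_FAILED)
            perror("mmap");
    }

    /* The mapping keeps the object alive; the descriptor is not needed. */
    close(fd);
    return (mem == MAP_FAILED) ? NULL : mem;
}
```

After this call the process can still read and write the memory, but it no longer holds a descriptor that it could pass on to anyone else.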

Because the shared object has been unlinked already, no cleanup is necessary. The shared object will vanish as soon as the last process with an open descriptor or existing mmap closes it, unmaps it, or exits.

The Unix security model that Linux implements does not have strong boundaries between processes running as the same uid. In particular, they can examine each other's /proc/PID/ pseudodirectories, including their open file descriptors listed under /proc/PID/fd/.

Because of this, security-sensitive applications usually run as a dedicated user. The aforementioned scheme works well even when the second party is a process running as the human user, and the first party as the dedicated application uid. If you use a named Unix domain stream socket, you do need to ensure its access mode is suitable (you can use chmod(), chgrp(), et al. after binding to the socket, to change the named Unix domain stream socket access mode). Abstract Unix domain stream sockets do not have a filesystem-visible node, and any process can connect to such a bound socket.

When a privilege boundary is involved between the application (running as its own dedicated uid) and the agent (running as a user uid), it is important to make sure that both sides are who they claim to be across the entire exchange. The credentials are valid only at that point in time, and a known attack method is to have the valid agent execute a nefarious binary just after having connected to the socket, so that the other side still sees the original credentials, but all subsequent communication is under the control of a nefarious process. To avoid this, make sure the socket descriptor is not shared across an exec (using the CLOEXEC descriptor flag), and optionally check the peer credentials more than once, for example initially and finally.

Why is this "complicated"? Because proper security has to be baked in; it cannot be added on top afterwards, or invisibly taken care of for you: it must be a part of the approach. Changes in the approach must be reflected in the security implementation, or you have no security.

In real life, after you implement this (for the same-executable-binary one, and the privileged-service-or-application and user-agent one), you'll find that it isn't as complicated as it sounds: each step has its purpose, and can be tweaked if the approach changes. In particular, it isn't much C code at all. If one wants or needs "something easier", then one just has to pick something other than security-sensitive code.