I'm writing a coding competition grader, in which I want to use gcc to compile a contestant's code and link it with only a restricted subset of C standard library functions. For instance, I want contestants to be able to use functions from stdlib.h, string.h, and a handful of other standard library headers, but not to be able to, e.g., include sys/sysinfo.h, which could potentially allow them to do nefarious things.
Is there a flag I can pass, or a way to configure ld, to do this? My current idea is to have ld link selectively against a folder of static libraries containing only the libc implementations I want.
CodePudding user response:
The standard library functions themselves are not the potential source of problems; the syscalls those functions make are. (Indeed, what would stop a nefarious submitter from including an extended assembly function that calls those syscalls directly, bypassing your limitations? Nothing. If you use GCC or clang, you cannot even disable extended assembly support.)
What you can do, without much difficulty, is implement a seccomp filter that only allows the syscalls you consider safe. On x86-64, your filter also needs to deal with possible 32-bit syscalls (made, for example, from extended assembly). I'd personally allow only a basic set of 64-bit syscalls on x86-64 (so do check the architecture number first in the filter): perhaps just exit and exit_group for ending the process normally, read/readv/preadv2 for reading from an open file descriptor, and write/writev/pwritev2 for writing to an open file descriptor.
I would compile the submitted code into an object file and check using objdump -t that it does not contain an .init_array symbol (which indicates ELF constructor functions, i.e. functions marked with the GCC/clang __attribute__((__constructor__)) attribute so that they run prior to main()), but does contain a main symbol.
If that passes, then I would combine the submitted code in the object file with an object file that contains a suitable ELF constructor function setting up the secure computing environment. Using GCC or Clang, it would be something like the following:
#define _GNU_SOURCE
#include <stdlib.h>
#include <stddef.h>
#include <unistd.h>
#include <linux/seccomp.h>
#include <linux/filter.h>
#include <linux/audit.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <errno.h>

#define BPF_SYSCALL_NR (offsetof (struct seccomp_data, nr))
#define BPF_ARCH_ID (offsetof (struct seccomp_data, arch))

#if defined(__amd64__) || defined(__x86_64__)
#define ALLOW_ARCH_ID AUDIT_ARCH_X86_64
#elif defined(__i386__)
#define ALLOW_ARCH_ID AUDIT_ARCH_I386
#else
#error Unsupported architecture.
#endif

__attribute__((__constructor__))
static void setup_seccomp_filter(void)
{
    struct sock_filter filter[] = {
        /* Only allow syscalls using the specified architecture. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, BPF_ARCH_ID),
        BPF_JUMP(BPF_JMP | BPF_K | BPF_JEQ, ALLOW_ARCH_ID, 1, 0),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL_PROCESS),

        /* Only allow specific syscalls. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, BPF_SYSCALL_NR),

        /* Allow reading from an open file descriptor. */
        BPF_JUMP(BPF_JMP | BPF_K | BPF_JEQ, __NR_read, 18, 0),
        BPF_JUMP(BPF_JMP | BPF_K | BPF_JEQ, __NR_readv, 17, 0),

        /* Allow writing to an open file descriptor. */
        BPF_JUMP(BPF_JMP | BPF_K | BPF_JEQ, __NR_write, 16, 0),
        BPF_JUMP(BPF_JMP | BPF_K | BPF_JEQ, __NR_writev, 15, 0),

        /* Allow obtaining open file descriptor information. */
        BPF_JUMP(BPF_JMP | BPF_K | BPF_JEQ, __NR_fstat, 14, 0),

        /* Allow seeking. */
        BPF_JUMP(BPF_JMP | BPF_K | BPF_JEQ, __NR_lseek, 13, 0),

        /* Allow memory allocation.
           NOTE: mmap should really check PROT and FLAGS, and
           mremap should really check flags. */
        BPF_JUMP(BPF_JMP | BPF_K | BPF_JEQ, __NR_brk, 12, 0),
        BPF_JUMP(BPF_JMP | BPF_K | BPF_JEQ, __NR_mmap, 11, 0),
        BPF_JUMP(BPF_JMP | BPF_K | BPF_JEQ, __NR_mremap, 10, 0),
        BPF_JUMP(BPF_JMP | BPF_K | BPF_JEQ, __NR_munmap, 9, 0),
        BPF_JUMP(BPF_JMP | BPF_K | BPF_JEQ, __NR_madvise, 8, 0),

        /* Allow POSIX clock access and nanosleep. */
        BPF_JUMP(BPF_JMP | BPF_K | BPF_JEQ, __NR_clock_getres, 7, 0),
        BPF_JUMP(BPF_JMP | BPF_K | BPF_JEQ, __NR_clock_gettime, 6, 0),
        BPF_JUMP(BPF_JMP | BPF_K | BPF_JEQ, __NR_clock_nanosleep, 5, 0),
        BPF_JUMP(BPF_JMP | BPF_K | BPF_JEQ, __NR_gettimeofday, 4, 0),
        BPF_JUMP(BPF_JMP | BPF_K | BPF_JEQ, __NR_nanosleep, 3, 0),

        /* Allow syscall restart (used by the C library). */
        BPF_JUMP(BPF_JMP | BPF_K | BPF_JEQ, __NR_restart_syscall, 2, 0),

        /* Allow program to exit normally. */
        BPF_JUMP(BPF_JMP | BPF_K | BPF_JEQ, __NR_exit_group, 1, 0),

        /* Deny all other syscalls with ENOSYS. */
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ERRNO | (SECCOMP_RET_DATA & ENOSYS)),

        /* Allow syscall */
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW)
    };
    struct sock_fprog desc = {
        .len = sizeof filter / sizeof filter[0],
        .filter = filter,
    };

    /* If exec is ever allowed, never gain new privileges via exec. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
        exit(98);
    }

    /* Install the filter. */
    if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &desc, 0, 0)) {
        exit(97);
    }
}
It would be better to extend the filter with argument checks: for mmap(), verify that prot is PROT_READ | PROT_WRITE and that flags is MAP_PRIVATE | MAP_ANONYMOUS. Similarly, mremap() should only allow flags to be zero or MREMAP_MAYMOVE. These checks would stop the submitted program from trying to allocate executable memory, and from other similar tricks. In any case, even such tricks won't let the process use any syscalls other than those explicitly allowed by the seccomp filter.
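As a rough sketch of the mmap() check, here is one way it could look as a separate, second filter. The names ARG_LO, ARG_HI, mmap_filter and install_mmap_filter are made up for this example, and returning EPERM is my own choice. It assumes x86-64 (little-endian, so the low 32 bits of each 64-bit argument come first; on i386 the __NR_mmap syscall takes its arguments differently and this would not apply). It also relies on the fact that when several seccomp filters are installed, all of them run and the most restrictive result wins, so the architecture check stays in the main filter. It has to be installed before the main filter, since the main filter does not allow prctl.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <errno.h>
#include <sys/mman.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

/* Offsets of the low and high 32-bit halves of syscall argument n
   (hypothetical helper macros; little-endian layout assumed). */
#define ARG_LO(n) (offsetof(struct seccomp_data, args) + 8 * (n))
#define ARG_HI(n) (ARG_LO(n) + 4)

struct sock_filter mmap_filter[] = {
    /*  0 */ BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
    /*  1: not mmap -> skip 9 entries to ALLOW (index 11) */
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_mmap, 0, 9),
    /*  2: prot (argument 2), high half must be zero */
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS, ARG_HI(2)),
    /*  3 */ BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, 0, 0, 6),
    /*  4: prot must be exactly PROT_READ | PROT_WRITE */
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS, ARG_LO(2)),
    /*  5 */ BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, PROT_READ | PROT_WRITE, 0, 4),
    /*  6: flags (argument 3), high half must be zero */
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS, ARG_HI(3)),
    /*  7 */ BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, 0, 0, 2),
    /*  8: flags must be exactly MAP_PRIVATE | MAP_ANONYMOUS */
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS, ARG_LO(3)),
    /*  9 */ BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, MAP_PRIVATE | MAP_ANONYMOUS, 1, 0),
    /* 10: any failed check lands here */
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ERRNO | (SECCOMP_RET_DATA & EPERM)),
    /* 11 */ BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
};

/* Install the mmap argument filter; returns 0 on success, -1 on failure.
   Call this before installing the main syscall allowlist. */
int install_mmap_filter(void)
{
    struct sock_fprog prog = {
        .len = sizeof mmap_filter / sizeof mmap_filter[0],
        .filter = mmap_filter,
    };
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0))
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog, 0, 0);
}
```

An mremap() check could be written the same way against its flags argument. Keeping the argument checks in their own small filter means the skip counts of the main allowlist do not have to change.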
The BPF_JUMP(BPF_JMP | BPF_K | BPF_JEQ, value, trueskip, falseskip) macro compares the value loaded by a previous BPF_STMT(BPF_LD, ...) to value. If the two match, the following trueskip entries are skipped; otherwise, the following falseskip entries are skipped.
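To make the skip semantics concrete, here is a minimal user-space sketch that walks a filter array by hand. It is not how the kernel runs filters; run_filter and demo_verdict are names invented for this example, and it only understands the three instruction forms used in the filter above (32-bit absolute load, JEQ against a constant, and RET of a constant).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <errno.h>
#include <sys/syscall.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

/* Evaluate a filter against one seccomp_data record and return its verdict.
   Anything unsupported or out of bounds yields SECCOMP_RET_KILL_PROCESS. */
uint32_t run_filter(const struct sock_filter *f, size_t len,
                    const struct seccomp_data *data)
{
    uint32_t acc = 0;
    size_t pc = 0;

    while (pc < len) {
        const struct sock_filter *insn = &f[pc++];  /* pc is now the next entry */

        switch (insn->code) {
        case BPF_LD | BPF_W | BPF_ABS:
            if (insn->k > sizeof *data - sizeof acc)
                return SECCOMP_RET_KILL_PROCESS;
            memcpy(&acc, (const unsigned char *)data + insn->k, sizeof acc);
            break;
        case BPF_JMP | BPF_JEQ | BPF_K:
            /* Skip trueskip entries on a match, falseskip entries otherwise. */
            pc += (acc == insn->k) ? insn->jt : insn->jf;
            break;
        case BPF_RET | BPF_K:
            return insn->k;
        default:
            return SECCOMP_RET_KILL_PROCESS;
        }
    }
    return SECCOMP_RET_KILL_PROCESS;
}

/* Verdict of a two-syscall demo allowlist (read and write) for syscall nr. */
uint32_t demo_verdict(int nr)
{
    struct sock_filter demo[] = {
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_read, 2, 0),  /* skip 2 -> allow */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_write, 1, 0), /* skip 1 -> allow */
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ERRNO | (SECCOMP_RET_DATA & ENOSYS)),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct seccomp_data data = { .nr = nr };

    return run_filter(demo, sizeof demo / sizeof demo[0], &data);
}
```

With this, demo_verdict(__NR_read) and demo_verdict(__NR_write) both reach the final SECCOMP_RET_ALLOW entry, while any other syscall number falls through every comparison into the ENOSYS entry. The same walker can be pointed at a real filter array to sanity-check skip counts without installing anything.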
Maintaining the seccomp filter in this form is very error-prone (especially the skip counts!), so one might prefer to build the filter automatically from a more human-friendly description. For a larger number of allowed syscalls, one might implement a binary search for speed. In any case, I warmly recommend implementing a unit test (one that expects the ENOSYS error from unsupported syscalls) that verifies all allowed syscalls and tests a few of the ones you definitely don't want to support, and running it every time one modifies or even recompiles the filter.
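One possible way to generate the filter, sketched under the assumption that the "human-friendly description" is simply an array of syscall numbers. The function name build_allowlist_filter and its parameters are hypothetical. It emits the same layout as the hand-written filter (architecture check, load of the syscall number, one comparison per syscall, deny, allow) and computes each skip count from the position in the list.

```c
#include <stddef.h>
#include <errno.h>
#include <sys/syscall.h>
#include <linux/audit.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

/* Write an allowlist filter for the n syscall numbers in allowed[] into out[].
   arch is the AUDIT_ARCH_* value to accept. Returns the number of entries
   written, or 0 if out[] is too small or the list does not fit the 8-bit
   skip counts. */
size_t build_allowlist_filter(struct sock_filter *out, size_t cap,
                              unsigned int arch, const int *allowed, size_t n)
{
    size_t i = 0;

    if (n == 0 || n > 255 || cap < n + 6)
        return 0;

    /* Kill the process unless the syscall uses the expected architecture. */
    out[i++] = (struct sock_filter)BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
                                            offsetof(struct seccomp_data, arch));
    out[i++] = (struct sock_filter)BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, arch, 1, 0);
    out[i++] = (struct sock_filter)BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL_PROCESS);

    /* Load the syscall number, then compare it against each allowed one. */
    out[i++] = (struct sock_filter)BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
                                            offsetof(struct seccomp_data, nr));
    for (size_t s = 0; s < n; s++) {
        /* On a match, skip the remaining comparisons plus the deny entry. */
        out[i++] = (struct sock_filter)BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K,
                                                (unsigned int)allowed[s],
                                                (unsigned char)(n - s), 0);
    }

    /* No match: deny with ENOSYS. Match: allow. */
    out[i++] = (struct sock_filter)BPF_STMT(BPF_RET | BPF_K,
                   SECCOMP_RET_ERRNO | (SECCOMP_RET_DATA & ENOSYS));
    out[i++] = (struct sock_filter)BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW);
    return i;
}
```

For example, calling it with { __NR_read, __NR_write, __NR_exit_group } yields nine entries whose three comparisons carry skip counts 3, 2 and 1. The result can be passed to prctl() through a struct sock_fprog exactly like the hand-written array, so adding or removing a syscall no longer means renumbering by hand.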
CodePudding user response:
After compiling the contestant's code to an unlinked object module, use "objdump" to print its unresolved references. Check this against your list of allowed things (functions, variables, ...) with a small script.
You should read objdump's documentation, but the option "-r", which prints the relocation entries, might be a good start.
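The "small script" could look roughly like the following sketch, written in C to match the rest of the thread. Note that it swaps in the output of "nm -u" rather than "objdump -r", because nm -u prints one undefined symbol per line with the name as the last field, which is simpler to parse. The allowlist contents and the names symbol_allowed and count_disallowed are placeholders for illustration only.

```c
#include <stdio.h>
#include <string.h>

/* Placeholder allowlist; a real grader would list every permitted symbol. */
static const char *const allowed_symbols[] = {
    "malloc", "free", "memcpy", "strlen", "printf", "puts", "scanf",
};

/* Return 1 if name is on the allowlist, 0 otherwise. */
int symbol_allowed(const char *name)
{
    size_t count = sizeof allowed_symbols / sizeof allowed_symbols[0];

    for (size_t i = 0; i < count; i++) {
        if (strcmp(name, allowed_symbols[i]) == 0)
            return 1;
    }
    return 0;
}

/* Read "nm -u" style lines from in, report every symbol that is not on the
   allowlist to stderr, and return how many were found. Overlong lines are
   counted as disallowed so that the check fails closed. */
int count_disallowed(FILE *in)
{
    char line[512];
    int bad = 0;

    while (fgets(line, sizeof line, in)) {
        size_t len = strcspn(line, "\r\n");

        if (line[len] == '\0' && !feof(in)) {
            fprintf(stderr, "disallowed: overlong line\n");
            bad++;
            continue;
        }
        line[len] = '\0';

        /* The symbol name is the last space-separated field on the line. */
        char *name = strrchr(line, ' ');
        name = name ? name + 1 : line;
        if (*name == '\0')
            continue;

        if (!symbol_allowed(name)) {
            fprintf(stderr, "disallowed: %s\n", name);
            bad++;
        }
    }
    return bad;
}
```

A main() that returns count_disallowed(stdin) != 0 would let the grader run something like "nm -u submission.o | ./checksyms" (both file names hypothetical) and reject the submission on a nonzero exit status.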