Linux sandbox
I am trying to build an evaluation system for algorithmic problems. The user has to write a program that reads its input data from standard input and prints the solution to standard output.
The goal is to solve the given problem within some predefined memory and time constraints.
When users think they have solved a problem, they submit their source code for evaluation.
There are two problems when evaluating a user's source code:
- The compiled source code must be executed in isolation (jailing the application), so that the executable can't harm the host (the operating system that runs the user's process). A process could harm the host in several ways: use all available memory or disk space, exceed the maximum number of threads or processes on the system (or exhaust other kernel structures), run for a very long time (infinite loop), or access files it isn't supposed to (such as a solution file).
- Measuring the resources that the application used during its execution, such as time and memory (heap, stack).
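To make these two problems concrete, here is a minimal sketch of how a judge could cap a child process with POSIX resource limits and read back what it used. It is not a jail by itself (it hides no files and no processes), and the function name `run_limited`, its parameters and the default values are my own assumptions for illustration.

```python
import os
import resource
import subprocess
import tempfile

def run_limited(argv, input_bytes=b"", cpu_seconds=2,
                memory_bytes=512 * 1024 * 1024, output_bytes=1024 * 1024):
    """Run argv with rlimits applied; return exit code, output and usage."""

    def apply_limits():
        # Runs in the child after fork() and before exec().
        # Soft CPU limit -> SIGXCPU, hard limit one second later -> SIGKILL.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds + 1))
        # Cap the address space (heap, stack and mappings together).
        resource.setrlimit(resource.RLIMIT_AS, (memory_bytes, memory_bytes))
        # Cap the size of any file the child writes, including its stdout file.
        resource.setrlimit(resource.RLIMIT_FSIZE, (output_bytes, output_bytes))

    # Plain files instead of pipes, so a chatty child cannot deadlock us.
    with tempfile.TemporaryFile() as stdin_file, \
            tempfile.TemporaryFile() as stdout_file:
        stdin_file.write(input_bytes)
        stdin_file.seek(0)
        proc = subprocess.Popen(argv, stdin=stdin_file, stdout=stdout_file,
                                stderr=subprocess.DEVNULL,
                                preexec_fn=apply_limits)
        # wait4() reaps the child and returns its resource usage in one call.
        _, status, usage = os.wait4(proc.pid, 0)
        # Negative exit codes mean "killed by that signal number".
        proc.returncode = os.waitstatus_to_exitcode(status)
        stdout_file.seek(0)
        output = stdout_file.read()

    return {
        "exit_code": proc.returncode,
        "output": output,
        "cpu_seconds": usage.ru_utime + usage.ru_stime,
        "max_rss_kb": usage.ru_maxrss,  # kilobytes on Linux
    }
```

Calling `run_limited(["./solution"], input_bytes=test_input)` (both names hypothetical) would return the measurements for one test case. Note that RLIMIT_CPU counts CPU time only, so a process that just sleeps is not stopped by this sketch; a real judge would add a wall-clock timeout as well.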
This post concentrates on two approaches to the first problem - making a jail.
- Monitor every system call a process makes using the ptrace() system call. It lets a process called the tracer be notified before and after every system call its tracee makes. At each of these stops the tracer can decide whether to allow the call or not.
The problem is deciding which system calls are allowed, because this list changes during the tracee's execution. The startup sequence of system calls contains dangerous calls that must be disabled once the process is completely loaded into memory.
ptrace() also lets the tracer inspect every signal the tracee receives and read the tracee's memory. These two features will probably not be used in this approach.
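One way to picture a list that changes during execution is to give each system call a quota. The sketch below models only the tracer's decision logic, not ptrace itself: it assumes the tracer has already translated each stop into a system call name. The class, the helper functions and the concrete quotas are hypothetical; a real judge would have to tune them per language and C library.

```python
UNLIMITED = None  # marker for calls that are always allowed

class SyscallPolicy:
    """Allow-list in which some calls are only allowed a limited number of times."""

    def __init__(self, quotas):
        # Maps system call name -> remaining uses (or UNLIMITED).
        self.remaining = dict(quotas)

    def allows(self, name):
        if name not in self.remaining:
            return False  # not on the list at all
        left = self.remaining[name]
        if left is UNLIMITED:
            return True
        if left == 0:
            return False  # quota used up, e.g. a second execve
        self.remaining[name] = left - 1
        return True

def example_policy():
    # Hypothetical numbers: one execve to load the program, a few openat
    # calls for the loader, and unlimited basic I/O and memory management.
    return SyscallPolicy({
        "execve": 1,
        "openat": 8,
        "brk": UNLIMITED,
        "mmap": UNLIMITED,
        "munmap": UNLIMITED,
        "read": UNLIMITED,
        "write": UNLIMITED,
        "exit_group": UNLIMITED,
    })

def first_violation(policy, syscall_names):
    """Return (index, name) of the first rejected call, or None if all pass."""
    for index, name in enumerate(syscall_names):
        if not policy.allows(name):
            return index, name
    return None
```

With this policy the first execve (the one that loads the submitted program) passes, while any later execve, or any call that is not listed, is the point where the tracer would kill the tracee.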
This approach works well for languages that are compiled to machine code, because the resulting programs are executed directly. Java and Python (and many others) are problematic, because their programs are executed through a different executable (a virtual machine or an interpreter). That executable would have to be traced, which is not practical, because it often does forbidden things by itself, like reading files or creating multiple threads. We wouldn't be able to determine whether the user's code did this or the virtual machine itself.
There are workarounds for this problem, like compiling Java to machine code with gcj, or Python with various Python-to-C/C++ compilers (Nuitka, Cython).
Thus, the main concern is the difficulty of adding support for another language.
- Put the application in a container that has no means of interacting with the rest of the system. This is accomplished by using separate namespaces for process IDs, network, IPC structures (semaphores, shared memory...), file system mounts, UTS (system name - uname), and user and group IDs. If a process can't reference anything outside of the jail, then it isn't able to do any harm.
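A quick way to experiment with these namespaces is the unshare(1) tool from util-linux. The sketch below only builds and runs such a command line; the function names are my own, and it assumes that unshare is installed and that the kernel allows unprivileged user namespaces.

```python
import shutil
import subprocess

# One flag per namespace mentioned above, plus helpers they need.
NAMESPACE_FLAGS = [
    "--user", "--map-root-user",  # new user namespace, caller mapped to root inside it
    "--pid", "--fork",            # new PID namespace; fork so the child becomes PID 1
    "--mount-proc",               # fresh /proc that matches the new PID namespace
    "--net",                      # empty network namespace
    "--ipc",                      # private semaphores and shared memory
    "--uts",                      # private host name
    "--mount",                    # private mount table
]

def jailed_argv(argv):
    """Prefix argv with an unshare invocation that isolates it in new namespaces."""
    return ["unshare", *NAMESPACE_FLAGS, "--", *argv]

def run_jailed(argv):
    """Run argv inside the new namespaces and capture its output."""
    if shutil.which("unshare") is None:
        raise RuntimeError("unshare(1) from util-linux is not installed")
    return subprocess.run(jailed_argv(argv), capture_output=True)
```

A new mount namespace still starts with a copy of the host's mounts, so on its own this does not hide the host's files; a real jail would also switch to a minimal root file system (chroot or pivot_root).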
Cgroups are a way of limiting the resources of a process group.
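As a rough illustration, with cgroup v2 a group is just a directory whose files hold the limits. The sketch assumes the unified hierarchy is mounted at /sys/fs/cgroup, that the memory and pids controllers are enabled for the parent group, and that the caller may write there (usually root). The function names and the `root` parameter are hypothetical.

```python
from pathlib import Path

def create_limited_cgroup(name, memory_bytes, max_pids, root="/sys/fs/cgroup"):
    """Create a cgroup v2 group and set its memory and process-count limits."""
    group = Path(root) / name
    group.mkdir(exist_ok=True)
    # memory.max: hard limit on the memory used by the whole group, in bytes.
    (group / "memory.max").write_text(f"{memory_bytes}\n")
    # pids.max: maximum number of processes and threads in the group.
    (group / "pids.max").write_text(f"{max_pids}\n")
    return group

def add_process(group, pid):
    """Move a process into the group; its future children start there too."""
    (Path(group) / "cgroup.procs").write_text(f"{pid}\n")
```

Unlike per-process resource limits, these limits apply to the group as a whole, so a submission cannot get around them by forking many small processes.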
These two mechanisms are used by various tools that implement application jailing; one of these tools is LXC.
The approach with LXC is more flexible, because it doesn't need to restrict what applications do inside the container: they can't do any harm to the outside world, at most crash the jail they are in. Since the applications aren't restricted, they can be executed through a support program, like the Java virtual machine or the Python interpreter, which wasn't possible in the first approach.
The second approach, using namespace separation and cgroups, is the better way to go, because it is more extensible than the first approach using ptrace, primarily in terms of adding support for another language.
The Moe contest environment is an existing solution to this problem. It has two modes of isolation: the old one uses ptrace, and the new one uses Linux namespaces. LXC is also built on top of Linux namespaces, which makes these two approaches similar.