From: ben@VALINUX.COM
Subject: execve bug linux-2.2.12

While doing some debugging, I discovered a really nasty stack smash bug in linux-2.2.12. I haven't checked previous versions of the 2.2 kernel, but the bug appears to be fixed in linux-2.2.13pre17.

If I am reading this correctly, the implications of this bug could be very dire. It may be possible to easily obtain root privilege on any box running this kernel.

Basically the problem is that the execve system call checks that argv is a valid pointer but it doesn't check that all of the pointers in the argv array are valid pointers. If you pass bad pointers into the execve system call you can corrupt the process's stack before it returns to user space. Then, when the kernel hands the process off to the ELF loader code, which begins to set up the process, it can be made to execute some malicious code in place of the program's main function.

This is particularly scary because all of this occurs BEFORE the program begins executing its main function and AFTER the program returns to user space with privilege. Therefore, no matter how well audited the program may be, it can be used to gain privilege.

The thing that tipped me off to the problem was that a program that I exec'd was getting killed with SIGSEGV in __libc_start_main before my main function began running.

-ben

Per popular demand, here is some more information on the bug I've been observing. I'm sorry; I wish I had thought to include this in my original post.

Here is one ltrace fragment where my program only corrupts one of the parameters:

    [pid 578] execv("/bin/grep", 0x7ffffcdc <unfinished ...>
    [pid 578] __libc_start_main(0x0804a4e0, 200, 0x7fffb3a4, 0x08048bf4, 0x080516dc <unfinished ...>
    [pid 578] --- SIGSEGV (Segmentation fault) ---
    [pid 578] +++ killed by SIGSEGV +++
    --- SIGCHLD (Child exited) ---

Here is some information from gdb:

    (gdb) core-file /tmp/core
    Core was generated by
    Program terminated with signal 11, Segmentation fault.
    Reading symbols from /lib/libc.so.6...done.
    Reading symbols from /lib/ld-linux.so.2...done.
    #0  0x2aae60f6 in getenv (name=0x2aba8562 "LLOC_TRIM_THRESHOLD_") at ../sysdeps/generic/getenv.c:88
    ../sysdeps/generic/getenv.c:88: No such file or directory.
    (gdb) bt
    #0  0x2aae60f6 in getenv (name=0x2aba8562 "LLOC_TRIM_THRESHOLD_") at ../sysdeps/generic/getenv.c:88
    #1  0x2aae689b in __secure_getenv (name=0x2aba8560 "MALLOC_TRIM_THRESHOLD_") at secure-getenv.c:29
    #2  0x2ab1e2e0 in ptmalloc_init () at malloc.c:1689
    #3  0x2aade211 in __libc_preinit (argc=200, argv=0x7fffb3a4, envp=0x7fffb6c8) at set-init.c:26
    #4  0x2aade030 in __libc_start_main (main=0x804a4e0 <strcpy+5500>, argc=200, argv=0x7fffb3a4, init=0x8048bf4, fini=0x80516dc <strcpy+34680>, rtld_fini=0x2aab5ad4 <_dl_fini>, stack_end=0x7fffb39c) at ../sysdeps/generic/libc-start.c:68
    (gdb)

This was just one run. There were other runs where more interesting things happened. There was one in particular where the pointer to init was corrupted, but I haven't been able to reproduce that one yet.

I put the source code for the program I was debugging when I stumbled into this at: "ftp://ftp.bastille-linux.org/bastille/broken-fuzz.c.gz".

Note: this is not a working program!!! Do not take this as a release. I have since fixed many bugs in it. I coded it up and was in the process of making it work for the first time when I stumbled across this problem. In its current form, its only purpose is to demonstrate the problem that I saw.

To trigger the problem, simply run the program with the -ba option and the name of your favorite executable, e.g. "./fuzz -ba grep".

-ben

To: BugTraq
Subject: Re: execve bug linux-2.2.12
Date: Fri Oct 15 1999 19:20:14
Author: visi0n

Whoa, I think kernel 2.0.38 has the same bug, and one more: the count() function, which checks how many argv's the binary has, doesn't check for a maximum number of argv's. This is worse than the bug found in 2.2.12 execve().
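
For illustration only: the kind of upper bound visi0n describes as missing could look like the user-space sketch below. This is not the kernel's count() from 2.0.38 or 2.2.12; the name count_bounded and the max parameter are hypothetical, and the sketch only shows a NULL-terminated pointer array being walked with a limit.

```c
#include <stddef.h>

/* Hypothetical helper, NOT the kernel's count(): walk a NULL-terminated
 * array of string pointers and give up once more than `max` entries
 * have been seen. Returns the entry count, or -1 if the limit is exceeded. */
static int count_bounded(char *const *vec, int max)
{
    int n = 0;

    if (vec == NULL)
        return 0;
    while (vec[n] != NULL) {
        if (++n > max)
            return -1;   /* too many entries: refuse instead of walking on */
    }
    return n;
}
```

Called as count_bounded(argv, 1024), this would return the number of arguments, or -1 once more than 1024 entries have been seen.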

To: BugTraq
Subject: Re: execve bug linux-2.2.12
Date: Sat Oct 16 1999 07:22:02
Author: Alan Cox

> Basically the problem is that the execve system call checks that argv
> is a valid pointer but it doesn't check that all of the pointers in
> argv array are valid pointers. If you pass bad pointers into the

This is incorrect. To start with, it builds the argv pointer array itself. The passed array is simply used to get a list of strings and to build them on the stack of the target process. The argv and envp arrays are then built by the ELF loader, which walks these tables to generate the arrays that the SYS5 ABI expects to be passed (in saner ABIs the user-space startup code builds argc/argv).

> execve system call you can corrupt the processes stack before it
> returns to user space. Then when the kernel hands off the process to

I don't think you can. The built ELF stack looks roughly like

    [Environment]   - null terminated string data
    [Arguments]     - null terminated string data
    [Elf gloop]
    [envp]
    [argv]
    [argc]          -> You are here on entry

so the stack is fine.

> The thing that tipped me off to the problem was that a program that I
> exec'd was getting killed with SIGSEGV in __libc_start_main before my
> main function began running.

I would certainly be interested in an example that caused this. That there could be a bug in the kernel or glibc exec building I can believe. Your diagnosis of the cause, however, is dubious.

Alan

To: BugTraq
Subject: Re: execve bug linux-2.2.12
Date: Sat Oct 16 1999 14:13:19
Author: security@xirr.com

Caveat: I am running linux-2.2.12ow6, which contains many security fixes, yet I believe my comments are still valid. Also, I am not a kernel guru.

> Basically the problem is that the execve system call
> checks that argv is a valid pointer but it doesn't check
> that all of the pointers in argv array are valid pointers.

The kernel copies each argv[i] into a contiguous chunk of the (soon to be) stack. Thus it must dereference each argv[i].
Check out linux/fs/exec.c line 261 for an almost explicit dereference of argv[i] (memcpy(str,argv+i), except it is the kernel to user space version). This is confirmed by a small test program:

    #include "nolibc.h"

    int main(int argc, char **argv, char **envp)
    {
        int i;
        char buf[32];

        argv[1] = (char *)2;    /* deliberately invalid pointer */
        i = execve("/bin/sh", argv, envp);
        /* we should never reach this point, but print out errno in hexadecimal */
        i = htonl(i);
        i = itoh(&i, buf);
        buf[i] = '\n';
        write(1, buf, i + 1);
    }

This program does not run /bin/sh but instead prints out the message 0000000e, representing errno=14, EFAULT. This means the kernel got a segfault while copying the argv[i]'s to the stack, and thus failed the syscall.

This program is linked with 'gcc -O -fno-builtin -nostdlib test.c'. nolibc.h is ugly but available by request under GPL. It defines htonl, itoh, write, execve, and _start. Note that execve, htonl, itoh, and write are macros. Execve/write are direct system calls. (itoh converts 4 bytes to an 8-byte hex representation and returns 8; htonl byte swaps so the bytes come out in the right order.)

> The thing that tipped me off to the problem was that a
> program that I exec'd was getting killed with SIGSEGV
> in __libc_start_main before my
> main function began running.

I'm not really sure if this is a widespread problem, but ANYTIME libc gets hosed (malloc(-1), for example) gdb reports the problem occurring in a function called from __libc_start_main and does not ever mention main.

I'll study this a wee bit more, since the references I'm using for the startup state don't seem to jibe with my experience. (Namely, I never see an array of pointers being set up in the docs, and my programs definitely do not do so, yet they function and dereference argv as if it were an array of pointers.)

Another remark: if I misunderstood the bug (argv[1]=2 obviously is not valid, and may not be what you meant), please let me know.
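
For readers without nolibc.h, roughly the same check can be sketched against an ordinary libc. This is an approximation, not the program above: the helper name exec_with_bad_arg is hypothetical, and it assumes /bin/sh exists on the test system. On a kernel that validates each argument string, execve() is expected to fail and leave EFAULT in errno, the same condition the 0000000e output above reports.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: try to exec `path` with a deliberately invalid
 * argv[1]. Returns the errno left behind by the failed execve().
 * If the exec succeeds, this function never returns. */
static int exec_with_bad_arg(const char *path)
{
    char *args[3];

    args[0] = (char *)path;
    args[1] = (char *)2;    /* same bogus pointer value as argv[1]=2 above */
    args[2] = NULL;

    errno = 0;
    execve(path, args, environ);
    return errno;
}
```

A caller could print strerror(exec_with_bad_arg("/bin/sh")) to see the failure reason in text form.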

Author: Matt Chapman

On Sat, Oct 16, 1999 at 02:22:02PM +0100, Alan Cox wrote:
>
> I would certainly be interested in an example that caused this.

    #include <errno.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    #define BADPTR (char *)0x10 /* for example */

    int main(int argc, char **argv, char **envp)
    {
        char *args[7];
        int i;

        args[0] = "su";
        for (i = 1; i < 6; i++)
        {
            args[i] = BADPTR;
        }
        args[6] = NULL;

        execve("/bin/su", args, envp);
        /* only reached if execve fails */
        printf("%s\n", strerror(errno));
        return 1;
    }

This program reproducibly dies with SIGSEGV on 2.2.12 (on my system, at least 5 bad arguments are needed). A similarly configured system with kernel 2.0.36 correctly reports EFAULT.

This would not normally be a problem; however, the above program will not dump core for an ordinary user, only for root, which makes me believe that the fault occurs after the process has gained the root euid from /bin/su.

A gdb trace suggests the usual heap corruption in glibc, which does not seem to be related to the arguments passed to execve (as long as they are bad), so I doubt this is exploitable. However, it is most likely a bug somewhere.

Matt

--
Matthew "Austin" Chapman
SysAdmin, Developer, Samba Team Member
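
The two outcomes reported in this thread, execve() failing cleanly with EFAULT versus the exec'd process dying with SIGSEGV, can be told apart from a parent process. The sketch below is one way to do that, not something any poster used: run_and_classify is a hypothetical name, and it relies on the common technique of a close-on-exec pipe so the child can report errno only when the exec itself fails.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper. Forks, execs `path` with `args`, and returns:
 *   > 0  an errno value: execve() failed in the child (or pipe/fork failed)
 *   < 0  the negated signal number that killed the child after the exec
 *   0    the exec succeeded and the child was not killed by a signal */
static int run_and_classify(const char *path, char *const args[])
{
    int pfd[2];
    int err = 0;
    int status = 0;
    pid_t pid;

    if (pipe(pfd) == -1)
        return errno;
    /* The write end closes by itself if the exec succeeds. */
    if (fcntl(pfd[1], F_SETFD, FD_CLOEXEC) == -1) {
        err = errno;
        close(pfd[0]);
        close(pfd[1]);
        return err;
    }

    pid = fork();
    if (pid == -1) {
        err = errno;
        close(pfd[0]);
        close(pfd[1]);
        return err;
    }
    if (pid == 0) {
        close(pfd[0]);
        execve(path, args, environ);
        err = errno;                    /* only reached if execve failed */
        if (write(pfd[1], &err, sizeof err) != (ssize_t)sizeof err)
            _exit(126);
        _exit(127);
    }

    close(pfd[1]);
    /* End-of-file with no data means the exec went through. */
    if (read(pfd[0], &err, sizeof err) != (ssize_t)sizeof err)
        err = 0;
    close(pfd[0]);

    while (waitpid(pid, &status, 0) == -1 && errno == EINTR)
        ;
    if (err != 0)
        return err;
    if (WIFSIGNALED(status))
        return -WTERMSIG(status);
    return 0;
}
```

Passing the args array from the program above, a return value of EFAULT would correspond to the behaviour reported for 2.0.36, and a negative return equal to -SIGSEGV to the behaviour reported for 2.2.12.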