%deffont "thick" xfont "helvetica-bold-r", tfont "comicbd.ttf" %deffont "typewriter" xfont "courier-medium-r", tfont "comicbd.ttf" %deffont "standard" xfont "helvetica-medium-r", tfont "comic.ttf", tmfont "wadalab-gothic.ttf" %%include "default.mgp" %default 1 leftfill, size 2, fore "blue", back "white", font "thick" %default 2 size 4, vgap 10, prefix " ", center %default 3 size 2, bar "gray70", vgap 4, left %default 4 size 3, fore "blue", vgap 30, prefix " ", font "standard" %tab 1 size 3, vgap 40, prefix " ", icon arc "red" 40 %tab 2 size 3, vgap 40, prefix " ", icon delta3 "yellow" 40 %tab 3 size 3, vgap 40, prefix " ", icon delta3 "purple" 40 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %page %nodefault %fore "red", size 3, font "standard", back "white" %%xfont "times" %center, fore "blue", font "standard", size 3 Advances in OpenBSD %center %size 3, fore "red" Theo de Raadt deraadt@openbsd.org The OpenBSD Project %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %page The DARPA grant DARPA gave us a grant about 18 months ago, 6 months to go $2.3 million USD or so... We were going to do this security work anyway.. but this grant accelerates our efforts significantly DARPA does not tell us what to do, or get something extra Thanks to Jonathan M. Smith (upenn) and Angelos Keromytis (columbia) for negotiating this Pays for stuff 6 fulltime hackers... some hardware a 40+ person week-long hackathon a year Slaves to implement my ideas! %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %page Calgary "c2k2" Hackathon Had 42 developers in Calgary for a week last June Hacking space at downtown Marriott, funded by DARPA CHATS grant 55 people in my house for a Moose BBQ/Hack night VERY productive VERY little sleep Slogan: "Shut up and Hack" Next hackathon in Calgary in May: 60 developers coming %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %page Hacking [picture from hackathon removed] %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %page Hacking (more) [picture from hackathon moose-k-bob bbq party removed] %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %page What have we done with the DARPA grant money? A subset of what we have done: Funded OpenSSL audit Setuid culling Privilege Seperation Privilege Revocation "Killing Buffer Overflows" Lots of packet filter gunk %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %page Funded OpenSSL audit Gave a part of our grant to some OpenSSL people In particular, via Ben Laurie and someone he contracted A few things found Some of the results hit BUGTRAQ a few months back More to be found Not a very efficient process they are using %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %page Setuid Culling We used to ship with approx 40 setuid binaries We now ship with about 8 of consequence that means "they still scare me a bit" chfn, login, passwd, rsh, su, sudo, lockspool, authpf Most of these are about 50 - 300 lines of code This was done by using more groups new user id's a fair amount of code rewrite We also managed to get rid of numerous setgid programs %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %page Users and Groups New passwd and group entries for revocation/seperation. We attempt to minimize uid/gid reuse. _portmap:*:28:28::0:0:portmap:/var/empty:/sbin/nologin _identd:*:29:29::0:0:identd:/var/empty:/sbin/nologin _rstatd:*:30:30::0:0:rpc.rstatd:/var/empty:/sbin/nologin _rusersd:*:32:32::0:0:rpc.rusersd:/var/empty:/sbin/nologin _fingerd:*:33:33::0:0:fingerd:/var/empty:/sbin/nologin _x11:*:35:35::0:0:X server:/var/empty:/sbin/nologin _spamd:*:62:62::0:0:Spam daemon:/var/empty:/sbin/nologin _portmap:*:28: _identd:*:29: _rstatd:*:30: _rusersd:*:32: _fingerd:*:33: _sshagnt:*:34: _x11:*:35: _lkm:*:61: _spamd:*:62: _radius:*:63: _token:*:64: _shadow:*:65: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %page Privilege Seperation Setuid programs or daemons Setup objects that require privilege SOCK_RAW, pty, reserved port Fork process Jail one process -- the complicated one Other process remains root, stays outside The two communicate over a socket Includes: sshd X server (and xdm) xconsole We are investigating seperation in httpd, ftpd, isakmpd %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %page Privilege Recovation Setuid programs or daemons Open some object that requires privilege read-only fd of utmp, reserved port, SOCK_RAW, special open of /dev/pf chroot if possible Revoke uid and gid Includes: ping, ping6, traceroute, traceroute6 write rwalld pppd spamd authpf portmap, rpc.rusersd, rpc.rstatd ftpd (improved) named (improved) httpd All these programs "jail themselves" %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %page httpd changes By default, httpd jails inside in /var/www -u flag indicates "turn off this feature" Before self-jailing: modules are loaded Things which need write access (logs) /dev/*random, /dev/log Automatically translates paths inside configuration file Most people do NOT run cgi Most people do NOT run php For those who do not, this is a more secure choice. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %page "Killing Buffer Overflows" By combining 5 technologies, we can make buffer overflows basically unexploitable A number of components work together stack-gap randomization ProPolice .rodata PROT_EXEC purity W ^ X %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %page Stack-gap Randomization Place a randomly sized gap on top of the stack More significant for non-forking processes "mod 1024" at the moment -- will be much more later Tiny waste of memory.. insignificant overhead %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %page Discussion on buffer overflow prevention issues What an attacker is looking for: stack overflows data segment overflows GOT/PLT overwrite jumping to data that he can execute 4 byte modification possibility 4 byte read possibility? perhaps can still find system code to jump to We want to disrupt as many of these as possible %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %page ProPolice Stackguard on steroids gcc modification: machine-independent (well, kind of...) Function prologue puts random canary on stack next to return address Function epilogue checks random canary for modification Rearranges stack to put buffers closer to return address so flags and pointers are lower, harder to hit so overflows are more likely to hit canary If canary is modified, alert & kill process (Will not work on hppa yet) Very low overhead. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %page .rodata Readonly strings and pointers were stored in .text segment But this is ELF, not a.out So move them out to their own segment These are no longer PROT_EXEC now Attacker cannot look for instruction sequences to use in static data. But.. we need PROT_EXEC to work. No overhead. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %page PROT_EXEC purity POSIX has PROT_READ, PROT_WRITE, PROT_EXEC In most systems, PROT_READ implies PROT_EXEC BUT... we are permitted to do better Needs hardware support in the MMU! %pause Perfect: sparc (sun4m), sparc64, alpha, hppa, (hammer in 64 bit mode) %pause Impossible: m68k, vax, mips %pause Very difficult... later work: m88k %pause Horrific hacks: i386, powerpc No real overhead. Making these changes sped up some architectures %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %page W ^ X Shorthand for: PROT_WRITE xor PROT_EXEC Goal: In regular process, ensure no page is mapped both PROT_EXEC and PROT_WRITE simultaneously Handle GOT/PLT exec on some... not on others.. ld.so must know about modifying those (and signal blocking) kernel must map ld.so more carefully Making this change took a lot of time Not violating POSIX %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %page i386 PROT_EXEC best-effort i386 lacks per-page X bit Only significant relevant hardware feature: code segment limit 3.2/3.3: points below bottom of stack -> no-exec stack Link each shared object to have 1GB gap between code & data Map all text segments low, all data segments 1GB higher Set code segment limit register to point up to highest PROT_EXEC page (floating CS limit) Normally somewhere below 1GB Will be in 3.4: W^X on i386 Changing CS limit is a bit expensive: slight overhead %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %page i386 PROT_EXEC real X bit Most x86 cpu use a split-TLB in their MMU already PTE has 3 unused bits: normally used by software Our kernel tracks X bit per-page using PT_AVAIL3 Used to determine where CS limit should be (highest PT_AVAIL3 page is included) If these cpu had a mode (PTX), could do X bit code fetch fault: If PTX && (pte & PT_AVAIL3) == 0 -> fault 150 gates, or approx 4 lines of microcode VIA and AMD are receptive, still trying to build contacts at Intel and Transmeta %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %page powerpc PROT_EXEC best-effort powerpc lacks per-page X bit Hardware feature: 16 256MB segments, these have an X bit architecture has a near-access offset limit of 256MB (pic vs PIC) Solution: Make every 2nd segment X or not-X Link each shared object to have 256MB gap between code & data Map code & data segments from program, ld.so, and library over segment boundaries Result: W^X on powerpc Might be in 3.4 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %page Recap of Buffer Overflow Protection Attacker cannot overwrite return address; must guess 32 bit random number Attacker has no memory he can write to, which also permits execution Various data are no longer executable Stack, bss, and data segments are not executable GOT/PLT tables W^X: cannot be modified by attacker ESSENTIALLY NO PERFORMANCE LOSS. Complete coverage on good cpus; hacks for weak ones No violation of POSIX semantics %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %page Future work Randomized mapping for shared objects Try to find a way to map the main program randomly (ELF does not do so) Randomized mmap malloc & mmap that fragment slightly -> insert guard pages guard pages in the data segment? link all non-buffer objects near start of data segment, BEFORE buffers? Any other ideas? Note: We will not violate POSIX, so don't just say PAX and expect me to smile %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %page Finally, the packet filter gunk New features: anchor sub-rules can be loaded by a program in-line table radix-tree list of address/netmask use almost anywhere you specify an address can be manipulated from program queue bandwidth control cut network bandwidth into a chunks pass & block rules can specify a queue %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %page 3.3 release & future plans Will be released May 1 Will contain most of the described changes, except i386 PROT_EXEC best effort powerpc PROT_EXEC best effort Simply not enough time to get those debugged... c2k3 hackathon in Calgary in May: 60 developers DARPA runs out soon, and once it does, donations or other grants would be really nice