Assemblyline

Usr6 · October 22, 2017

"Canada's electronic spy agency says it is taking the 'unprecedented step' of releasing one of its own cyber defence tools to the public, in a bid to help companies and organizations better defend their computers and networks against malicious threats." - http://www.cbc.ca/news/technology/cse-canada-cyber-spy-malware-assemblyline-open-source-1.4361728

Assemblyline

Assemblyline is a scalable distributed file analysis framework. It is designed to process millions of files per day but can also be installed on a single box.

An Assemblyline cluster consists of 3 types of boxes: Core, Datastore and Worker.

Components

Assemblyline Core

The Assemblyline Core server runs all the required components to receive/dispatch tasks to the different workers. It hosts the following processes:

Redis (Queue/Messaging)
FTP (proftpd: File transfer)
Dispatcher (Worker tasking and job completion)
Ingester (High volume task ingestion)
Expiry (Data deletion)
Alerter (Creates alerts when score threshold is met)
UI/API (NGINX, UWSGI, Flask, AngularJS)
Websocket (NGINX, Gunicorn, GEvent)

Assemblyline Datastore

Assemblyline uses Riak as its persistent data storage. Riak is a Key/Value pair datastore with SOLR integration for search. It is fully distributed and horizontally scalable.

Assemblyline Workers

Workers are responsible for processing the given files. Each worker has a hostagent process that starts the different services to be run on the current worker and makes sure that those services behave. The hostagent is also responsible for downloading and running virtual machines for services that are required to run inside of a virtual machine or that only run on Windows.

Assemblyline reference manual

If you want to know more about Assemblyline, you can get a copy of the full reference manual. It can also be found in the assemblyline/manuals directory of your installation.

Getting started

Use as an appliance

An appliance is a full deployment that's self-contained on one box/VM. You can easily deploy an Assemblyline appliance by following the appliance creation documentation.

Install Appliance Documentation

Deploy a production cluster

If you want to scan a massive amount of files then you can deploy Assemblyline as a production cluster. Follow the cluster deployment documentation to do so.

Install Cluster Documentation

Development

You can help us out by creating new services, adding functionality to the infrastructure or fixing bugs that we currently have in the system.

You can follow this documentation to get started with development.

Setup your development desktop

Setting up your development desktop can be done in two easy steps:

Clone the Assemblyline repo
Run the setup script

Clone repo

First, create your Assemblyline working directory:

export ASSEMBLYLINE_DIR=~/git/al
mkdir -p ${ASSEMBLYLINE_DIR}

Then clone the main Assemblyline repo:

cd $ASSEMBLYLINE_DIR
git clone https://bitbucket.org/cse-assemblyline/assemblyline.git -b prod_3.2

Clone other repos

${ASSEMBLYLINE_DIR}/assemblyline/al/run/setup_dev_environment.py

NOTE: The setup script will use the same git remote that you've used to clone the Assemblyline repo

Setup your development VM

After you're done setting up your desktop, you can set up the VM from which you're going to run your personal Assemblyline instance.

Local VM

If you want to use a local VM make sure your desktop is powerful enough to run a VM with 2 cores and 8 GB of memory.

You can install the OS by following this doc: Install Ubuntu Server

(Alternative) Amazon AWS or other cloud providers

Alternatively, you can use a cloud provider like Amazon AWS. We recommend 2 cores and 8 GB of RAM for your Dev VM. In the case of AWS, this is equivalent to an m4.large EC2 node.

Whatever provider and VM size you use, make sure you have a VM with Ubuntu 14.04.3 installed.

Installing the assemblyline code on the dev VM

When you're done installing the OS on your VM, you need to install all Assemblyline components on that VM.

To do so, follow the documentation: Install a Development VM

Finishing setup

Now that the code is synced on your desktop and your Dev VM is installed, you should set up your development UI. Make sure to run the tweaks on your Dev VM to remove the id_rsa keys, so that your desktop drives the code in your VM instead of the git repos.

If you have a copy of PyCharm Pro, you can use the remote python interpreter and remote deployment features to automatically sync code to your Dev VM. Alternatively, you can just manually rsync your code to your Dev VM every time you want to test your changes.
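For the manual rsync route, a small helper along these lines can keep the command consistent between test runs. This is only a sketch: the function name `build_rsync_command` and the remote directory passed to it are hypothetical, and the correct destination depends on where the code lives on your Dev VM. The `al` user is the one used elsewhere in this guide.

```python
from typing import List


def build_rsync_command(local_dir: str, user: str, host: str, remote_dir: str) -> List[str]:
    """Build an rsync command that copies local_dir's contents to the Dev VM.

    remote_dir is a placeholder: use whatever path your Dev VM install expects.
    """
    # A trailing slash tells rsync to copy the directory's contents,
    # not the directory itself.
    source = local_dir.rstrip("/") + "/"
    destination = f"{user}@{host}:{remote_dir}"
    # -a preserves permissions and timestamps, -z compresses during transfer.
    return ["rsync", "-az", source, destination]


if __name__ == "__main__":
    # Hypothetical host and remote path, for illustration only.
    print(" ".join(build_rsync_command("~/git/al", "al", "192.0.2.10", "/path/on/vm")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once the paths are confirmed for your setup.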

Setting up PyCharm

Open PyCharm and open your project: ~/git/al (or ASSEMBLYLINE_DIR if you changed the directory)

PyCharm will tell you there are unregistered git repos. Click the 'add roots' button and add the unregistered repos.

Remote interpreter (pro only)

If you have the PyCharm Pro version you can set up the remote interpreter:

file -> settings
Project: al -> Project Interpreter

Cog -> Add Remote

SSH Credentials
host: ip/domain of your VM
user: al
authtype: pass or keypair if AWS
password: whatever password you picked in the create_deployment script

click ok

NOTE: Leave the settings page open for remote deployments. At this point you should be done with your remote interpreter. Whenever you click the play or debug button, it should run the code on the remote Dev VM.

Remote Deployment (PyCharm Pro only)

Still in the settings page:

Build, Execution, Deployment -> Deployment

Plus button
Name: assemblyline dev_vm
Type: SFTP

click OK

# In the connection tab
SFTP host: ip/domain of your VM
User name: al
authtype: pass or keypair if AWS
password: whatever password you picked in the create_deployment script

Click autodetect button

Switch to Mappings page
click "..." near Deployment path on server
choose pkg
click ok

NOTE: At this point you should be done with your remote deployment. When you make changes to your code, you can sync it to the remote Dev VM by opening the 'Version Control' tab at the bottom of the interface, selecting 'Local changes', right clicking on Default and selecting upload to 'assemblyline dev_vm'

Create a new service

To create a new service, follow the create service tutorial.

Create service tutorial

Link: https://bitbucket.org/cse-assemblyline/

gutui · October 25, 2017

Assemblyline is a malware detection and analysis tool developed by the CSE and released to the cybersecurity community in October 2017.

This tool was developed within CSE’s Cyber Defence program to detect and analyse malicious files as they are received. As the Government of Canada’s centre of excellence in cybersecurity, CSE protects and defends the computer networks and electronic information of greatest importance to the Government of Canada. Our highly skilled staff works every day to protect Canada and Canadians from the most advanced cyber threats. Assemblyline is one of the tools we use.

The release of Assemblyline is an opportunity for the cyber security community to take what CSE has developed and build upon it to benefit all Canadians.

How It Works

Assemblyline is a platform for the analysis of malicious files. It is designed to help cyber defence teams automate the analysis of files and make better use of security analysts' time. The tool recognizes when a large volume of files is received within the system, and can automatically rebalance its workload. Users can add their own analytics, such as antivirus products or custom-built software, into Assemblyline. The tool is designed to be customized by the user and provides a robust interface for security analysts.

Assemblyline works very much like a conveyor belt. Files arrive in the system and are triaged in a certain sequence.

Assemblyline generates information about each file and assigns a unique identifier that travels with the file as it flows through the system.
Users can add their own analytics, which we refer to as services, to Assemblyline. The services selected by the user in Assemblyline then analyze the files, looking for an indication of maliciousness and/or extracting features for further analysis.
The system can generate alerts about a malicious file at any point during the analysis and assigns the file a score.
The system can also trigger automated defensive systems to kick in. Malicious indicators generated by the system can be distributed to other defence systems.
Assemblyline recognizes when a file has been previously analysed.
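The conveyor-belt steps above can be sketched in a few lines of Python. This is a toy model, not Assemblyline's actual code: the `Pipeline` class, the two example services, the use of a SHA-256 digest as the unique identifier, and the threshold value are all assumptions made for illustration.

```python
import hashlib
from typing import Callable, Dict, List

# Hypothetical service signature: takes file bytes, returns a score contribution.
Service = Callable[[bytes], int]

ALERT_THRESHOLD = 500  # assumed value, for illustration only


def contains_macro_marker(data: bytes) -> int:
    """Toy service: flags content that contains a made-up marker string."""
    return 600 if b"AutoOpen" in data else 0


def is_oversized(data: bytes) -> int:
    """Toy service: adds a small score for very large inputs."""
    return 10 if len(data) > 1_000_000 else 0


class Pipeline:
    def __init__(self, services: List[Service]) -> None:
        self.services = services
        self.results: Dict[str, int] = {}  # identifier -> score, doubles as a cache
        self.alerts: List[str] = []

    def submit(self, data: bytes) -> int:
        # The identifier travels with the file; here it is a content hash (assumption).
        file_id = hashlib.sha256(data).hexdigest()
        # A previously analysed file is recognized and not processed again.
        if file_id in self.results:
            return self.results[file_id]
        # Each selected service contributes to the file's score.
        score = sum(service(data) for service in self.services)
        self.results[file_id] = score
        # Scores at or above the threshold raise an alert.
        if score >= ALERT_THRESHOLD:
            self.alerts.append(file_id)
        return score
```

Submitting the same bytes twice returns the cached score and does not raise a second alert, which mirrors the last step in the list.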

Users can deploy their own analytics, such as antivirus products or custom-built software, into Assemblyline. It is designed to be customized by the user.

Assemblyline Example

A financial officer receives an email from an outside sender that includes a password-protected .zip file that contains a spreadsheet and a Word document with text for an annual report. An hour later the financial officer forwards that email to three colleagues within the department and attaches a .jpeg image of a potential cover for the report.

Assemblyline will start by examining the initial email. It automatically recognizes the various file formats (email, .zip file, spreadsheet, Word document) and triggers the analysis of each file. In this example, the Word document contains embedded malware, although the financial officer is unaware of this. The whole file is given a score when the analysis of each file is complete. Scores over a certain threshold trigger alerts, at which point a security analyst may manually examine the file. The malware within the Word document is neutralized due to further security measures that the organization has already implemented.

When the email is forwarded, Assemblyline automatically recognizes the duplication of files and focuses on new content that may be part of the email, such as the .jpeg image.
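The forwarded-email scenario can be illustrated with a small sketch. It assumes, purely for illustration, that files are compared by a SHA-256 digest of their content and that an already-seen container does not need its contents re-examined; `FileNode` and `analyze_new_files` are hypothetical names, not part of Assemblyline.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class FileNode:
    """A file plus the files extracted from it (e.g. an email and its attachments)."""
    name: str
    data: bytes
    children: List["FileNode"] = field(default_factory=list)


def analyze_new_files(node: FileNode, seen: Set[str]) -> List[str]:
    """Return the names of files that have not been seen before, in walk order."""
    digest = hashlib.sha256(node.data).hexdigest()
    if digest in seen:
        # Same content as an earlier file: skip it and everything inside it.
        return []
    seen.add(digest)
    analyzed = [node.name]
    for child in node.children:
        analyzed.extend(analyze_new_files(child, seen))
    return analyzed


if __name__ == "__main__":
    doc = FileNode("report.docx", b"word bytes")
    sheet = FileNode("figures.xlsx", b"sheet bytes")
    archive = FileNode("report.zip", b"zip bytes", [sheet, doc])
    original = FileNode("original.eml", b"mail 1", [archive])
    cover = FileNode("cover.jpeg", b"jpeg bytes")
    forward = FileNode("forward.eml", b"mail 2", [archive, cover])

    seen: Set[str] = set()
    print(analyze_new_files(original, seen))  # all four files are new
    print(analyze_new_files(forward, seen))   # only the new email and the image
```

On the second submission only the forwarded email itself and the .jpeg are reported, since the .zip and its contents were already recorded.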

Assemblyline minimizes the number of non-malicious files that analysts have to manually inspect and allows users to focus their time and attention on the most harmful files.

The Strength of Assemblyline

The strength of Assemblyline is the ability of users to scale the system to their needs and the way that Assemblyline automatically rebalances its workload depending on the volume of files. It reduces the number of non-malicious files that security analysts have to inspect, and permits users to focus their time and attention on the most harmful files, allowing them to spend time researching new cyber defence techniques.

Development of the Tool

Assemblyline was built using public domain and open-source software; however, the majority of the code was developed by CSE. It does not contain any commercial technology, but it is easily integrated into existing cyber defence technologies. As open-source software, businesses can modify Assemblyline to suit their requirements.

Releasing Assemblyline to the Cyber Defence Community

Malicious files can allow threat actors to access sensitive systems, extract valuable data or corrupt vital services. Assemblyline will benefit small and large businesses by allowing them to better protect their data from theft and compromise. Most software of a similar nature is proprietary to a company and not available to the software development community. CSE is releasing Assemblyline to businesses, security researchers, industry, and academia, with no economic benefit to CSE. The release of Assemblyline benefits the country and CSE’s work to protect Canadian systems, and allows the cybersecurity community to build and evolve this valuable open-source software. The public release of Assemblyline enables malware security researchers to focus their efforts on creating new methods to detect malicious files.

via: https://www.cse-cst.gc.ca/en/assemblyline
