Commercial & Open Source Scanners
An Accuracy, Coverage, Versatility, Adaptability, Feature and Price Comparison of 60 Commercial & Open Source Black Box Web Application Vulnerability Scanners

By Shay Chen
Information Security Consultant, Researcher and Instructor
Security Tools Benchmarking, http://www.sectoolmarket.com/
sectooladdict-$at$-gmail-$dot$-com
July 2012

Assessment Environments: WAVSEP 1.2, ZAP-WAVE (WAVSEP integration), WIVET v3-rev148

Table of Contents
1. Introduction
2. List of Tested Web Application Scanners
3. Benchmark Overview & Assessment Criteria
4. A Glimpse at the Results of the Benchmark
5. Test I - Scanner Versatility - Input Vector Support
6. Test II – Attack Vector Support – Counting Audit Features
7. Introduction to the Various Accuracy Assessments
8. Test III – The Detection Accuracy of Reflected XSS
9. Test IV – The Detection Accuracy of SQL Injection
10. Test V – The Detection Accuracy of Path Traversal/LFI
11. Test VI – The Detection Accuracy of RFI (XSS via RFI)
12. Test VII - WIVET - Coverage via Automated Crawling
13. Test VIII – Scanner Adaptability - Crawling & Scan Barriers
14. Test IX – Authentication and Usability Feature Comparison
15. Test X – The Crown Jewel - Results & Features vs. Pricing
16. Additional Comparisons, Built-in Products and Licenses
17. What Changed?
18. Initial Conclusions – Open Source vs. Commercial
19. Verifying The Benchmark Results
20. So What Now?
21. Recommended Reading List: Scanner Benchmarks
22. Thank-You Note
23. FAQ - Why Didn't You Test NTO, Cenzic and N-Stalker?
24. Appendix A – List of Tools Not Included In the Test

1. Introduction

Detailed result presentation at http://www.sectoolmarket.com/ - Tools, Features, Results, Statistics and Price Comparison
A Step by Step Guide for Choosing the Right Web Application Vulnerability Scanner for *You* [Placeholder - Infosec Island]
A Perfectionist Guide for Optimal Use of Web Application Vulnerability Scanners [Placeholder]

Getting the information was the easy part. All I had to do was invest a couple of years in gathering the list of tools, and a couple more in documenting their various features. It's really a daily routine - you read a couple of posts in news groups in the morning, and a couple of blogs in the evening. Once you get used to it, it's fun, and even quite addictive.

Then came the "best" fantasy, and with it, the inclination to test the proclaimed features of all the web application vulnerability scanners against each other, only to find out that things are not that simple, and that finding the "best", if there is such a tool, was not an easy task.

Inevitably, I tried searching for alternative assessment models, methods of measurement that would handle the imperfections of the previous assessments. I tried to change the perspective, add tests (hundreds of them - 940+, to be exact), examine different aspects, and even make parts of the test process obscure, and now, I'm finally ready for another shot.

In spite of everything I had invested in past research, due to the focus I had on features and accuracy, and the policy I used when interacting with the various vendors, it was difficult, especially for me, to gain insights from the mass of data that would enable me to choose, and more importantly, properly use the various tools in real-life scenarios.

Is the most accurate scanner necessarily the best choice for a point-and-shoot scenario?
And what good will it do if it can't scan an application due to a specific scan barrier it can't handle, or because it does not support the input delivery method?

I needed to gather other pieces of the puzzle, and even more importantly, I needed a method, or more accurately, a methodology.

I'm sorry to disappoint you, dear reader, so early in the article, but I still don't have a perfect answer or one recommendation... But I sure am much closer than I ever was, and although I might not have the answer, I have many answers, and a very comprehensive, logical and clear methodology for employing all the information I'm about to present.

In the previous benchmarks, I focused on assessing 3 major aspects of web application scanners, which revolved mostly around features & accuracy, and even though the information was very interesting, it wasn't necessarily useful, at least not in all scenarios.

So I decided to take it to the edge, but since I had already reached the number of 60 scanners, it was hard to make an impression with a couple of extra tools, so instead, I focused my efforts on aspects. This time, I compared 10 different aspects of the tools (or 14, if you count the non-competitive charts), and chose the collection with the aim of providing practical tools for making a decision, and getting a glimpse of the bigger picture.

Let me assure you - this time, the information is presented in a manner that is very helpful, is easy to navigate, and is supported by presentation platforms, articles and step-by-step methodologies. Furthermore, I wrapped it all in a summary that includes the major results and features in relation to the price, for those of us that prefer the overview and want to avoid the drill-down - information and insights that, I believe, will help testers invest their time in better-suited tools, and consumers properly invest their money, in the long term or the short term (but not necessarily both*).

As mentioned earlier, this research covers various aspects of the latest versions of 11 commercial web application scanners, and the latest versions of most of the 49 free & open source web application scanners. It also covers some scanners that were not covered in previous benchmarks, and includes, among others, the following components and tests:

• A Price Comparison - in Relation to the Rest of the Benchmark Results
• Scanner Versatility - A Measure for the Scanner's Support of Protocols & Input Delivery Vectors
• Attack Vector Support - The Amount & Type of Active Scan Plugins (Vulnerability Detection)
• Reflected Cross Site Scripting Detection Accuracy
• SQL Injection Detection Accuracy
• Path Traversal / Local File Inclusion Detection Accuracy
• Remote File Inclusion Detection Accuracy (XSS/Phishing via RFI)
• WIVET Score Comparison - Automated Crawling / Input Vector Extraction
• Scanner Adaptability - Complementary Coverage Features and Scan Barrier Support
• Authentication Features Comparison
• Complementary Scan Features and Embedded Products
• General Scanning Features and Overall Impression
• License Comparison and General Information

And just before we delve into the details, one last tip: don't focus solely on the charts - if you want to really understand what they reflect, dig in. Lists and charts first, detailed description later.
2. List of Tested Web Application Scanners

The following commercial scanners were included in the benchmark:
IBM AppScan v8.5.0.1, Build 42-SR1434 (IBM)
WebInspect v9.20.277.0, SecureBase 4.08.00 (HP)
Netsparker v2.1.0, Build 45 (Mavituna Security)
Acunetix WVS v8.0, Build 20120613 (Acunetix)
Syhunt Dynamic (SandcatPro) v4.5.0.0/1 (Syhunt)
Burp Suite v1.4.10 (Portswigger)
ParosPro v1.9.12 (Milescan) - WIVET / Other
JSky v3.5.1-905 (NoSec) - WIVET / Other
WebCruiser v2.5.1 EE (Janus Security)
Nessus v5.0.1 - 20120701 (Tenable Network Security) - Web Scanning Features
Ammonite v1.2 (RyscCorp)

The following new free & open source scanners were included in the benchmark:
IronWASP v0.9.1.0

The updated versions of the following free & open source scanners were re-tested in the benchmark:
Zed Attack Proxy (ZAP) v1.4.0.1, sqlmap v1.0-Jul-5-2012 (Github), W3AF 1.2-rev509 (SVN), Acunetix Free Edition v8.0-20120509, Safe3WVS v10.1 FE (Safe3 Network Center), WebSecurify v0.9 (free edition - the new commercial version was not tested), Syhunt Mini (Sandcat Mini) v4.4.3.0, arachni v0.4.0.3, Skipfish 2.07b, N-Stalker 2012 Free Edition v7.1.1.121 (N-Stalker), Watobo v0.9.8-rev724 (a few new WATOBO 0.9.9 pre-release versions were published a few days before the publication of the benchmark, but I didn't manage to test them in time)

Different aspects of the following free & open source scanners were tested in the benchmark:
VEGA 1.0 beta (Subgraph), Netsparker Community Edition v1.7.2.13, Andiparos v1.0.6, ProxyStrike v2.2, Wapiti v2.2.1, Paros Proxy v3.2.13, Grendel Scan v1.0

The results were compared to those of unmaintained scanners tested in previous benchmarks:
PowerFuzzer v1.0, Oedipus v1.8.1 (v1.8.3 is around somewhere), Scrawler v1.0, WebCruiser v2.4.2 FE (corrections), Sandcat Free Edition v4.0.0.1, JSKY Free Edition v1.0.0, N-Stalker 2009 Free Edition v7.0.0.223, UWSS (Uber Web Security Scanner) v0.0.2, Grabber v0.1, WebScarab v20100820, Mini MySqlat0r v0.5, WSTool v0.14001, crawlfish v0.92, Gamja v1.6, iScan v0.1, LoverBoy v1.0, DSSS (Damn Simple SQLi Scanner) v0.1h, openAcunetix v0.1, ScreamingCSS v1.02, Secubat v0.5, SQID (SQL Injection Digger) v0.3, SQLiX v1.0, VulnDetector v0.0.2, Web Injection Scanner (WIS) v0.4, Xcobra v0.2, XSSploit v0.5, XSSS v0.40, Priamos v1.0, XSSer v1.5-1 (version 1.6 was released but I didn't manage to test it), aidSQL 02062011 (a newer revision exists in the SVN but was not officially released)

For a full list of commercial & open source tools that were not tested in this benchmark, refer to the appendix.

3. Benchmark Overview & Assessment Criteria

The benchmark focused on testing commercial & open source tools that are able to detect (and not necessarily exploit) security vulnerabilities on a wide range of URLs, and thus, each tool tested was required to support the following features:

· The ability to detect Reflected XSS and/or SQL Injection and/or Path Traversal/Local File Inclusion/Remote File Inclusion vulnerabilities.
· The ability to scan multiple URLs at once (using either a crawler/spider feature, a URL/log file parsing feature or a built-in proxy).
· The ability to control and limit the scan to internal or external hosts (domain/IP).

The testing procedure of all the tools included the following phases:

Feature Documentation
The features of each scanner were documented and compared, according to documentation, configuration, plugins and information received from the vendor.
The features were then divided into groups, which were used to compose various hierarchical charts.

Accuracy Assessment
The scanners were all tested against the latest version of WAVSEP (v1.2, integrating ZAP-WAVE), a benchmarking platform designed to assess the detection accuracy of web application scanners, which was released with the publication of this benchmark. The purpose of WAVSEP's test cases is to provide a scale for understanding which detection barriers each scanning tool can bypass, and which common vulnerability variations can be detected by each tool.

· The various scanners were tested against the following test cases (GET and POST attack vectors):
o 816 test cases that were vulnerable to Path Traversal attacks.
o 108 test cases that were vulnerable to Remote File Inclusion (XSS via RFI) attacks.
o 66 test cases that were vulnerable to Reflected Cross Site Scripting attacks.
o 80 test cases that contained Error Disclosing SQL Injection exposures.
o 46 test cases that contained Blind SQL Injection exposures.
o 10 test cases that were vulnerable to Time Based SQL Injection attacks.
o 7 different categories of false positive RXSS vulnerabilities.
o 10 different categories of false positive SQLi vulnerabilities.
o 8 different categories of false positive Path Traversal / LFI vulnerabilities.
o 6 different categories of false positive Remote File Inclusion vulnerabilities.
· The benchmark included 8 experimental RXSS test cases and 2 experimental SQL Injection test cases; although the scan results of these test cases were documented in the various scans, their results were not included in the final score, at least for now.
· In order to ensure result consistency, the directory of each exposure sub-category was individually scanned multiple times using various configurations, usually using a single thread and a scan policy that only included the relevant plugins.

In order to ensure that the detection features of each scanner were truly effective, most of the scanners were tested against an additional benchmarking application that was prone to the same vulnerable test cases as the WAVSEP platform, but had a different design, slightly different behavior and a different entry point format, in order to verify that no signatures were used, and that any improvement was due to the enhancement of the scanner's attack tree.
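To make the accuracy assessment more concrete, the sketch below shows roughly what a single error-disclosing SQL injection test case involves. It is a hypothetical illustration written for this summary, not actual WAVSEP code (the class name, connection URL and query are made up): the attacker-controlled parameter is concatenated straight into a query, and the raw database error reaches the page, which is exactly the behavior that error-based SQLi plugins look for.

[CODE]
// Hypothetical sketch of an error-disclosing SQL injection test case,
// in the spirit of (but not copied from) WAVSEP's GET/POST cases.
import java.io.IOException;
import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class ErrorBasedSqliTestCase extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        String id = req.getParameter("userid");            // attacker-controlled input
        resp.setContentType("text/html");
        PrintWriter out = resp.getWriter();
        try (Connection con = getConnection();
             Statement st = con.createStatement()) {
            // Vulnerable on purpose: the input is concatenated directly into the query
            ResultSet rs = st.executeQuery(
                    "SELECT username FROM users WHERE userid = " + id);
            while (rs.next()) {
                out.println("<p>" + rs.getString(1) + "</p>");
            }
        } catch (SQLException e) {
            // "Error disclosing" behavior: the raw database error is echoed to the page
            out.println("<pre>" + e.getMessage() + "</pre>");
        }
    }

    // Hypothetical connection details; a real test case would use a pooled DataSource
    private static Connection getConnection() throws SQLException {
        return DriverManager.getConnection(
                "jdbc:mysql://localhost/benchmark_db", "user", "password");
    }
}
[/CODE]

A blind variant of the same case would swallow the SQLException and return a generic page, forcing the scanner to rely on boolean or timing differences instead of error strings.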
Attack Surface Coverage Assessment
In order to assess the scanners' attack surface coverage, the assessment included tests that measure the efficiency of each scanner's automated crawling mechanism (input vector extraction), and feature comparisons meant to assess its support for various technologies and its ability to handle different scan barriers.

This section of the benchmark also included the WIVET test (Web Input Vector Extractor Teaser), in which scanners were executed against a dedicated application that can assess their crawling mechanism in the aspect of input vector extraction. The specific details of this assessment are provided in the relevant section.

Public Tests vs. Obscure Tests
In order to make the test as fair as possible, while still enabling the various vendors to show improvement, the benchmark was divided into tests that were publicly announced, and tests that were obscure to all vendors:

· Publicly announced tests: the active scan feature comparison, and the detection accuracy assessments of SQL Injection and Reflected Cross Site Scripting, composed of test cases which were published as a part of WAVSEP v1.1.1.
· Tests that were obscure to all vendors until the moment of publication: the various new groups of feature comparisons, the WIVET assessment, and the detection accuracy assessments of Path Traversal / LFI and Remote File Inclusion (XSS via RFI), implemented as 940+ test cases in WAVSEP 1.2 (a new version that was only published alongside this benchmark).

The results of the main test categories are presented within three graphs (commercial graph, free & open source graph, unified graph), and the detailed information of each test is presented in a dedicated section of the benchmark presentation platform at http://www.sectoolmarket.com.

Now that we're finally done with the formality, let's get to the interesting part... the results.

4. A Glimpse at the Results of the Benchmark

The presentation of results in this benchmark, the dedicated website (http://www.sectoolmarket.com/) and a series of supporting articles and methodologies ([placeholder]) are all designed to help the reader make a decision - to choose the proper product/s or tool/s for the task at hand, within the limits of time and budget.

For those of us that can't wait, and want to get a glimpse of the summary of the unified results, there is a dedicated page available at the following links:
Price & Feature Comparison of Commercial Scanners
List of Tested Scanners
Price & Feature Comparison of a Unified List of Commercial, Free and Open Source Products
List of Tested Scanners

Some of the sections might not be clear to some readers at this phase, which is why I advise you to read the rest of the article prior to analyzing this summary.

5. Test I - Scanner Versatility - Input Vector Support

The first assessment criterion was the number of input vectors each tool can scan (and not just parse).

Modern web applications use a variety of sub-protocols and methods for delivering complex inputs from the browser to the server. These methods include standard input delivery methods such as HTTP querystring parameters and HTTP body parameters, modern delivery methods such as JSON and XML, and even binary delivery methods for technology-specific objects such as AMF, Java serialized objects and WCF.

Since the vast majority of active scan plugins rely on input that is meant to be injected into client-originating parameters, supporting the parameter (or rather, the input) delivery method of the tested application is a necessity. Although the charts in this section don't necessarily represent the most important score, this is the most important prerequisite for the scanner to comply with when scanning a specific technology.

Reasoning: An automated tool can't detect a vulnerability in a given parameter if it can't scan the protocol or mimic the application's method of delivering the input. The more vectors of input delivery that the scanner supports, the more versatile it is in scanning different technologies and applications (assuming it can handle the relevant scan barriers, supports necessary features such as authentication, or alternatively, contains features that can be used to work around the specific limitations).
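As a small illustration of why input vector support matters, the hedged sketch below (hypothetical code, not taken from WAVSEP or any scanner; the parameter name and the naive JSON extraction are made up for the example) shows the same logical parameter reaching the application through two different delivery methods. A scanner that only injects into classic querystring/form parameters will exercise the first handler but will never place a payload inside the JSON field that feeds the second one.

[CODE]
// Minimal sketch: the same logical parameter, "user", delivered two ways.
// A scanner limited to application/x-www-form-urlencoded parameters never
// reaches the sink behind the JSON body.
import java.io.BufferedReader;
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class InputVectorExample extends HttpServlet {

    // Classic delivery: GET /lookup?user=bob - practically every scanner can inject here
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        String user = req.getParameter("user");
        resp.getWriter().println(lookup(user));
    }

    // Modern delivery: POST body {"user":"bob"} with Content-Type: application/json.
    // Only a scanner that understands JSON input vectors can place a payload in "user".
    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        StringBuilder body = new StringBuilder();
        try (BufferedReader r = req.getReader()) {
            String line;
            while ((line = r.readLine()) != null) {
                body.append(line);
            }
        }
        // Naive field extraction, just for the sketch
        Matcher m = Pattern.compile("\"user\"\\s*:\\s*\"([^\"]*)\"").matcher(body);
        String user = m.find() ? m.group(1) : "";
        resp.getWriter().println(lookup(user));
    }

    private String lookup(String user) {
        return "Looking up: " + user;   // imagine a vulnerable sink here
    }
}
[/CODE]

A real application would parse the JSON properly, but the point stands: the injection point only exists for a tool that understands the delivery format.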
The detailed comparison of the scanners' support for various input delivery methods is documented in detail in the following section of sectoolmarket (recommended - too many scanners in the chart):
The Input Vector Support of Web Application Scanners

The following charts show how versatile each scanner is in scanning different input delivery vectors (and, although not entirely accurate, different technologies):
The Number of Input Vectors Supported – Commercial Tools
The Number of Input Vectors Supported – Free & Open Source Tools
The Number of Input Vectors Supported – Unified List

6. Test II – Attack Vector Support – Counting Audit Features

The second assessment criterion was the number of audit features each tool supports.

Reasoning: An automated tool can't detect an exposure that it can't recognize (at least not directly, and not without manual analysis), and therefore the number of audit features will affect the number of exposures that the tool will be able to detect (assuming the audit features are implemented properly, that vulnerable entry points will be detected, that the tool will be able to handle the relevant scan barriers and scanning prerequisites, and that the tool will manage to scan the vulnerable input vectors).

For the purpose of the benchmark, an audit feature was defined as a common generic application-level scanning feature, supporting the detection of exposures which could be used to attack the tested web application, gain access to sensitive assets or attack legitimate clients.

This definition rules out product-specific exposures and infrastructure-related vulnerabilities; unique and extremely rare features were documented and presented in a different section of this research, and were not taken into account when calculating the results. Exposures that were specific to Flash/Applet/Silverlight and Web Services assessment (with the exception of XXE) were treated in the same manner.

The detailed comparison of the scanners' support for various audit features is documented in detail in the following section of sectoolmarket:
Web Application Scanners Audit Features Comparison
The Number of Audit Features in Web Application Scanners – Commercial Tools
The Number of Audit Features in Web Application Scanners – Free & Open Source Tools
The Number of Audit Features in Web Application Scanners – Unified List

So once again, now that we're done with the quantity, let's get to the quality…

7. Introduction to the Various Accuracy Assessments

The following sections present the results of the detection accuracy assessments performed for Reflected XSS, SQL Injection, Path Traversal and Remote File Inclusion (RXSS via RFI) - four of the most commonly supported features in web application scanners. Although the detection accuracy of a specific exposure might not reflect the overall condition of the scanner on its own, it is a crucial indicator of how good a scanner is at detecting specific vulnerability instances.

The various assessments were performed against the various test cases of WAVSEP v1.2, which emulate different common test case scenarios for generic technologies.

Reasoning: a scanner that is not accurate enough will miss many exposures, and might classify non-vulnerable entry points as vulnerable. These tests aim to assess how good each tool is at detecting the vulnerabilities it claims to support, in a supported input vector, which is located in a known entry point, without any restrictions that can prevent the tool from operating properly.
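To illustrate the kind of distinction the accuracy tests force a scanner to make, here is a hedged sketch (hypothetical code, not an actual WAVSEP test case; the parameter names and the encoding helper are made up). Both responses reflect the scanner's input, but only the first is genuinely exploitable; a tool that merely checks whether its probe text is echoed back, without examining the reflection context and output encoding, will rack up false positives on pages like the second.

[CODE]
// Hypothetical sketch of a true-positive reflection vs. a false-positive bait page.
import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class ReflectionExamples extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        String msg = req.getParameter("msg");
        resp.setContentType("text/html");
        if ("vulnerable".equals(req.getParameter("mode"))) {
            // True positive: raw reflection into HTML, so <script>alert(1)</script> executes
            resp.getWriter().println("<p>You said: " + msg + "</p>");
        } else {
            // False-positive bait: the input still appears in the response, but encoding
            // neutralises any markup - the scanner must verify the context, not just
            // the presence of its echoed text
            resp.getWriter().println("<p>You said: " + encode(msg) + "</p>");
        }
    }

    private String encode(String s) {
        if (s == null) {
            return "";
        }
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }
}
[/CODE]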
8. Test III – The Detection Accuracy of Reflected XSS

The third assessment criterion was the detection accuracy of Reflected Cross Site Scripting, a common exposure which is the second most commonly implemented feature in web application scanners, and the one in which I noticed the greatest improvement in the various tested web application scanners.

The comparison of the scanners' reflected cross site scripting detection accuracy is documented in detail in the following section of sectoolmarket:
Reflected Cross Site Scripting Detection Accuracy - Summary

Result Chart Glossary
Note that the GREEN bar represents the vulnerable test case detection accuracy, while the RED bar represents false positive categories detected by the tool (which may amount to more instances than what the bar actually presents, when compared to the detection accuracy bar).

The Reflected XSS Detection Accuracy of Web Application Scanners – Commercial Tools
The Reflected XSS Detection Accuracy of Web Application Scanners – Open Source & Free Tools
The Reflected XSS Detection Accuracy of Web Application Scanners – Unified List

9. Test IV – The Detection Accuracy of SQL Injection

The fourth assessment criterion was the detection accuracy of SQL Injection, one of the most famous exposures and the most commonly implemented attack vector in web application scanners.

The evaluation was performed on an application that uses MySQL 5.5.x as its data repository, and thus reflects the detection accuracy of each tool when scanning an application that uses a similar data repository.

The comparison of the scanners' SQL injection detection accuracy is documented in detail in the following section of sectoolmarket:
SQL Injection Detection Accuracy - Summary

Result Chart Glossary
Note that the GREEN bar represents the vulnerable test case detection accuracy, while the RED bar represents false positive categories detected by the tool (which may amount to more instances than what the bar actually presents, when compared to the detection accuracy bar).

The SQL Injection Detection Accuracy of Web Application Scanners – Commercial Tools
The SQL Injection Detection Accuracy of Web Application Scanners – Open Source & Free Tools
The SQL Injection Detection Accuracy of Web Application Scanners – Unified List
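Since the accuracy figures are tied to the back-end used in the test environment, it is worth remembering that many SQL injection checks are database-specific. The sketch below is a minimal, hypothetical illustration of a time-based (blind) probe (the target URL, parameter name and threshold are made up for the example); the SLEEP() function it relies on is MySQL syntax, so a scanner that depended only on such payloads would behave differently against another data repository.

[CODE]
// Minimal sketch of a time-based SQL injection probe (MySQL SLEEP() payload).
// Not taken from any tested scanner; target and parameter are hypothetical.
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

public class TimeBasedSqliProbe {

    public static void main(String[] args) throws Exception {
        String base = "http://localhost:8080/app/item?id=";

        long baseline = timeRequest(base + "1");

        // If the payload is evaluated by MySQL, the response is delayed by ~5 seconds
        String payload = URLEncoder.encode("1 AND SLEEP(5)", "UTF-8");
        long delayed = timeRequest(base + payload);

        if (delayed - baseline > 4000) {
            System.out.println("Parameter 'id' looks injectable (time-based anomaly).");
        } else {
            System.out.println("No timing anomaly observed.");
        }
    }

    private static long timeRequest(String url) throws Exception {
        long start = System.currentTimeMillis();
        HttpURLConnection con = (HttpURLConnection) new URL(url).openConnection();
        con.getResponseCode();   // sends the request and waits for the response headers
        con.disconnect();
        return System.currentTimeMillis() - start;
    }
}
[/CODE]

A production-grade check would repeat the measurement several times and account for network jitter before reporting anything.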
Although there are many changes in the results since the last benchmark, both of these exposures (SQLi, RXSS) were previously assessed, so I believe it's time to introduce something new... something none of the tested vendors could have prepared for in advance...

10. Test V – The Detection Accuracy of Path Traversal/LFI

The fifth assessment criterion was the detection accuracy of Path Traversal (a.k.a. Directory Traversal), a newly implemented feature in WAVSEP v1.2, and the third most commonly implemented attack vector in web application scanners.

The reason it was tagged along with Local File Inclusion (LFI) is simple - many scanners don't make the differentiation between inclusion and traversal, and neither do a few online vulnerability documentation sources. In addition, the results obtained from the tests performed on the vast majority of tools lead to the same conclusion - many plugins listed under the name LFI detected the path traversal test cases.

While implementing the path traversal test cases and consuming nearly every relevant piece of documentation I could find on the subject, I decided to take the current path, in spite of some acute differences some of the documentation sources suggested (but I did implement an infrastructure in WAVSEP for "true" inclusion exposures). The point is not to get into a discussion of whether or not path traversal, directory traversal and local file inclusion should be classified as the same vulnerability, but simply to explain why, in spite of the differences some organizations / classification methods have for these exposures, they were listed under the same name (in sectoolmarket, path traversal detection accuracy is listed under the title LFI).

The evaluation was performed on a WAVSEP v1.2 instance that was hosted on Windows XP, and although there are specific test cases meant to emulate servers running with low-privileged OS user accounts (using the servlet context file access method), many of the test cases emulate web servers running with administrative user accounts.

[Note - in addition to the WAVSEP installation, to reproduce the results of this benchmark, a file by the name of content.ini must be placed in the root installation directory of the Tomcat server, which is different from the root directory of the web server.]

Although I didn't perform the path traversal scans on Linux for all the tools, I did perform the initial experiments on Linux, and even a couple of verifications on Linux for some of the scanners, and as weird as it sounds, I can clearly state that the results were significantly worse; although I won't get the opportunity to discuss the subject in this benchmark, I might handle it in the next.

In order to assess the detection accuracy of different path traversal instances, I designed a total of 816 OS-adapting path traversal test cases (meaning the test cases adapt themselves to the OS and server they are executed on, in the aspects of file access delimiters and file access paths). I know it might seem like a lot, and I guess I did get carried away with the perfectionism, but you will be surprised to see that these tests really represent common vulnerability instances, and not necessarily super extreme scenarios, and the results of the tests did prove the necessity.

The tests were designed to emulate various combinations of conditions and restrictions. If you take a closer look at the detailed scan-specific results at www.sectoolmarket.com, you'll notice that some scanners were completely unaffected by the response content type and HTTP code variation, while other scanners were dramatically affected by the variety (gee, it's nice to know that I didn't write them all for nothing...).
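As a rough idea of what an "OS-adapting" traversal test case boils down to, consider the hedged sketch below (illustrative code written for this summary, not actual WAVSEP source; the base directories and parameter name are made up). The page builds a file path from user input without canonicalizing it or pinning it to a base directory, and the case adjusts separators and default locations to the host OS; the real test cases then layer the different response content types, HTTP codes and input filters on top of this basic pattern.

[CODE]
// Hypothetical sketch of an OS-adapting path traversal test case.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class PathTraversalTestCase extends HttpServlet {

    private static final boolean WINDOWS =
            System.getProperty("os.name").toLowerCase().contains("win");

    // Base directory of "legitimate" content, adapted to the host OS (paths are made up)
    private static final String BASE =
            WINDOWS ? "C:\\webapp\\content\\" : "/var/webapp/content/";

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        String name = req.getParameter("target");   // e.g. page1.txt
        // Vulnerable on purpose: no canonicalization or base-directory check, so payloads
        // such as ..\..\boot.ini (Windows) or ../../../../etc/passwd (Linux) escape BASE
        Path p = Paths.get(BASE + name);
        resp.setContentType("text/plain");
        resp.getWriter().write(new String(Files.readAllBytes(p)));
    }
}
[/CODE]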
In reality, there were supposed to be more test cases, primarily because I intended to test injection entry points in which the input only affected the filename without the extension, or was injected directly into the directory name. However, due to the sheer amount of tests and the deadline I had for this benchmark, I decided to delete (literally) the test cases that handled these anomalies, and focus on test cases in which the entire filename/path was affected. That being said, I might publish these test cases in future versions of WAVSEP (they amount to a couple of hundred).

The comparison of the scanners' path traversal detection accuracy is documented in detail in the following section of sectoolmarket:
Path Traversal / Local File Inclusion Detection Accuracy - Summary

Result Chart Glossary
Note that the GREEN bar represents the vulnerable test case detection accuracy, while the RED bar represents false positive categories detected by the tool (which may amount to more instances than what the bar actually presents, when compared to the detection accuracy bar).

The Path Traversal / LFI Detection Accuracy of Web Application Scanners – Commercial Tools

[...]

Full article: http://sectooladdict.blogspot.co.uk/