Nytro Posted July 9, 2014

How to Block Automated Scanners from Scanning your Site
By Bogdan Calin on JUL 09, 2014 - 08:30am

This blog post describes how to block automated scanners from scanning your website. The technique should work with any modern web scanner that parses robots.txt (all popular web scanners do this).

Website owners use the robots.txt file to give instructions about their site to web robots. The "/robots.txt" file is a text file with one or more records, and usually contains a list of directories/URLs that should not be indexed by search engines.

User-agent: *
Disallow: /cgi-bin/

In this example, search engines are instructed not to index the directory /cgi-bin/. Web scanners parse and crawl all the URLs from robots.txt, because administrators usually place administrative interfaces or other high-value resources (which can be very interesting from a security point of view) there.

To configure our automated scanner honeypot, we start by adding an entry in robots.txt for a directory. This directory does not contain any other content used by the site.

User-agent: *
Disallow: /dontVisitMe/

Normal visitors would not know about this directory since it is not linked from the site. Search engines would not visit it either, since they are restricted by robots.txt. Only web scanners or curious people would reach it.

Inside this directory we place a page containing a hidden form that is only visible to web scanners, as they ignore CSS stylesheets. The HTML of this page is similar to the sketch included at the end of this post: a div that includes a form, where the div is set to be invisible. The form has a single input and points to a file named dontdoit.php.

When this page is visited by normal visitors with a browser, they see an empty page, because the div is hidden by the stylesheet. A web scanner, on the other hand, sees something else, because it ignores the CSS: the form is fully visible to it. The web scanner will submit this form and start testing the form inputs with various payloads, looking for vulnerabilities.

At this point, if somebody visits the file dontdoit.php, it's either a web scanner or a hacker. We have two options: either log the hacking attempt or automatically block the IP address making the request. If we want to automatically block the IP address making the request, we can use a tool such as fail2ban. Fail2ban scans log files (e.g. /var/log/apache/error_log) and bans IPs that show malicious signs, such as too many password failures or exploit attempts.

We can log all requests to dontdoit.php into a log file (see the sketch at the end of this post) and configure fail2ban to parse this log file and automatically, temporarily block at the firewall the IP addresses listed in it.

Sample fail2ban configuration:

/etc/fail2ban/jail.conf

[block-automated-scanners]
enabled  = true
port     = 80,443
filter   = block-automated-scanners
# path to your logs
logpath  = /var/log/apache2/automated-scanners.log
# max number of hits
maxretry = 1
# bantime in seconds - set negative to ban forever (18000 = 5 hours)
bantime  = 18000
# action
action   = iptables-multiport[name=HTTP, port="80,443", protocol=tcp]
           sendmail-whois[name=block-automated-scanners, dest=admin@site.com, sender=fail2ban@site.com]

/etc/fail2ban/filter.d/block-automated-scanners.conf

# Fail2Ban configuration file
#
# Author: Bogdan Calin
#
[Definition]

# Option: failregex
failregex = \[hit from <HOST>\]

# Option: ignoreregex
# Notes.: regex to ignore. If this regex matches, the line is ignored.
# Values: TEXT
#
ignoreregex =

Source: http://www.acunetix.com/blog/web-security-zone/block-automated-scanners/
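
For reference, here is a minimal sketch of the hidden honeypot page described above. The exact markup from the original article was not preserved in this post, so the file name index.html, the class name and the input name are assumptions; only the overall structure (an invisible div wrapping a form that posts to dontdoit.php) follows the article.

<!-- /dontVisitMe/index.html : honeypot page (sketch, not the original markup) -->
<html>
<head>
<style>
/* normal browsers apply this rule and hide the div; scanners ignore CSS */
.honeypot { display: none; }
</style>
</head>
<body>
<div class="honeypot">
  <!-- a scanner will crawl and submit this form, hitting dontdoit.php -->
  <form action="dontdoit.php" method="post">
    <input type="text" name="comment">
    <input type="submit" value="Send">
  </form>
</div>
</body>
</html>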
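
Similarly, a sketch of what dontdoit.php could look like, assuming we simply append one line per request to the log file that fail2ban watches. The timestamp format and the 404 response are assumptions; the "[hit from <IP>]" part must match the failregex in the filter above, and the web server user needs write access to the log file.

<?php
// dontdoit.php : anything that reaches this script is a scanner or an attacker (sketch).
// Append one line per request in the format expected by the fail2ban filter, e.g.:
//   2014-07-09 08:30:00 [hit from 192.0.2.10]
$ip   = $_SERVER['REMOTE_ADDR'];
$line = date('Y-m-d H:i:s') . ' [hit from ' . $ip . "]\n";

// The web server user must be able to write to this file.
file_put_contents('/var/log/apache2/automated-scanners.log', $line, FILE_APPEND | LOCK_EX);

// Return a 404 so the scanner gets nothing interesting back.
http_response_code(404);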
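
Finally, not part of the original article, but a quick way to verify the setup, assuming a Debian/Ubuntu-style fail2ban install:

# check that the failregex actually matches the honeypot log entries
fail2ban-regex /var/log/apache2/automated-scanners.log /etc/fail2ban/filter.d/block-automated-scanners.conf

# reload fail2ban so the new jail becomes active
service fail2ban restart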