Misbehaving bots may access areas of your web site even if you tell them not to. That is the behavior we key on here.
First, change to your web root folder (often public_html). Edit robots.txt and add a Disallow line under User-agent: *:

cd public_html
vi robots.txt

User-agent: *
Disallow: /bot-trap/
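If you prefer not to edit interactively, the same two lines can be appended and verified non-interactively. A sketch, run from your web root (this appends blindly, so check for an existing rule first on a live site):

```shell
# Append the trap rule to robots.txt (creates the file if absent).
cat >> robots.txt <<'EOF'
User-agent: *
Disallow: /bot-trap/
EOF

# Confirm the rule is in place (prints the match count).
grep -c 'Disallow: /bot-trap/' robots.txt
```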
We use a tiny image for embedding the hidden link. blank.png should be a single-pixel transparent image; its exact content is unimportant (and it may already exist). If it doesn't, you can fetch blank.png like this:

cd images
wget http://www.sonoracomm.com/images/blank.png
cd ..
Edit your HTML (index.html, templates/yourtemplatename/index.php, etc.) and add this hidden link inside the page <body> (an <a> element is not valid inside <head>). Modify paths as necessary:

vi index.html

<a href="/bot-trap/"><img src="images/blank.png" border="0" alt=" " width="1" height="1"></a>
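For context, a minimal example page with the hidden link in place (the surrounding markup is illustrative only):

```html
<html>
<head><title>Example page</title></head>
<body>
<!-- Invisible 1x1 link: only crawlers that ignore robots.txt will follow it -->
<a href="/bot-trap/"><img src="images/blank.png" border="0" alt=" " width="1" height="1"></a>
<p>Normal page content...</p>
</body>
</html>
```

Human visitors never see or click the link, so any request for /bot-trap/ is almost certainly a bot that ignored robots.txt.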
Now create the bot-trap folder and an index file, so trap hits don't pollute the error logs. Change the URL to your own domain and adjust ownership and permissions as necessary:

mkdir bot-trap
cat << EOF >> bot-trap/index.html
<html>
<head><title> </title></head>
<body>
<p>This is a spambot trap. You shouldn't normally ever see this...</p>
<p><a href="http://www.sonoracomm.com/">Home Page</a></p>
</body>
</html>
EOF
chown -R apache:apache bot-trap
Add another regex to the Fail2Ban apache-badbots filter:

vi /etc/fail2ban/filter.d/apache-badbots.conf

failregex = ^<HOST> -.*"(GET|POST).*HTTP.*"(?:%(badbots)s|%(badbotscustom)s)"$
            ^<HOST> -.*"GET /bot-trap/
Be sure to enable the apache-badbots
stanza in /etc/fail2ban/jail.local
and restart Fail2Ban:
service fail2ban restart
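The stanza to enable in jail.local might look like the following; the logpath, maxretry, and bantime values here are assumptions, so adjust them to your distribution and policy:

```ini
[apache-badbots]
enabled  = true
filter   = apache-badbots
logpath  = /var/log/httpd/access_log
maxretry = 1
bantime  = 86400
```

With maxretry = 1, a single request for /bot-trap/ is enough to ban the client.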
Test the filter against your Apache access log (modify the log path as necessary):

fail2ban-regex ../logs/access_log /etc/fail2ban/filter.d/apache-badbots.conf
Check the Fail2Ban log:
tail -f /var/log/fail2ban.log
If Fail2Ban fails to parse your log files at all, try setting backend=polling in jail.local.
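That is, in jail.local (placing it in the [DEFAULT] section is an assumption; the setting can also go in an individual jail stanza):

```ini
[DEFAULT]
backend = polling
```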
To list the IP addresses currently banned, extract the source addresses from the iptables DROP rules:

iptables -nL | grep "DROP all" | tr -s ' ' | cut -d " " -f4 | grep -v '0.0.0.0/0' | sort -n | uniq > botlist.txt
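Note that uniq only collapses adjacent duplicate lines, so it must run after sort, not before. A quick demonstration with made-up example addresses:

```shell
# Example data standing in for the extracted iptables addresses.
printf '10.0.0.2\n10.0.0.1\n10.0.0.2\n' > banned.txt

# Wrong order: the duplicates are not adjacent, so uniq misses them
# and a duplicate survives (3 lines out).
uniq banned.txt | sort -n

# Right order: sort first so duplicates become adjacent (2 lines out).
sort -n banned.txt | uniq
```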