Dandello wrote on Dec 10th, 2011 at 4:14am:
Tell us more about the referrer statement blasting them into the ether.
It's a bit complicated, but when I started my other boards about 10 years ago I did some security coding to keep bad bots and other junk from getting into directories or harvesting content. It's a series of CGI scripts and SSI-included exec calls that dump offenders into the .htaccess file with a statement blocking them. I recently added these lines to catch anyone trying to hit the YaBB CGI file directly:
Placed in the .htaccess file:
# Blocking spammers for YaBB
RedirectMatch YaBB\.cgi http://mydomain/cgi-bin/trap.cgi
trap.cgi is a script that rewrites files automatically. I have a bunch of code in it that writes log files to keep track of what it's doing, but the main portion of trap.cgi looks like this:
use strict;
use warnings;
use Fcntl qw(:flock);

my $test = 0; # if $test == 1, it won't write the .htaccess (dry run)
my $basedir    = $ENV{DOCUMENT_ROOT};
my $htafile    = "/.htaccess";
my $termsfile  = "/badbot.htm";
my $digestfile = "/spam.dat";
# Form full pathname to the .htaccess file
my $htapath    = $basedir . $htafile;
# Form full pathnames to the terms and digest files
my $termspath  = $basedir . $termsfile;
my $digestpath = $basedir . $digestfile;
trapem();
exit;

sub trapem {
    # Get the bad bot's IP address and convert it to regular-expression
    # (regex) form by escaping all periods.
    my $remaddr = $ENV{REMOTE_ADDR};
    $remaddr =~ s/\./\\./g;
    # Get the User-Agent and the current time
    my $usragnt = $ENV{HTTP_USER_AGENT};
    my $date    = scalar localtime(time);
    # Open the .htaccess file and wait for an exclusive lock. This
    # prevents multiple instances of this script from running past
    # the flock statement, and prevents them from trying to read and
    # write the file at the same time, which would corrupt it.
    # When .htaccess is closed, the lock is released.
    #
    # Open the existing .htaccess file in r/w append mode, lock it,
    # rewind to the start, and read the current contents into an array.
    if ($test != 1) {
        open(HTACCESS, "+>>", $htapath) || die $!;
        flock(HTACCESS, LOCK_EX);
        seek(HTACCESS, 0, 0);
        my @contents = <HTACCESS>;
        # Empty the existing .htaccess file, then write the new IP-ban
        # line followed by the previous contents.
        truncate(HTACCESS, 0);
        print HTACCESS "SetEnvIf Remote_Addr ^$remaddr\$ getout # $date $usragnt\n";
        print HTACCESS @contents;
        # Close the .htaccess file, releasing the lock and allowing
        # other instances of this script to proceed.
        close(HTACCESS);
    }
}
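If you want to sanity-check the escaping and the generated ban line without touching a live .htaccess, here's a standalone sketch (my own test harness, not part of the trap script; the helper names are mine, but the substitution and the SetEnvIf line mirror what the script above writes):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Escape the dots in an IP so it can be dropped into the SetEnvIf
# regular expression (the same s/\./\\./g the trap script performs).
sub escape_ip {
    my ($ip) = @_;
    $ip =~ s/\./\\./g;
    return $ip;
}

# Build the ban line exactly as the trap script would write it.
sub ban_line {
    my ($ip, $agent, $date) = @_;
    my $escaped = escape_ip($ip);
    return "SetEnvIf Remote_Addr ^$escaped\$ getout # $date $agent\n";
}

print ban_line('31.214.144.222', 'BadBot/1.0', 'Sat Dec 10 10:57:09 2011');
# SetEnvIf Remote_Addr ^31\.214\.144\.222$ getout # Sat Dec 10 10:57:09 2011 BadBot/1.0
```

Escaping the dots matters: without it, `31.214.144.222` would also match addresses like `31x214y144z222` since a bare dot matches any character in a regex.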
I also have things hidden in my pages that bad bots look for but that you can't see unless you're parsing the source code. If a bad bot touches one of them, it gets sent to this trap file and banned as well.
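For anyone curious, a hidden honeypot link of the kind described above typically looks something like this (a generic sketch, not the author's exact markup; the /cgi-bin/trap.cgi path is the trap script from earlier, and you'd also want the path disallowed in robots.txt so well-behaved crawlers skip it):

```html
<!-- Invisible to human visitors, but harvesters parsing raw HTML follow it -->
<a href="/cgi-bin/trap.cgi" style="display:none" rel="nofollow">&nbsp;</a>
```

Legitimate browsers never request the trap URL, so anything that does is assumed hostile and gets written into the ban list.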
The resulting .htaccess file looks like this at the top:
SetEnvIf Remote_Addr ^31\.214\.144\.222$ getout # Sat Dec 10 10:57:09 2011 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
SetEnvIf Remote_Addr ^173\.193\.219\.168$ getout # Mon Dec 5 14:16:35 2011 Aboundex/0.2 (http://www.aboundex.com/crawler/)
SetEnvIf Remote_Addr ^74\.202\.210\.158$ getout # Sat Dec 3 23:09:47 2011 Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.8) Gecko/20100721 Firefox/3.6.8
SetEnvIf Remote_Addr ^69\.4\.231\.201$ getout # Mon Nov 28 20:08:16 2011 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
This is in the root .htaccess file:
# Block bad-bots using lines written by trap.cgi script above
SetEnvIf Request_URI "(\.cgi)$" allowsome
<Files *>
Order deny,allow
Allow from env=allowsome
Deny from env=getout
Deny from env=spam_bot
</Files>
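One note for anyone adapting this later: the Order/Allow/Deny directives above are Apache 2.2 syntax and were removed in Apache 2.4. On a 2.4 server the rough equivalent (assuming mod_authz_core and mod_setenvif are loaded; this is my translation, not from the original post) would be:

```apache
<Files "*">
    <RequireAll>
        # Serve everyone by default...
        Require all granted
        # ...except IPs tagged by the trap script, and known spam agents
        Require not env getout
        Require not env spam_bot
    </RequireAll>
</Files>
```

With `Order deny,allow`, the Deny lines are evaluated first and a matching Allow overrides them, which is why the allowsome exemption works in the 2.2 version.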
It then logs their entry, sends me an email, and emails their host with an abuse letter.
Also, I have these presets in the root .htaccess file:
SetEnvIfNoCase User-Agent "EmailCollector/1.0" spam_bot
SetEnvIfNoCase User-Agent "EmailSiphon" spam_bot
SetEnvIfNoCase User-Agent "EmailWolf 1.00" spam_bot
SetEnvIfNoCase User-Agent "Crescent Internet ToolPak HTTP OLE Control v.1.0" spam_bot
SetEnvIfNoCase User-Agent "ExtractorPro" spam_bot
SetEnvIfNoCase User-Agent "Mozilla/2.0 (compatible; NEWT ActiveX; Win32)" spam_bot
SetEnvIfNoCase User-Agent "/0.5 libwww-perl/0.40" spam_bot
SetEnvIfNoCase User-Agent "CherryPickerElite/1.0" spam_bot
SetEnvIfNoCase User-Agent "CherryPickerSE/1.0" spam_bot
SetEnvIfNoCase User-Agent "WebBandit/2.1" spam_bot
SetEnvIfNoCase User-Agent "WebBandit/3.50" spam_bot
SetEnvIfNoCase User-Agent "Webbandit/4.00.0" spam_bot
SetEnvIfNoCase User-Agent "Indy Library" spam_bot
SetEnvIfNoCase User-Agent "Internet Explore 5.x" spam_bot
SetEnvIfNoCase User-Agent "Microsoft URL Control - 6.00.8862" spam_bot
SetEnvIfNoCase User-Agent "Mozilla/3.0 (compatible; Indy Library)" spam_bot
SetEnvIfNoCase User-Agent "Java1.3.1" spam_bot
SetEnvIfNoCase User-Agent "URL_Spider_Pro/3.0 (http://www.innerprise.net/usp-spider.asp)" spam_bot
SetEnvIfNoCase User-Agent "IPiumBot laurion(dot)com" spam_bot
SetEnvIfNoCase User-Agent "URL_Spider_SQL/1.0" spam_bot
SetEnvIfNoCase User-Agent "Lynx/2.8.4rel.1 libwww-FM/2.14 ... human-guided@lerly.net" spam_bot
SetEnvIfNoCase User-Agent "W3CRobot/5.4.0 libwww/5.4.0" spam_bot
SetEnvIfNoCase User-Agent "Zeus 2.6" spam_bot
SetEnvIfNoCase User-Agent "Rover" spam_bot
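One caveat worth knowing about these presets: SetEnvIfNoCase treats each quoted string as a case-insensitive regular expression matched anywhere inside the User-Agent header, so a short pattern like "Rover" also catches any agent merely containing that word. A small plain-Perl sketch approximating that matching behavior (my own illustration, not Apache code):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Approximate SetEnvIfNoCase: a case-insensitive regex match
# anywhere within the User-Agent string.
sub matches_badbot {
    my ($agent, @patterns) = @_;
    for my $pat (@patterns) {
        return 1 if $agent =~ /$pat/i;
    }
    return 0;
}

my @patterns = ('EmailSiphon', 'Indy Library', 'CherryPickerElite/1\.0');

print matches_badbot('Mozilla/3.0 (compatible; Indy Library)', @patterns)
    ? "blocked\n"
    : "allowed\n";
# blocked -- "Indy Library" appears as a substring, case-insensitively
```

That looseness is usually what you want for bot blocking, but it's worth double-checking short patterns against legitimate agents before adding them.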