Spam crawling ‘bots and AntiLeech
I just happened to be looking through my blog’s logs and noticed that a ‘bot had crawled through numerous pages on my blog in a very short period of time:
72.3.137.83 - - [13/Aug/2008:06:01:45 +0100] "GET /2008/02/01/ HTTP/1.0" 200 35253 "-" "ISC Systems iRc Search 2.1"
72.3.137.83 - - [13/Aug/2008:06:01:49 +0100] "GET /2008/02/05 HTTP/1.0" 301 84 "-" "ISC Systems iRc Search 2.1"
72.3.137.83 - - [13/Aug/2008:06:01:52 +0100] "GET /2008/02/05/ HTTP/1.0" 200 35375 "-" "ISC Systems iRc Search 2.1"
72.3.137.83 - - [13/Aug/2008:06:01:56 +0100] "GET /2008/02/06 HTTP/1.0" 301 84 "-" "ISC Systems iRc Search 2.1"
72.3.137.83 - - [13/Aug/2008:06:01:59 +0100] "GET /2008/02/06/ HTTP/1.0" 200 33373 "-" "ISC Systems iRc Search 2.1"
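A burst like this can also be picked out without eyeballing the log. The following is only a minimal sketch, assuming the log is in Apache's "combined" format as in the lines above. The function names and the thresholds (5 requests within 60 seconds) are my own hypothetical choices for illustration and are not part of AntiLeech or of any particular tool.

```python
import re
from datetime import datetime

# Pattern for Apache "combined" log lines (assumed format, matching the sample above).
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # Month names are parsed with the current locale; English is assumed here.
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    return entry


def busy_clients(lines, min_requests=5, window_seconds=60):
    """List (ip, agent) pairs with at least min_requests hits inside one window.

    Both thresholds are hypothetical defaults; tune them for your own traffic.
    """
    hits = {}
    for line in lines:
        entry = parse_line(line)
        if entry is not None:
            hits.setdefault((entry["ip"], entry["agent"]), []).append(entry["time"])

    flagged = []
    for client, stamps in hits.items():
        stamps.sort()
        start = 0
        for end in range(len(stamps)):
            # Shrink the window from the left until it spans window_seconds or less.
            while (stamps[end] - stamps[start]).total_seconds() > window_seconds:
                start += 1
            if end - start + 1 >= min_requests:
                flagged.append(client)
                break
    return flagged
```

Called with the lines of an access log, for example `busy_clients(open("access.log"))` where the file name is a placeholder, it returns the IP address and user agent of every client that crossed the threshold. Legitimate search engine crawlers can trip it too, so the result is a list of candidates to look up, not a verdict.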
The “ISC Systems iRc Search 2.1” user agent caught my interest, so I did a little research with Google. As I suspected, it seems that this user agent belongs to the web crawler of an address harvester used by spammers. I already use the AntiLeech plugin to battle content thieves and the like, so I added the user agent to its blacklist.
But how can I tell whether AntiLeech is actually working?