Ban Spiders by User Agent
Mod Version: 3.1.2, by Simon Lloyd (Coder)
Developer Last Online: Jan 2020

vB Version: 4.x.x Rating: (64 votes - 4.86 average) Installs: 486
Released: 09 Aug 2011 Last Update: 18 Dec 2014 Downloads: 1928
Supported Uses Plugins  

What this mod does
With this mod you can enter User Agents to watch or ban. You can also receive emails or have an Output.txt created and updated with the time and date of visits. It doesn't just have to be spiders: you can watch, log or ban any useragent!

How to install
Simply import the product ban_spider. The mod is active by default, but none of the other options are turned on.

What is a UserAgent?
http://en.wikipedia.org/wiki/User_agent

Understanding a UserAgent string
http://user-agent-string.info/parse

Genuine User Getting Blocked?
http://www.vbulletin.org/forum/showp...&postcount=105

Tools to help
http://whatsmyuseragent.com/SwitchingUserAgents.asp
http://www.botsvsbrowsers.com/SimulateUserAgent.asp

FAQ
http://www.vbulletin.org/forum/showp...&postcount=137

How does it work?
http://www.vbulletin.org/forum/showp...&postcount=381

What's a bot?
http://en.wikipedia.org/wiki/Spambot

How do I ban a bot?
http://www.vbulletin.org/forum/showp...&postcount=318
http://www.vbulletin.org/forum/showp...7&postcount=51

Where's output.txt located?
http://www.vbulletin.org/forum/showp...&postcount=216

Bad bot lists
http://www.vbulletin.org/forum/showp...&postcount=259
http://www.vbulletin.org/forum/showp...&postcount=224
http://www.vbulletin.org/forum/showp...&postcount=281

Tested on vB3.7.x, vB3.8.x and vB4.x.x, but should work on any version.

____________________________________________________________________
Special thanks to:
Lior
KH99
BoP5
for helping me sort out a few issues

...and beta testers

ForceHSS (Special thanks to Force for latest testing)
ozzy47
GreyHost

If you use this please mark as INSTALLED

History
9th June 2011 Original xml added
12th June 2011 Added both email notification and text file logging
22nd June 2011 Version 2.0.0, Added create thread on activity
  1. Added match facility: you can now use something like Yandex and it will match MOZILLA/5.0 (COMPATIBLE; YANDEXBOT/3.0; +HTTP://YANDEX.COM/BOTS)
  2. Added clickable link to visited thread
22nd September 2011 added user redirect url selection
08th October Beta testing started for thread creation.
20th October Beta testing started for emailing.
21st October Beta testing complete Ver 3.0.0 uploaded
29th October minor fix added to cope with empty userid on thread creation
30th October Beta testing automatic redirection to spiders/bots IP
31st October New xml uploaded with automatic redirect to IP
25th November Minor fix for blank forumid fixed
26th November 2011 Fixed version check & create thread Off by default
17th December 2014 Version 3.1.0 uploaded, Hook changed extra logging and statistics added by Ozzy47 (Chris)
18th December 2014 Version 3.1.1 uploaded, prevented spiders being counted when mod turned off.
17th December 2014 Version 3.1.2 uploaded, due to rogue code from another mod
The Bad Bots list is now included in the product
Please prune out all those that you wish to be able to see your site (I suggest you definitely prune out "DA" and "Custo"):

Support will now only be given to those who have this mod marked as INSTALLED
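
The match facility described in the 22nd June 2011 history entry can be pictured as a case-insensitive substring test. The mod's own code is not shown on this page, so the following is only a Python sketch of the idea; the function name `matches_watched_agent` and the watch list are hypothetical.

```python
from typing import Iterable


def matches_watched_agent(user_agent: str, watched: Iterable[str]) -> bool:
    """Return True if any watched fragment occurs in the UA, ignoring case.

    Assumption: the match is a plain case-insensitive substring test.
    """
    ua = user_agent.upper()
    # Skip empty entries so a blank line in the list cannot match everything.
    return any(frag.upper() in ua for frag in watched if frag.strip())


# UA string taken from the history entry above.
ua = "MOZILLA/5.0 (COMPATIBLE; YANDEXBOT/3.0; +HTTP://YANDEX.COM/BOTS)"
print(matches_watched_agent(ua, ["Yandex"]))
```

A short fragment matches more broadly than a full UA string, which is likely why the author suggests pruning very short entries such as "DA" from the bundled list.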


#481
11 Feb 2013, 18:20
fly
Join Date: Oct 2003
Originally Posted by Max Taxable
Amazon AWS is their hosting they sell. And yes they also crawl the web: http://aws.amazon.com/search-engines/

I have it blocked as well, using this Mod.
Did you read that? There is nowhere in that link that says that Amazon themselves crawl websites. Can you even explain why a hosting company would want to catalog data from every website on the internet?

I'm wondering if there is some confusion about what a user agent is and does. The UA is the remote web crawler's way of telling you that it is there cataloging your site. A crawler isn't required to send you a UA at all; it's just considered polite. If someone wanted to, they could send a completely random UA every time or not send one at all.
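
To illustrate that the UA is simply whatever the client chooses to send, here is a minimal Python sketch using the standard library. The helper name `build_request`, the example.com URLs and both UA strings are made up for the illustration, and nothing is actually sent over the network.

```python
import urllib.request


def build_request(url, user_agent=None):
    """Build (but do not send) a request with a caller-chosen User-Agent."""
    req = urllib.request.Request(url)
    if user_agent is not None:
        req.add_header("User-Agent", user_agent)
    # Left unset, urllib's default opener adds its own "Python-urllib/x.y"
    # UA at send time; other clients may send no UA header at all.
    return req


# A crawler that identifies itself politely...
polite = build_request("http://example.com/", "ExampleCrawler/1.0 (+http://example.com/bot)")
# ...and the same client pretending to be an ordinary browser.
disguised = build_request("http://example.com/", "Mozilla/5.0 (Windows NT 6.1) Firefox/19.0")

# urllib stores header names capitalized as "User-agent".
print(polite.get_header("User-agent"))
print(disguised.get_header("User-agent"))
```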

Since Amazon AWS is in the hosting business, they have no need to crawl websites at all. However, this doesn't PREVENT people from buying their own server from Amazon and crawling your website. If someone were to do this, the UA would be whatever they wanted it to be, not some form of "AmazonAWS".

Assuming what you're really trying to do is prevent anyone from buying a server from Amazon and accessing your website, you'll need to find all the IP blocks that AWS owns and block those. However, that is outside the scope of this mod.
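
Blocking by IP range rather than by UA could look roughly like the Python sketch below. The two networks are placeholders from the reserved documentation address space, not real AWS ranges, and the names `BLOCKED_NETWORKS` and `is_blocked` are hypothetical; the real list would have to come from the provider.

```python
import ipaddress

# Placeholder CIDR blocks (reserved documentation ranges), NOT actual AWS ranges.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("192.0.2.0/24", "198.51.100.0/24")
]


def is_blocked(ip: str) -> bool:
    """Return True if the address falls inside any blocked network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BLOCKED_NETWORKS)


print(is_blocked("192.0.2.55"))
print(is_blocked("203.0.113.9"))
```

This check ignores the UA entirely, which is why it would catch a crawler on a rented server no matter what it claims to be.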
#482
11 Feb 2013, 20:58
Simon Lloyd
Join Date: Aug 2008
Real name: Simon
For reference here's what a user agent is and some extra info http://en.wikipedia.org/wiki/User_agent. All this mod is designed to do is stop bots from eating up your bandwidth by redirecting them before any content loads. To be honest you can never stop anyone who is intent on scraping your site from doing so.
__________________
Kind regards,
Simon Microsoft Office Help
My Mods: Find my modifications here
Please do not pm me for support unless i have invited you to!
#483
12 Feb 2013, 00:00
Max Taxable
Join Date: Feb 2011
Originally Posted by fly
Did you read that? There is nowhere in that link that says that Amazon themselves crawl websites. Can you even explain why a hosting company would want to catalog data from every website on the internet?

I'm wondering if there is some confusion about what a user agent is and does. The UA is the remote web crawler's way of telling you that it is there cataloging your site. A crawler isn't required to send you a UA at all; it's just considered polite. If someone wanted to, they could send a completely random UA every time or not send one at all.

Since Amazon AWS is in the hosting business, they have no need to crawl websites at all. However, this doesn't PREVENT people from buying their own server from Amazon and crawling your website. If someone were to do this, the UA would be whatever they wanted it to be, not some form of "AmazonAWS".

Assuming what you're really trying to do is prevent anyone from buying a server from Amazon and accessing your website, you'll need to find all the IP blocks that AWS owns and block those. However, that is outside the scope of this mod.
The "amazonaws" crawlers have that designation in their UA string. Anything else coming from Amazon has it in its host description.

The rest of your missive, I am well aware of.
#484
12 Feb 2013, 01:55
fly
Join Date: Oct 2003
ok.
#485
03 Mar 2013, 02:39
Inspector G
Join Date: Dec 2012
I have a confusing question...
Ok I have a very small member site...like 24 members...
So when I noticed I had 35 users online most of the time and I started seeing more and more baidu spiders
I decided to do something about it...

I installed this mod.
almost instantly ...well within say 3 hours my users online soared to well over 150 on busy times like Now...tonight.
I had
Most users ever online was 247, 1 Day Ago at 12:58 AM.

With only one new account created, and maybe me or one other registered user online...

My question is this. what happened when I installed this mod to make such a drastic change in the users on my site and why?

I do not understand this and I read that the server load increases...
I find it hard to believe that anyone is finding my site via a search engine since it is a brand new .cc name and it has only been online for two months now...

Is there something about pushing away Baidu that enables more sites to come, or spam bots attempting to register and whatnot? Many are in areas where there would not be a normal user.

I see many attempts at registering and yet no more new users... so I believe those are bots locking...

Please advise...
#486
03 Mar 2013, 04:42
Simon Lloyd
Join Date: Aug 2008
Real name: Simon
What's happening (and you'll probably find this) is that, because Baidu can't get in with the spiders/IPs they were using, they are now trying a rotation of other IPs and bots. I use this mod myself, although I don't ban the bots, as I monitor their visits to further enhance any mod I make against them. I currently have 236 Baidu bots (and 140 other bots/search engines) at my site.

With the mod in place and redirection working, you'll find that the bots you have banned will slowly drop off as they all get the message of the 301 permanent redirect to wherever you've decided to send them. Your server load will lessen and things will be more normal.
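
As a rough picture of the redirect behaviour described here, the Python sketch below decides on a response from the UA alone, before any page content would be generated. It is not the mod's actual code; `BANNED_FRAGMENTS`, `REDIRECT_URL` and `respond` are hypothetical names, and the redirect target is a placeholder.

```python
from typing import Dict, Optional, Tuple

BANNED_FRAGMENTS = ["BAIDU"]          # hypothetical ban list, upper-cased
REDIRECT_URL = "http://example.com/"  # placeholder redirect target


def respond(user_agent: Optional[str]) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for a request, based only on its UA."""
    ua = (user_agent or "").upper()
    if any(fragment in ua for fragment in BANNED_FRAGMENTS):
        # 301 Moved Permanently: the client is told to go elsewhere for good.
        return 301, {"Location": REDIRECT_URL}
    return 200, {}


print(respond("Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"))
print(respond("Mozilla/5.0 (Windows NT 6.1) Firefox/19.0"))
```

Because the decision is made before the forum page is built, a banned bot costs only a short header response instead of a full page load.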
#487
03 Mar 2013, 04:44
Simon Lloyd
Join Date: Aug 2008
Real name: Simon
Also, do you have your robots.txt set up correctly to stop the search engines or bots that obey robots.txt from indexing pages on your site that they shouldn't, like register.php, members.php, etc.?
#488
03 Mar 2013, 04:59
Inspector G
Join Date: Dec 2012
I did not understand how to do the text part, since I am what I even call very green in this aspect of vBulletin...
so I just installed the mod...
I can wait and see if it drops off and report back...
Thanks for the help in understanding...
#489
03 Mar 2013, 06:34
Simon Lloyd
Join Date: Aug 2008
Real name: Simon
OK, what you need to do is upload the attached file to your forum root. If your forum is at this level, www.mysite.com/forums, then you can just upload it to that folder. However, if your forum is at this level, www.mysite.com/, then edit the attached file to remove /forums first.

You can add any page or file to robots.txt that you wish; just follow the same structure.
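
The attached file itself is not reproduced on this page, so the snippet below only sketches what such a structure typically looks like and how a well-behaved bot would read it, using Python's standard `urllib.robotparser`. The two Disallow lines are assumptions based on the pages named in this thread.

```python
from urllib.robotparser import RobotFileParser

# Assumed contents, modelled on the pages mentioned in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /forums/register.php
Disallow: /forums/albums.php
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A bot that obeys robots.txt checks each URL before fetching it.
print(parser.can_fetch("AnyBot", "http://www.mysite.com/forums/register.php"))
print(parser.can_fetch("AnyBot", "http://www.mysite.com/forums/index.php"))
```

Only bots that choose to honour robots.txt are affected; anything that ignores the file still has to be handled by the mod.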
Attached Files
File Type: txt robots.txt (1.4 KB, 24 views)
#490
03 Mar 2013, 06:56
Inspector G
Join Date: Dec 2012
Well thanks Simon...
Thats really nice...
I will do so immediately.
Nice to see someone really help out the Noob...lol
Thanks again I appreciate this very much...
I will report back.
#491
03 Mar 2013, 06:59
Inspector G
Join Date: Dec 2012
So I think what you are telling me is this...
Since my site forum is at root level to edit as follows...
This...Disallow: /forums/albums.php
to This...Disallow: /albums.php
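
For a file with many entries, the same edit could be scripted. This is a small Python sketch under the assumption that every rule uses the /forums prefix; the function name `strip_forum_prefix` is hypothetical.

```python
def strip_forum_prefix(robots_txt: str, prefix: str = "/forums") -> str:
    """Remove a leading folder prefix from Allow/Disallow paths."""
    out = []
    for line in robots_txt.splitlines():
        field, sep, value = line.partition(":")
        path = value.strip()
        is_rule = field.strip().lower() in ("allow", "disallow")
        if sep and is_rule and path.startswith(prefix + "/"):
            # Keep the field name, drop only the folder prefix from the path.
            line = f"{field.strip()}: {path[len(prefix):]}"
        out.append(line)
    return "\n".join(out)


print(strip_forum_prefix("User-agent: *\nDisallow: /forums/albums.php"))
```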
#492
03 Mar 2013, 08:40
Simon Lloyd
Join Date: Aug 2008
Real name: Simon
Yes, if your forum isn't in a folder but simply "on your server", so you don't need to access a folder to get to it, then that's correct!
#493
31 Mar 2013, 18:37
dog-tag
Join Date: Jan 2012
After it had been installed for only 10 minutes, I've seen a 20% drop in server load already. I was already blocking them with .htaccess but they were still getting in. According to AWStats, bots have been hitting my server MILLIONS of times per month.

Thank you very much from the bottom of my heart, you're very talented!
#494
31 Mar 2013, 19:44
Simon Lloyd
Join Date: Aug 2008
Real name: Simon
You're welcome. Don't forget to remove them from .htaccess now, as they will be adding load just being there.
#495
01 Apr 2013, 22:21
datoneer
Join Date: Jul 2011
Thank you good mod