Blocking spiderbots from profile pages (Read 8,447 times)
Dandello
YaBB Administrator
YaBB Next Team
Operations Team
Beta Testers
Support Team
Posts: 2,383
Location: Earth

YaBB 2.6.1
Blocking spiderbots from profile pages
Sep 8th, 2011 at 10:51pm
In my robots.txt I had:
Code
User-agent: *
Disallow: /pix/
Disallow: /aspects/pix/
Disallow: /cgi-bin/
 



And this has done an adequate job of keeping spiders out of my scripts. But now I'd like to allow Google and the others to spider the forum but ignore the profile pages - which means I have to allow the cgi-bin to be spidered.

Like any good Perl Monger, I have at least two ways I can see of doing this - figure out how to ban them using robots.txt or put
Code
<!-- robots content="noindex" --><!-- /robots --> 

around the profile links.

Has anyone been successful at this? Is there already security in place so I don't need to worry?

And yes - I know that only good robots actually read things addressed to them and obey. ;)
  

If you only have one solution to a problem you're not trying hard enough!
 
Tony Barnett
YaBB Newcomer
Posts: 26
Location: Mansfield, Victoria, Australia

YaBB 2.6.0
Re: Blocking spiderbots from profile pages
Reply #1 - Sep 9th, 2011 at 2:02am
For the following suggestion, I am assuming the Members list is available to all visitors to your forum and that this is where the links through to your profile pages are.

If so, you can set the forum so that profiles aren't available via the members list to anyone who isn't logged in, i.e. a spider won't be able to follow a link to a profile because it can't log in!

To do this, simply set "Select the Member Groups allowed to view the Member List?" to "All Members", "Mods, GMods & Admins" or "GMods & Admins".

i.e. not "Guests and All Members".

You will find this setting at the bottom of the Forums Settings --> Members page in the Admin Center.

I hope this helps.
  

 
Dandello
Re: Blocking spiderbots from profile pages
Reply #2 - Sep 9th, 2011 at 4:17am
I knew there had to be an easy solution. Thanks.
  

 
Tony Barnett
Re: Blocking spiderbots from profile pages
Reply #3 - Sep 9th, 2011 at 5:32am
Glad to help!
  

 
Dandello
Re: Blocking spiderbots from profile pages
Reply #4 - Sep 10th, 2011 at 4:23am
Just adding this so other people can benefit. This is part of my newest robots.txt:
Code
User-agent: *
Disallow: /cgi-bin/yabb2/YaBB.pl?action
 



Disallowing URLs that start with the 'action' parameter should keep things like login and similar actions from being spidered. It works when I run my own link-checker/spider emulator on it. ;)
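
For anyone who wants to sanity-check a rule like this without a full spider, the following is a minimal Python sketch of the plain prefix matching used by the original robots.txt convention. It is not the link-checker mentioned above; the function name is_disallowed and the sample URLs are assumptions made up for illustration, and real crawlers may layer wildcard handling on top of this.

```python
# Minimal sketch: classic robots.txt matching is a plain prefix test
# on the path-plus-query part of the URL. All names are hypothetical.

def is_disallowed(url_path, disallow_rules):
    """Return True if url_path starts with any non-empty Disallow value."""
    return any(bool(rule) and url_path.startswith(rule) for rule in disallow_rules)


rules = ["/cgi-bin/yabb2/YaBB.pl?action"]

# Hypothetical sample URLs, shaped like the paths used in this thread.
samples = [
    "/cgi-bin/yabb2/YaBB.pl?action=login",
    "/cgi-bin/yabb2/YaBB.pl?num=1315540272",
    "/cgi-bin/yabb2/YaBB.pl",
]

for path in samples:
    verdict = "blocked" if is_disallowed(path, rules) else "allowed"
    print(verdict, path)
```

Because this is a prefix match, the rule only catches URLs where action is the first thing after the ?; a URL that puts some other parameter first would not be matched by it.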
  

 
godzaiilat
YaBB Newcomer
Posts: 2
Re: Blocking spiderbots from profile pages
Reply #5 - Dec 15th, 2011 at 9:48am
Shocked Shocked Shocked
  
 
BloodyRue
Junior Member
Posts: 83

None
Re: Blocking spiderbots from profile pages
Reply #6 - Dec 16th, 2011 at 11:44pm
If you are looking for a way to ban spider bots that don't pay attention to the robots.txt file, I have that installed.

I have a series of traps that catch bots behaving badly, auto-ban them in the .htaccess file, write a log of their activity, send me an email, and email their host an abuse letter.

If they parse directories or parse scripts incorrectly, that triggers the ban.

In 10 years of running this on my boards, it has banned at least 1158 bad IP addresses, not counting the ones I deleted for a while to cut down on the emails a bit.
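
The ban step of a trap like the one described above could be sketched as follows. This is not the actual script (it isn't shown in this thread); the function names and the sample address are hypothetical, and the sketch assumes Apache 2.2-style "Deny from" directives in .htaccess. The logging and email steps are left out.

```python
import ipaddress


def deny_line(ip):
    """Build an Apache 2.2-style deny directive for one validated IP address."""
    addr = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    return f"Deny from {addr}"


def ban(ip, htaccess_text):
    """Return htaccess_text with a deny line for ip appended, unless already present."""
    line = deny_line(ip)
    existing = htaccess_text.splitlines()
    if line in existing:
        return htaccess_text  # already banned, leave the file text unchanged
    return "\n".join(existing + [line]) + "\n"


# Hypothetical starting .htaccess content and offending address.
current = "Order Allow,Deny\nAllow from all\n"
updated = ban("192.0.2.7", current)
print(updated)
```

Working on the file contents as a string keeps the sketch easy to test; a real trap would also need to read and write the .htaccess file safely and validate where the IP address came from.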
  

   
 
Dandello
Re: Blocking spiderbots from profile pages
Reply #7 - Dec 17th, 2011 at 4:53am
@BloodyRue: I'm currently not having any problems - remembering to only allow members to access the profiles works pretty well for what I wanted - but I may contact you later.
  

 
godzaiilat
Re: Blocking spiderbots from profile pages
Reply #8 - Dec 22nd, 2011 at 1:50pm
Good knowledge, thanks for posting it.
  
 
JonB
YaBB Administrator
YaBB Next Team
Operations Team
Beta Testers
Support Team
Posts: 3,913
Location: Land of the Blazing Sun!

YaBB 2.6.1
Re: Blocking spiderbots from profile pages
Reply #9 - Dec 22nd, 2011 at 2:16pm
I am moving this to Anti-Spam so it sees more eyes.

Cool
  

I find your lack of faith disturbing.