"Sparse" output for search engine crawlers?
cepheid
Senior Member
Posts: 516
Re: "Sparse" output for search engine crawlers?
Reply #5 - Jun 15th, 2009 at 9:14am
deti wrote on Jun 15th, 2009 at 9:08am:
There is so much JS inside YaBB that this Button would not make a difference.

From what I've seen, there's not that much JS, and the JS that is there is primarily used to enhance the user experience (e.g. advanced editing features in the Reply box, automatically selecting certain checkboxes, etc.) rather than to provide basic functionality.  Using JS for enhancing features is totally fine, because those without JS can still use every feature, just not necessarily to their fullest extent.  On the other hand, using JS to provide basic functionality (e.g. email) is, IMHO, not a good thing because it cripples functionality for those users who don't enable JS.

That's why I think the JS should be limited... it's totally fine to use it as an enhancement, but I don't think that basic functionality (like email) should rely on it.

deti wrote on Jun 15th, 2009 at 9:08am:
I don't know anyone who doesn't use JS.

I actually know plenty of people who don't use JS, especially those who run Windows (because of various security exploits through JS).  They enable it manually when it's absolutely necessary, but prefer to leave it off.

deti wrote on Jun 15th, 2009 at 9:08am:
If you make me a list I will remove them one by one.

Will do.  Look for it soon.  Although... I don't think the links should be turned off for human guests, because getting the error encourages them to register.  The links should only be turned off for search engines, IMHO.
« Last Edit: Jun 15th, 2009 at 9:17am by cepheid »  
deti
Legacy Dev Team
Development Team
Posts: 2,650
Location: Prien am Chiemsee, Germany
Re: "Sparse" output for search engine crawlers?
Reply #4 - Jun 15th, 2009 at 9:08am
Hmmm, I cannot share your concerns about JS. There is so much JS inside YaBB that this Button would not make a difference. And to be honest, I don't know anyone who doesn't use JS.


cepheid wrote on Jun 15th, 2009 at 8:23am:
Any link that takes you to a user's profile, i.e. action=viewprofile.

If you make me a list I will remove them one by one.
  

Was immer Du tun kannst
oder erträumst tun zu können,
beginne es.
Kühnheit besitzt Genie,
Macht und magische Kraft.
Beginne es jetzt.
Whatever you can do
or dream you can,
begin it.
Boldness has genius,
power and magic in it.
Begin it now.
J. W. Goethe
cepheid
Re: "Sparse" output for search engine crawlers?
Reply #3 - Jun 15th, 2009 at 8:23am
deti wrote on Jun 15th, 2009 at 7:58am:
For the print-page we could make a JS link like: <a href="javascript:NoFollow('action=print');">

I would prefer to avoid JS whenever possible, because some people browse without JS, and we should not limit features only to people using JS if at all possible.  A printer-friendly page should be available to people even if they don't use JS.

By the same token, I think that the email links should also not be JS, but regular links... since even people without JS may want to email a user.

Rather than using JS more often, I think it would be better to sparsify the output for search engines, because that maximizes user-friendliness while also maximizing SEO.  It's more work, but the end result is better, IMHO.

deti wrote on Jun 15th, 2009 at 7:58am:
Which profile links are you talking about?

Any link that takes you to a user's profile, i.e. action=viewprofile.  Those are accessible from pretty much every page, be it a Board listing, a thread, search results, etc.  Those links are available even for guests, although guests then get an error "Sorry, this service is for registered members only!"  This error is useful for human guests (it encourages them to register), but not for search engines.
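
One way to keep the link (and its "register first" error) for human guests while hiding it from crawlers is to render the author name as plain text when the visitor looks like a search engine. This is only a sketch in Python rather than YaBB's Perl. The crawler list, the render_author helper, and the "username" URL parameter are assumptions for illustration, not taken from YaBB.

```python
import html
import re
import urllib.parse

# Assumed User-Agent fragments; a real forum would maintain its own list.
CRAWLER_RE = re.compile(r"googlebot|slurp|msnbot|teoma", re.IGNORECASE)


def render_author(username, user_agent):
    """Return the author name as a profile link, or as plain text for crawlers."""
    safe_name = html.escape(username)
    if CRAWLER_RE.search(user_agent or ""):
        # Crawlers get no link, so they never reach the members-only error page.
        return safe_name
    # Hypothetical URL layout; only action=viewprofile comes from the thread.
    return '<a href="YaBB.pl?action=viewprofile;username={0}">{1}</a>'.format(
        urllib.parse.quote(username), safe_name
    )
```

Human guests would still see the link and the registration prompt behind it. User-Agent sniffing is a heuristic, so a crawler that hides its identity would still get the full output.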
« Last Edit: Jun 15th, 2009 at 8:24am by cepheid »  
deti
Re: "Sparse" output for search engine crawlers?
Reply #2 - Jun 15th, 2009 at 7:58am
cepheid wrote on Jun 15th, 2009 at 6:34am:
Similarly, none of the "email this member" links should be followable by a search engine, because they're just not useful for the search engine.  Profile links shouldn't be visible, either, since Guests aren't allowed to view profiles - the error message is useful for real humans, but not for search engines.

Email links use JS to keep search engines from collecting email addresses.

For the print-page we could make a JS link like: <a href="javascript:NoFollow('action=print');">

Which profile links are you talking about? I found only those after the By: .... They should be shown without a link for Guests. Thanks for the advice!
  

LoneWebSurfer
Past Team Members
Posts: 1,279
Re: "Sparse" output for search engine crawlers?
Reply #1 - Jun 15th, 2009 at 7:23am
Anything that helps with SEO is a good thing.
  

Closed all my sites due to lack of Internet access
cepheid
"Sparse" output for search engine crawlers?
Jun 15th, 2009 at 6:34am
YaBB's normal output is quite rich, because it is optimized for user-friendliness and interaction.  Search engine crawlers currently see this same rich output.  However, much of it is not useful to search engines, so it wastes bandwidth and clutters the results on those engines.  For example, most search engines follow and index the action=print link for each page, which merely duplicates information that has already been downloaded.  A user searching on an engine (e.g. Google) probably wants the rich output that a normal user would see rather than the text-only printer-friendly version, yet both are shown as results.

Similarly, none of the "email this member" links should be followable by a search engine, because they're just not useful for the search engine.  Profile links shouldn't be visible, either, since Guests aren't allowed to view profiles - the error message is useful for real humans, but not for search engines.

Therefore, I think it would be useful to have a "sparse output" mode for search engine crawlers, so they won't have to see all the extra output that would otherwise clutter the search results and eat up bandwidth for nothing.  The sparse output would include all the regular thread text and images, but would omit any links that are "useless" for an engine, e.g. action=print, action=mailto, etc.
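
A rough sketch of what such a sparse mode could look like, written in Python for brevity rather than YaBB's own Perl. The crawler name list, the function names, and the example URLs are all assumptions made up for illustration. Only the action names (print, mailto, viewprofile) come from this thread.

```python
import re

# Assumed User-Agent fragments; a real forum would maintain its own list.
CRAWLER_RE = re.compile(r"googlebot|slurp|msnbot|teoma", re.IGNORECASE)

# Actions named in this thread as useless to a search engine.
SPARSE_HIDDEN_ACTIONS = {"print", "mailto", "viewprofile"}


def is_crawler(user_agent):
    """Guess from the User-Agent header whether the visitor is a crawler."""
    return bool(CRAWLER_RE.search(user_agent or ""))


def link_action(url):
    """Return the value of the action= parameter, or None if there is none."""
    match = re.search(r"[?;&]action=(\w+)", url)
    return match.group(1) if match else None


def links_to_render(links, user_agent):
    """Keep every link for humans; drop the hidden actions for crawlers."""
    if not is_crawler(user_agent):
        return list(links)
    return [url for url in links if link_action(url) not in SPARSE_HIDDEN_ACTIONS]


if __name__ == "__main__":
    # Hypothetical links as they might appear on a thread page.
    page_links = [
        "YaBB.pl?num=1",
        "YaBB.pl?action=print;num=1",
        "YaBB.pl?action=mailto",
    ]
    print(links_to_render(page_links, "Mozilla/5.0 (compatible; Googlebot/2.1)"))
    print(links_to_render(page_links, "Mozilla/5.0 (Windows; U) Firefox/3.0"))
```

The thread text and images would be emitted the same way for everyone. Only the link list differs, so human visitors lose nothing.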

Yes, it's certainly possible to put rel="nofollow" in all of these links, but that actually doesn't do the same thing, because many search engines just ignore that (including Yahoo, Ask, and other major engines).  Instead, it's better to just omit them when the page is displayed to a search engine.

(I actually have a robots.txt that excludes all action=* links from being followed, but again, many search engines don't obey this.)
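
For reference, here is a sketch of how a rule like that behaves for a crawler that does honor robots.txt, checked with Python's standard urllib.robotparser. The robots.txt content, the /cgi-bin/yabb2/YaBB.pl path, and the example.com host are assumptions, not my actual file. The rule is a plain prefix match, so it only catches URLs where action= is the first query parameter.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the script path is an assumed install location.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/yabb2/YaBB.pl?action=
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Assumed base URL of the forum script.
BASE = "http://example.com/cgi-bin/yabb2/YaBB.pl"


def allowed(query):
    """Report whether a compliant crawler may fetch the forum URL with this query."""
    return parser.can_fetch("ExampleBot", BASE + "?" + query)


if __name__ == "__main__":
    print("thread page:", allowed("num=1"))
    print("print page:", allowed("action=print;num=1"))
```

This only shows what a well-behaved crawler would do. It does nothing about engines that ignore robots.txt, which is the reason for omitting the links on the server side.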

Thoughts?
  