Posted By: david68 ()
Posted On: 08/07/2007 06:51 am
|
I want better rankings and more traffic, but what bots actually help me and aren't just out there for data mining to sell information?
If they ignore robots.txt (everyone is disallowed until I allow them), I block them. A new bot showed up: it requested robots.txt and hasn't viewed a page, so it's obeying it, but I googled it and some people call it a data miner. It's called BlogPulseLive - the hostname comes back as intelliseek dot com, which redirects to nielsenbuzzmetrics dot com, which openly admits it's a data mining company. So I assume it doesn't help me at all?
Is there a list of good bots out there? Some blog bots help get traffic.
|
|
Posted By: Hampstead ()
Posted On: 08/07/2007 09:09 am
|
There is no easy way of banning data mining or "bad" bots. They simply won't adhere to robots.txt or the robots meta tag. Indeed, they could simply identify themselves as browsers.
If you're really worried about it, you could try to find a list of IP addresses associated with the various bots and ban them, but there is very little point.
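For anyone who does want to try the IP-ban route, here is a minimal sketch of the check a server-side script could run on each request. The networks listed are placeholders from the documentation-only address ranges, not real bot addresses, and the names `BLOCKED_NETWORKS` and `is_blocked` are made up for illustration. As noted above, the list would need regular upkeep, and a bot can simply move to another address.

```python
import ipaddress

# Hypothetical block list. These are documentation-only ranges (RFC 5737),
# used here as stand-ins; replace them with addresses you have verified.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),      # a whole range
    ipaddress.ip_network("198.51.100.17/32"),  # a single address
]


def is_blocked(remote_addr: str) -> bool:
    """Return True if the visitor's IP falls inside any blocked network."""
    addr = ipaddress.ip_address(remote_addr)
    return any(addr in network for network in BLOCKED_NETWORKS)


if __name__ == "__main__":
    for ip in ("192.0.2.44", "198.51.100.17", "203.0.113.9"):
        print(ip, "blocked" if is_blocked(ip) else "allowed")
```

In practice the same list is usually enforced in the web server or firewall configuration rather than in application code; the sketch only shows the matching logic.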
By the way, all bots by definition are there for data mining purposes.
|
|
Posted By: david68 ()
Posted On: 08/07/2007 09:20 am
|
"By the way, all bots by definition are there for data mining purposes"
Yes, but google/yahoo/msn at least help the cause in the process.
The question was: is there a list of GOOD bots that I could specifically "allow" in robots.txt? I guess I should have phrased it better.
|
|
Posted By: Hampstead ()
Posted On: 08/07/2007 10:15 am
|
It is quite normal to allow all bots.
|
|
Posted By: david68 ()
Posted On: 08/07/2007 10:32 am
|
Well, maybe I'm quite abnormal, as I'd rather not give permission to companies who sell MY information without my permission and without me gaining anything in return.
|
|
Posted By: g1smd (Moderator)
Posted On: 08/07/2007 03:44 pm
|
A Google search will find a number of sites that have compiled a list of what they consider to be "bad bots". I would use those as a start, making sure to review as many as possible to ensure that the list is still correct.
|
|
Posted By: Hampstead ()
Posted On: 08/07/2007 11:05 pm
|
I understand your problem with these companies, but the "bad bots" will take no notice of your robots.txt or your robots noindex meta tag and will simply crawl your site anyway.
|
|
Posted By: david68 ()
Posted On: 08/08/2007 05:36 am
|
Hmph. I don't care about "bad bots" - I asked for a list of "GOOD" bots: bots which OBEY robots.txt and which I should allow. BLOGPULSELIVE keeps asking for permission, and I was wondering if I should allow them or if they won't really help me get rankings. I did google, but honestly the search engines are getting pretty crappy - bad results or no results.
I know most people allow everyone, but honestly, why waste my bandwidth on something that won't benefit me?
|
|
Posted By: dudibob ()
Posted On: 08/08/2007 06:08 am
|
Banning bad bots will require a fair bit of server scripting to ban their IPs so they can't get in, and new bad bots appear every day, so you will have to update your script fairly often :s
Personally I'd say don't worry about them; just look out for msnbot, Slurp (Yahoo), Googlebot and Ask's bot (I forget what it's called).
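A robots.txt that allows only the crawlers named above and disallows everyone else could look roughly like the sketch below. It uses Python's standard `urllib.robotparser` to check how a well-behaved bot would read the rules. The user-agent tokens `Googlebot`, `Slurp` and `msnbot` are taken as assumptions from this thread, the helper `is_allowed` is made up for illustration, and Ask's crawler is left out because its name isn't given here. This only affects bots that choose to obey robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Allow-list style robots.txt: an empty "Disallow:" means "nothing is
# disallowed" for the named bots; every other bot falls under "*".
ROBOTS_TXT = """\
User-agent: Googlebot
User-agent: Slurp
User-agent: msnbot
Disallow:

User-agent: *
Disallow: /
"""


def is_allowed(user_agent: str, path: str = "/") -> bool:
    """Return True if a rule-obeying bot with this name may fetch the path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for bot in ("Googlebot", "Slurp", "msnbot", "BlogPulseLive"):
        print(bot, "allowed" if is_allowed(bot) else "disallowed")
```

With these rules a bot such as BlogPulseLive stays disallowed until its own `User-agent` line is added to the first record.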
|
|
Posted By: david68 ()
Posted On: 08/08/2007 06:53 am
|
OMG - forgive the rant/flame - but I never asked about BAD BOTS.
I asked: Is BlogPulseLive a GOOD bot?
I asked: Which ones are considered GOOD (like Slurp, Google, etc.) that I should specifically allow in robots.txt?
Sheesh.
Moderator - please LOCK this topic. Better yet, move it to the xVault.
|
|
Posted By: Hampstead ()
Posted On: 08/08/2007 07:59 am
|
Allow all bots by not using a robots.txt file at all. The "good bots" will then have access to your site without you having to worry about it.
It's quite normal to do this.
|
|
Posted By: david68 ()
Posted On: 08/08/2007 08:06 am
|
Thanks for your comments - even though I don't agree with them and didn't find them useful.
|
|
Posted By: Hampstead ()
Posted On: 08/08/2007 08:47 am
|
|
|
|