Forum Index · Search Engine Forums · SEF Community & Networking · Professionals Corner · Blocking Bots
 
Moderator(s): yellowwing

elbst23pitt
Joined: Mar 14, 2005
# Posts: 2

Posted: 03/14/2005 11:58 pm

I recently got this message from my dedicated web host:

You'll find that Google puts about 300 hits a week in your logfile, Yahoo puts about 32,000 hits, and MSN puts about 120,000 hits on it. I'm using the samples below that I place in /tmp/yaho

Basically, the major engines are sending bots to review my sites and it is causing my internal hit tracker to be way off.

I want to stop these bots from coming into my sites, but will that negatively affect my search engine rankings? Is there any way to stop them from coming without my rankings being affected?

Finally, how can I allow them to index my site without my internal tracker counting each request as a visit?



g1smd
Moderator
Joined: Jul 28, 2002
# Posts: 10058

Posted: 03/15/2005 01:58 pm

If the bots can't visit your site then they will not index it.

If they don't index it, then they will not include you in their search results.


You should ban other bots that are email scrapers and any others with malicious intent.



elbst23pitt
Joined: Mar 14, 2005
# Posts: 2

Posted: 03/16/2005 09:11 pm


How do I block bots with malicious intent?



g1smd
Moderator
Joined: Jul 28, 2002
# Posts: 10058

Posted: 03/18/2005 12:07 pm

You need to add their user-agent and your suggested permissions to your robots.txt file in the root web folder of your site.
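As a rough illustration of how such rules are read, the sketch below feeds a small robots.txt to Python's standard `urllib.robotparser` and asks whether a given user-agent may fetch a URL. The bot name `BadBot`, the example.com URL and the helper `allowed` are made up for the example. Keep in mind that robots.txt is only a request: well-behaved crawlers honour it, but a scraper with malicious intent can simply ignore it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one named bot, allow everyone else.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow:
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Calling `allowed("BadBot", "http://example.com/page.html")` checks the first record, while any other crawler name falls through to the `*` record.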



yellowwing
Moderator
Joined: May 21, 2002
# Posts: 2524

Posted: 03/20/2005 09:37 am

Isn't there some kind of server status code to indicate that the page content has not changed since the last visit?

That would cut down on the bandwidth the robots use.



yellowwing
Moderator
Joined: May 21, 2002
# Posts: 2524

Posted: 03/20/2005 09:56 am

I found this on the W3.org site:
"304 Not Modified
If the client has performed a conditional GET request and access is allowed, but the document has not been modified, the server SHOULD respond with this status code."

Can you ask your hosting company to implement this?
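For anyone wondering what the server actually has to decide, here is a minimal sketch in Python. It assumes the crawler sends the standard `If-Modified-Since` request header and that the server knows when the page last changed; the function name `status_for` is made up for the example, and a real web server normally does this for static files on its own.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def status_for(last_modified: datetime, if_modified_since: Optional[str]) -> int:
    """Return 304 if the client's cached copy is still current, else 200.

    last_modified must be a timezone-aware datetime.
    """
    if not if_modified_since:
        return 200  # plain GET: send the full page
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable header: ignore it and send the page
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)  # HTTP dates are GMT
    # HTTP dates have one-second resolution, so drop microseconds.
    return 304 if last_modified.replace(microsecond=0) <= since else 200
```

A 304 response carries no body, which is where the bandwidth saving comes from. It still shows up as a line in the access log, so it does not by itself fix a hit counter.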




g1smd
Moderator
Joined: Jul 28, 2002
# Posts: 10058

Posted: 03/20/2005 09:58 am

"If-Modified-Since..."



Dinkar
Moderator
Joined: Aug 12, 2001
# Posts: 4268

Posted: 03/20/2005 10:40 am

If Yahoo and MSN are hitting your site too often, you can slow them down by using 'Crawl-delay' in robots.txt.

Example:

Code:

User-agent: msnbot
Crawl-delay: 10

This will tell MSN to wait 10 seconds before requesting the next document.
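To check what delay a robots.txt actually asks for, Python's standard `urllib.robotparser` can read the `Crawl-delay` value back (Python 3.6 or later). This is only a sketch: the robots.txt text below is an assumed example that uses `msnbot` as MSN's crawler name, and `crawl_delay_for` is a made-up helper. Crawl-delay is a non-standard extension, so not every crawler obeys it.

```python
from urllib.robotparser import RobotFileParser

# Assumed example file: ask msnbot to pause 10 seconds between requests,
# and leave every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: msnbot
Crawl-delay: 10

User-agent: *
Disallow:
"""


def crawl_delay_for(user_agent: str, robots_txt: str = ROBOTS_TXT):
    """Return the Crawl-delay in seconds for user_agent, or None if unset."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)
```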




Dinkar
Moderator
Joined: Aug 12, 2001
# Posts: 4268

Posted: 03/20/2005 10:49 am

Quote: "How do I block bots with malicious intent?"


You have to use an .htaccess file. I don't know much about it, but I have the following code:

Code:

SetEnvIfNoCase User-Agent "{ADD USER AGENT HERE}" keep_out

Add the code to your .htaccess file and replace {ADD USER AGENT HERE} with the name of the malicious user agent. You need to repeat the code for every user agent.

Examples:

SetEnvIfNoCase User-Agent "indy library" keep_out
SetEnvIfNoCase User-Agent "missigua locator" keep_out
SetEnvIfNoCase User-Agent "FndLnk" keep_out
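
For sites that do the check in application code instead, the same case-insensitive matching can be sketched in Python. The three names are the ones from the examples above; `BLOCKED_AGENTS` and `keep_out` are illustrative names, and the names are escaped so they match as plain text (Apache treats the quoted value as a regular expression). Note also that in Apache the `keep_out` variable only blocks anything when an access rule refers to it, for example `Deny from env=keep_out` in the Apache 2.2-style syntax.

```python
import re

# User-agent fragments to refuse, taken from the examples in this post.
BLOCKED_AGENTS = ["indy library", "missigua locator", "FndLnk"]

# One case-insensitive pattern that matches any of the fragments literally.
_BLOCKED = re.compile(
    "|".join(re.escape(name) for name in BLOCKED_AGENTS),
    re.IGNORECASE,
)


def keep_out(user_agent: str) -> bool:
    """Return True if the User-Agent header contains a blocked fragment."""
    return bool(_BLOCKED.search(user_agent or ""))
```

A request handler would call `keep_out()` on the incoming User-Agent header and answer with 403 Forbidden when it returns True. A determined scraper can fake its user agent, so this only stops bots that identify themselves.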



[ Message was edited by: Dinkar 03/20/2005 08:21 pm ]





jsrobinson
Joined: Dec 18, 2004
# Posts: 29

Posted: 03/21/2005 05:01 pm

I think the problem needs to be looked at from a different perspective: why isn't the web log reporting tool taking the bots into account and removing their hits from the usage stats?

I rewrote a significant portion of my web reporting tool specifically to do this, because I did not want to "limit" the search engines' access to the sites I host/run. The User-Agent is easily found in the logs and easily accessible from code (PHP/ASP), so this really should not be a huge technological issue for anyone (but then again, I don't know your situation...).
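
A minimal sketch of that idea, written in Python here rather than PHP/ASP: it assumes the server writes the Apache Combined Log Format (user agent as the last quoted field) and that the crawlers identify themselves with the substrings in `BOT_MARKERS`. Both are assumptions to adjust for your own logs, and `count_hits` is a made-up name.

```python
import re
from collections import Counter

# Assumed substrings for the Google, Yahoo and MSN crawlers.
BOT_MARKERS = ("googlebot", "slurp", "msnbot")

# Combined Log Format tail: "request" status bytes "referer" "user agent"
LOG_LINE = re.compile(
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"\s*$'
)


def count_hits(lines):
    """Count log lines as 'bot', 'visitor' or 'unparsed'."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            counts["unparsed"] += 1
            continue
        agent = match.group("agent").lower()
        is_bot = any(marker in agent for marker in BOT_MARKERS)
        counts["bot" if is_bot else "visitor"] += 1
    return counts
```

Reporting only the "visitor" count keeps the stats honest while the search engines stay free to crawl and index the site.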


 



© 1995 - 2006  ·  iWeb, Inc  ·  DBA JimWorld Productions