Search Engine Forums
Helping to make the Web - Since 1998
Forum Index · Search Engine Forums · SEF Community & Networking · Professionals Corner · Robots Exclusion List
 
Moderator(s): yellowwing

harjitsingh
Joined: Oct 21, 2005
# Posts: 14

Posted: 10/21/2005 11:16 pm

Hello there,

In robots.txt, which is generally uploaded to the root folder (www.mydomain.com/robots.txt), how can one exclude certain directories (/images, /admin, etc.) without letting anybody who reads the robots.txt file see which directories are excluded? It might sound funny, but I wanted to know whether some other method exists.

Thanks
HarRy



lizardz
Joined: Nov 12, 2004
# Posts: 1394

Posted: 10/22/2005 03:42 pm

Simple, you can't.

It's a text file, it sits there, search bots request it, they read it.

Theoretically, you could generate it dynamically based on a check of the requesting IP range or something, and serve the real version only to search bots. But there's no guarantee your pages wouldn't get spidered in that case: if a bot comes in from an IP range you didn't have listed, it would get the robots.txt without the blocks.
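For anyone curious what that IP check might look like, here is a rough Python sketch, not a recommendation. The function name `robots_txt_for`, the two file bodies, and the network ranges are all made up for illustration (the ranges are reserved documentation addresses, not real crawler ranges), and wiring it into a web server or CMS is left out.

```python
import ipaddress

# Hypothetical "search bot" ranges. These are reserved documentation
# networks, NOT real crawler addresses; a real list would have to be
# collected and kept up to date by hand.
TRUSTED_BOT_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

# Version with the real blocks, shown only to listed ranges.
FULL_ROBOTS = "User-agent: *\nDisallow: /images/\nDisallow: /admin/\n"

# Version everyone else sees: no directories are named.
PUBLIC_ROBOTS = "User-agent: *\nDisallow:\n"


def robots_txt_for(remote_addr: str) -> str:
    """Return the robots.txt body to serve to the given client IP."""
    addr = ipaddress.ip_address(remote_addr)
    if any(addr in net for net in TRUSTED_BOT_NETWORKS):
        return FULL_ROBOTS
    return PUBLIC_ROBOTS


print(robots_txt_for("192.0.2.15"))   # listed range: gets the blocks
print(robots_txt_for("203.0.113.9"))  # unlisted range: gets no blocks
```

The second call shows the weakness: a bot arriving from a range that is not in the list is told nothing is disallowed, so it is free to spider the directories you meant to block.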

So the practical answer is: if you don't want anyone to be able to see a blocked part of your site, just don't let them in, and don't link to it from the main site. That's how I do it when I don't want a part of my site indexed at all.



harjitsingh
Joined: Oct 21, 2005
# Posts: 14

Posted: 10/23/2005 09:55 pm

Thank you for your suggestion.

Since the site is a dynamic one maintained by a CMS, is there a possibility that robots will trace it looking at the backlinks to the .htm or .php pages?

Thanks HarRy



lizardz
Joined: Nov 12, 2004
# Posts: 1394

Posted: 10/24/2005 12:44 pm

" is there a possibility that robots will trace it looking at the backlinks to the .htm or .php pages"

If I understood this question I might be able to answer it. However, in general, if you can't do programming and you are running a CMS, then what you get is what you get; you can't change it. If you can do programming and can change components, then you can get anything you want, within reason of course.

Robots will follow any link to any page not blocked in robots.txt, so if a link exists and is not blocked, then the robot will at some point follow it.
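To see how a well-behaved robot would decide which links it may follow, here is a small sketch using Python's standard `urllib.robotparser`. The rules, the paths, and the `ExampleBot` user agent are assumptions for illustration, not taken from anyone's real site.

```python
from urllib.robotparser import RobotFileParser

# Example rules only; a real robot would download these from /robots.txt.
rules = """\
User-agent: *
Disallow: /admin/
Disallow: /images/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())


def allowed(path: str, agent: str = "ExampleBot") -> bool:
    """True if the rules above let this (hypothetical) agent fetch the path."""
    return parser.can_fetch(agent, path)


for path in ("/admin/login.php", "/images/logo.gif", "/products/page.htm"):
    print(path, "->", "may fetch" if allowed(path) else "blocked")
```

Only the first two paths are blocked here. Anything else that is linked somewhere is fair game, which is the point made above.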



harjitsingh
Joined: Oct 21, 2005
# Posts: 14

Posted: 10/26/2005 02:10 am

I was concerned about the exclusion list because only the homepage, www.mydomain.com, was cached and not the inside pages linked from it.

I have index,follow for the robots meta tag, but the inside pages are still not getting crawled or cached.

Also, when you check for links to the website, it should show the inside pages, but it doesn't.

Can I get some help or guidance on this?

thanks
harRy



Logan
Moderator
Joined: Aug 14, 2002
# Posts: 3749

Posted: 11/07/2005 06:15 am

Hi harRy, based on your comments I don't think the robots.txt is a factor. There are many other reasons internal pages may not be indexed. The two most common I can think of are:

1) Lack of link/popularity to the url
2) A URL with multiple parameters (e.g. mypage.php?x=1&name=product&category=1234&anotherparameter=sfruokcn)
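For the second point, a quick way to spot parameter-heavy URLs is to count their query parameters. This is only a sketch using Python's standard `urllib.parse`; the helper name `query_parameter_count` is made up, and no particular cutoff is implied, since what a given search engine tolerates is not something this snippet can tell you.

```python
from urllib.parse import parse_qsl, urlsplit


def query_parameter_count(url: str) -> int:
    """Count the name=value pairs in the query string of a URL."""
    query = urlsplit(url).query
    return len(parse_qsl(query, keep_blank_values=True))


url = "mypage.php?x=1&name=product&category=1234&anotherparameter=sfruokcn"
print(query_parameter_count(url))
```

Running it over the internal links a CMS generates would show which pages carry the longest query strings.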

Tough to say without reviewing. Can you reference the site within your profile for those interested in helping?



harjitsingh
Joined: Oct 21, 2005
# Posts: 14

Posted: 11/08/2005 04:54 am

Here is the website I am talking about
((url removed--put in profile only))

[ Message was edited by: bhartzer 11/25/2005 01:09 pm ]




 

© 1995 - 2006  ·  iWeb, Inc  ·  DBA JimWorld Productions