How to create a robots.txt file?

Posted By: moverworldwide
Posted On: 2006-Mar-25 06:14

Hello, sorry for my ignorance. I have my site up on Google and Yahoo. Recently I checked with Google Sitemaps, and Google says there is no robots.txt file. I do not know what this is. I read through many forum Q&As, but nowhere is there a guideline for a beginner.

Can anyone help me create a robots.txt file? I understood from forums that I should create one in Notepad and name it robots.txt. But what do I have to put in it for all search engines to crawl all my pages?

Do I have to register with any robots company, or do I just create a robots.txt file and upload it to the root of my site?

How can I check whether the robots.txt file I created is working?
Thanks for any help.



Posted By: Prowler (Staff)
Posted On: 2006-Mar-27 14:35

Robots.txt is a plain text file in the root of your site containing directives for robots. Specifically, the directives tell robots which directories and files are disallowed. All mainstream search engine robots obey the directives, while many rogue robots ignore them and use the file to pick out hidden directories.

You don't strictly need a robots.txt. Either a missing file or a blank robots.txt indicates that the entire site can be crawled at will.

The following is a typical robots.txt:


Code:

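As a rough stand-in, and not necessarily the listing Prowler had in mind, a permissive file is commonly written as a wildcard User-agent line followed by an empty Disallow line. The Python sketch below writes that content to a robots.txt file. The constant ALLOW_ALL, the function name write_robots_txt, and the target directory are all assumptions made for illustration.

```python
from pathlib import Path

# Assumed content: the common "allow everything" form.
# An empty Disallow value means nothing is off limits.
ALLOW_ALL = "User-agent: *\nDisallow:\n"


def write_robots_txt(directory, content=ALLOW_ALL):
    """Write robots.txt into `directory` and return the file's path.

    `directory` is a hypothetical local copy of the site root; the file
    still has to be uploaded to the real root of the site afterwards.
    """
    target = Path(directory) / "robots.txt"
    target.write_text(content, encoding="ascii")
    return target
```

Creating the same two lines by hand in Notepad works just as well. The script only helps if you would rather generate the file as part of an upload routine.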

Here is one online validator:
[link]
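
If you would rather check a file locally, Python's standard library includes urllib.robotparser, which can be fed the text of a robots.txt and asked whether a given robot may fetch a given URL. This is only a minimal sketch: the helper name can_crawl, the robot name "AnyBot", and the example.com URLs are placeholders, and a real crawler may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt, user_agent, url):
    """Return True if `user_agent` may fetch `url` under the given rules.

    `robots_txt` is the full text of a robots.txt file (may be empty).
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Placeholder robot name and URL, used only for illustration.
blank_file_result = can_crawl("", "AnyBot", "http://example.com/page.html")
block_all_result = can_crawl(
    "User-agent: *\nDisallow: /", "AnyBot", "http://example.com/page.html"
)
```

With a blank file the page comes back as crawlable, and with "Disallow: /" it does not. To test the copy already uploaded to your site, the same class offers set_url() and read() to download the file before calling can_fetch().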

For more detailed information about robots.txt than we have space for here (this was the short version for terminally impatient readers), head to this link:
[link]


Posted By: iann2006
Posted On: 2006-Apr-03 05:16

The robots.txt file is a set of instructions for visiting robots (spiders) that index the content of your web site pages. The file must reside in the root directory of your web site. For those spiders that obey the file, it provides a map of what they can and cannot index.

To exclude all robots from the server (do not use this one unless you want no indexing for the entire site!):

User-agent: *
Disallow: /

To exclude all robots from parts of a server:

User-agent: *
Disallow: /private/
Disallow: /images-saved/
Disallow: /images-working/

To exclude a single robot from the server:

User-agent: Named Bot
Disallow: /

To exclude a single robot from parts of a server:

User-agent: Named Bot
Disallow: /private/
Disallow: /images-saved/
Disallow: /images-working/

Note: The asterisk (*), or wildcard, in the User-agent field is a special value meaning "any robot", so it is the only one you need until you fully understand how to set up different User-agents.

If you want to disallow a particular file within a directory, your Disallow: line might look like this one:

Disallow: /private/top-secret-stuff.htm
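
To see how rules like these play out, here is a small Python sketch that uses the standard urllib.robotparser module to list which paths a robot would be refused. The rule set combines two of the patterns above. "ExampleBot" stands in for "Named Bot", and "OtherBot", the helper name blocked_paths, and the sample paths are invented for illustration.

```python
from urllib.robotparser import RobotFileParser

# One named robot is shut out entirely; everyone else is kept out of
# three directories. "ExampleBot" is a placeholder robot name.
RULES = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
Disallow: /images-saved/
Disallow: /images-working/
"""


def blocked_paths(rules, user_agent, paths):
    """Return the subset of `paths` that `user_agent` may not fetch."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, p)]


sample_paths = [
    "/index.html",
    "/private/top-secret-stuff.htm",
    "/images-saved/a.jpg",
]
blocked_for_other = blocked_paths(RULES, "OtherBot", sample_paths)
blocked_for_named = blocked_paths(RULES, "ExampleBot", sample_paths)
```

Here blocked_for_other contains only the two paths under the disallowed directories, while blocked_for_named contains all three, since the named robot is excluded from the whole server. Note that a Disallow: value is matched as a path prefix, so "/private/" covers everything inside that directory.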
