dpeddle
Joined: Eons Ago
# Posts: 269
|
Posted: 2004-Jun-21 21:05
I have a section of TV-based news (numerous interviews/specials on the company in Windows Media format).
You click on the interview you want to see, and you go to a page where you watch an embedded WMV file.
I do not want these pages indexed because there is no real content on the page, just a two-sentence brief on the interview topic.
I want to allow visitors access to these pages/directory from my press section and sitemap, but I don't want robots to index the files. Is it possible to achieve this by using the robots.txt file?
I heard somewhere that you can deny access to files in the robots.txt file, but if you link to them directly on your site, they may get picked up anyway.
|
 |
bhartzer
Staff
Joined: Jun 08, 2000
# Posts: 7042
|
Posted: 2004-Jun-21 21:14
I would definitely deny access to robots via the robots.txt file.
With .htaccess you can password-protect the pages, but I'm not sure that's what you want to do?
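For what it's worth, here is a minimal sketch of the kind of robots.txt rule I mean. It assumes the interview pages live under /press/interviews/ and that the site is www.example.com; both are made up for illustration, as is the "ExampleBot" user agent. The Python only uses the standard library's robots.txt parser to check that the rule blocks the interview pages and nothing else:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body. /press/interviews/ is an assumed path,
# not one taken from the actual site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /press/interviews/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A compliant robot should skip the interview pages...
interview_ok = parser.can_fetch(
    "ExampleBot", "http://www.example.com/press/interviews/ceo.html"
)
# ...but still be allowed to fetch the rest of the press section.
press_ok = parser.can_fetch(
    "ExampleBot", "http://www.example.com/press/index.html"
)
print(interview_ok, press_ok)
```

Keep in mind this only asks well-behaved robots not to fetch those pages; it does nothing to stop visitors from following your links to them.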
|
 |
dpeddle
Joined: Eons Ago
# Posts: 269
|
Posted: 2004-Jun-21 21:34
I don't need to password-protect the pages.
I just want to make sure that
a) my users can watch the news
b) the search engine robots do not index the pages
My concern comes from a post I read a while back claiming that if a page is linked to from a sitemap or internal page but is blocked in the robots.txt file, the robots will still index/analyze the page (which to me kind of defeats the purpose of having a robots.txt file).
|
 |
dpeddle
Joined: Eons Ago
# Posts: 269
|
Posted: 2004-Jun-21 21:55
Also...
Not sure if this is possible, but I would like to find a way to block access to my robots.txt file from regular users.
There are directories in there that I don't want my competitors to see. I know I can password-protect the directories, but I just don't want my competitors knowing how I am handling robots.
|
 |
bhartzer
Staff
Joined: Jun 08, 2000
# Posts: 7042
|
Posted: 2004-Jun-21 21:58
As far as I know, the only way to block your robots.txt from competitors would be to cloak it, and you'd really have to use IP detection in order for it to be effective.
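A rough sketch of what IP detection could look like, assuming you serve robots.txt from a script rather than a static file. The network ranges below are reserved documentation addresses standing in for real crawler IPs (you would need your own maintained list), and the function name and the /press/interviews/ path are hypothetical:

```python
import ipaddress

# Placeholder crawler ranges: reserved documentation networks,
# NOT real search engine addresses.
CRAWLER_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

# What known robots get to see (assumed path).
FULL_ROBOTS = "User-agent: *\nDisallow: /press/interviews/\n"
# What everyone else sees: an empty Disallow, which reveals no directories.
PUBLIC_ROBOTS = "User-agent: *\nDisallow:\n"


def robots_for(remote_addr: str) -> str:
    """Return the robots.txt body to serve to the given client IP."""
    addr = ipaddress.ip_address(remote_addr)
    if any(addr in network for network in CRAWLER_NETWORKS):
        return FULL_ROBOTS
    return PUBLIC_ROBOTS
```

The whole thing is only as good as the IP list: a robot coming from an address you don't know about gets the public version and will crawl everything.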
|
 |
dpeddle
Joined: Eons Ago
# Posts: 269
|
Posted: 2004-Jun-22 14:54
OK, thanks for the help.
|
 |
yellowwing
Joined: May 21, 2002
# Posts: 2526
|
Posted: 2004-Jun-22 15:20
Is bandwidth a problem? A few additional sentences on a topic overview, or a transcript if you're really ambitious, would be great content paired up with the WMV files.
I'd love to be able to search for video clips of interviews and such. Where can I find John Lennon's last taped interview? When and where did Clinton say, "I feel your pain..."?
|
 |
dpeddle
Joined: Eons Ago
# Posts: 269
|
Posted: 2004-Jun-23 18:00
It's looking more like the transcript is the way to go. We are going to see if we can have them sent over from the various media outlets; our PR firm should be able to handle that. Otherwise, I may just write out an article outlining the key points in each feature.
|
 |
g1smd
Staff
Joined: Jul 28, 2002
# Posts: 10418
|
Posted: 2004-Jun-26 20:13
If you have stuff that you don't want spidered, and you don't want real people sniffing around too, then you could do this:
In the robots.txt file, list the folders that you don't want spiders and people looking in. In each of the private folders, make sure you have an index.html file that is completely blank. That will stop people from directly seeing a list of the files in that folder.
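If you have a lot of folders, the blank index.html step could be scripted. A minimal sketch, assuming you run it against a local copy of the site; the function name and the folder names in the usage line are hypothetical, and it leaves any existing index.html untouched:

```python
from pathlib import Path


def add_blank_index(site_root, private_folders):
    """Create an empty index.html in each folder unless one already exists.

    Returns the list of files that were created.
    """
    created = []
    for folder in private_folders:
        index = Path(site_root) / folder / "index.html"
        if not index.exists():
            # Make the folder too, in case it is missing from the local copy.
            index.parent.mkdir(parents=True, exist_ok=True)
            index.touch()
            created.append(index)
    return created
```

Usage would be something like `add_blank_index("/path/to/site", ["private", "drafts"])`. Whether the blank page is actually served in place of a file listing depends on index.html being the server's default document for a folder.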
|
 |