JimWorld Forums: How far does a spider crawl down a page?



Posted By: saschaeh ()
Posted On: 05/12/2005 03:00 am

I have generated a site map that pulls out spider-friendly links to all the dynamic content in my DB. I have three categories within each city, and there are four cities. I have only pulled out one category of one city and it is already fairly long.

Roughly how many full pages does a spider go down? I read somewhere that spiders don't always go all the way and that weighting is also lost the lower they go.

1: If this is true, how far would you think that is?
2: I have made it user friendly and broken it down in a hierarchical fashion. Will I get penalized for this kind of thing?


Thank you for your time!



Posted By: g1smd (Staff)
Posted On: 05/12/2005 05:03 am

Google caches only the first 100kB of a page. For a short while, a few months ago, they did cache up to 250kB, but they are no longer doing so for new pages; the figure has reverted to 100kB. Pages that were cached at the larger size a few months back are still being retained at that size.

How much is actually spidered and indexed is left for you to discover, simply by putting some unique "words" on your page and seeing which are indexed and which are ignored.
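
One way to set up that test is to generate a page with a unique nonsense "word" every so many kilobytes, then search for each word later and note the deepest one that turns up. What follows is only a rough Python sketch of the idea: the function names, the marker format, the 120 kB page size and the 10 kB spacing are all made up for illustration, not anything Google documents.

```python
import hashlib


def make_marker(offset_kb: int) -> str:
    """Build a nonsense token that is unlikely to exist anywhere else.

    The format (prefix + offset + short hash) is a hypothetical choice.
    """
    digest = hashlib.md5(f"marker-{offset_kb}".encode()).hexdigest()[:8]
    return f"zq{offset_kb}kb{digest}"


def build_test_page(total_kb: int = 120, step_kb: int = 10):
    """Return (html, markers), where markers maps token -> byte offset."""
    filler = "<p>" + "lorem ipsum dolor sit amet " * 8 + "</p>\n"
    parts = ["<html><head><title>Spider depth test</title></head><body>\n"]
    size = len(parts[0])          # all text is ASCII, so chars == bytes
    tokens = []
    next_mark = step_kb * 1024
    while size < total_kb * 1024:
        if size >= next_mark:
            # We have passed the next boundary: drop a marker here.
            token = make_marker(next_mark // 1024)
            line = f"<p>{token}</p>\n"
            tokens.append(token)
            parts.append(line)
            size += len(line)
            next_mark += step_kb * 1024
        else:
            parts.append(filler)
            size += len(filler)
    parts.append("</body></html>\n")
    html = "".join(parts)
    data = html.encode("ascii")
    markers = {token: data.find(token.encode("ascii")) for token in tokens}
    return html, markers


if __name__ == "__main__":
    html, markers = build_test_page()
    with open("depth-test.html", "w", encoding="ascii") as fh:
        fh.write(html)
    for token, offset in markers.items():
        print(f"{token} starts at byte {offset}")
```

Upload the page, wait for it to be spidered, then search for each token in turn. The last token that returns your page gives a rough lower bound on how deep the indexing went.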


Posted By: Vinnie ()
Posted On: 05/13/2005 07:59 am

I may have misunderstood what you are saying, but if I have not, here is my take on it.
Now this is coming from a brand new site released at the beginning of last week with some really good quality inbound links to it. Google indexed the entire front page of 322 words. But having said that, the entire site is built in XHTML and div tags: no tables, no interruptions at all, clean, stylish, optimised images, and readable.
Now I am wondering whether it would have done the same if there had been tables etc. I'm just waiting now until it makes its damn mind up to get to the other pages in its own good time.


Posted By: saschaeh ()
Posted On: 05/13/2005 12:29 pm


Thanks! I'll def use you guys' advice!

Where did you find out about the 100K thing?
Awesome tip on putting unique words at different levels of your page.


All the best things!
Sascha


Posted By: g1smd (Staff)
Posted On: 05/13/2005 01:12 pm



The 100KB limit has been mentioned in many forums, and might also be found in the Google Guidelines for Webmasters, somewhere on their site.

They did cache up to 250KB for a short while earlier in the year. The evidence was seen in the SERPs. They now only cache up to 100KB for new pages (as I found out when I briefly tested with a 120KB file last month).
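
For a long site map page, it may be worth checking which links sit beyond the 100KB mark in the raw HTML. Here is a small Python sketch of that check. The 100KB constant is just the figure discussed in this thread, the function name is made up, and whether links past that point are actually ignored is an assumption you would still need to test.

```python
import re

# Figure taken from this thread, not an official constant.
CACHE_LIMIT_BYTES = 100 * 1024

# Simple pattern for quoted href attributes; good enough for a quick check,
# not a full HTML parser.
HREF_RE = re.compile(rb'href\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)


def links_past_limit(html_bytes: bytes, limit: int = CACHE_LIMIT_BYTES):
    """Return the href values that start at or after `limit` bytes."""
    return [
        match.group(1).decode("utf-8", "replace")
        for match in HREF_RE.finditer(html_bytes)
        if match.start() >= limit
    ]


if __name__ == "__main__":
    # "sitemap.html" is a placeholder for your own generated page.
    with open("sitemap.html", "rb") as fh:
        page = fh.read()
    late = links_past_limit(page)
    print(f"{len(page)} bytes, {len(late)} links past the limit")
```

If many links show up past the limit, splitting the site map into one page per city or category would keep each page well under it.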


JimWorld Forums © 1996 - 2004 .... iWeb Technology, Jimworld.com