    Moderator(s): g1smd, Logan
    Forum Index · Search Engine Forums · Optimizing Your Website for the Search Engines · Google · Zombie Cache Pages from Long-Dead Links

    alonk
    Joined: Mar 15, 2006
    # Posts: 17

    Posted: 2006-Mar-15 21:11

    I've been discovering cached pages on Google from pages that I removed from my site one or two years ago...


    I recently did a Remove URL and updated my robots.txt to reflect this - waiting to see if "pending" will ever become "processed."


    Also - very weird - I've found cached pages from a long dead affiliate program:

    www.mysite.com/index.shtml?merchantbob

    or...

    www.mysite.com/index.shtml?merchantjane

    Google still has my two-year-old index page cached there, but when I do a duplicate content check with "www.mysite.com" and "www.mysite.com/index.shtml?merchantbob" it returns 100% duplicate content (even though the current index.shtml page is only 11% the same!)


    My question: Can I rely on Google's Remove URL, my new robots.txt and my mod_rewrite 301 code to vanquish these old caches, or is there something I'm missing?
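For anyone wanting to sanity-check what a robots.txt blocks before submitting a removal request, here is a minimal Python sketch using the standard library's urllib.robotparser. The rules and the /deadlink path are placeholders, since the thread never shows the real file, and this parser follows the basic robots.txt conventions rather than any particular crawler's exact matching, so treat the result as indicative only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules; the thread does not show the poster's real robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /deadlink
"""


def is_blocked(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given rules disallow `agent` from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)
```

Calling is_blocked(ROBOTS_TXT, "http://www.mysite.com/deadlink") reports whether the placeholder rule covers that URL; a live page such as /index.shtml should come back as not blocked.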




    g1smd
    Staff
    Joined: Jul 28, 2002
    # Posts: 10465

    Posted: 2006-Mar-15 23:54

    The 301 redirect will get the correct version (www) to be better indexed within a month or so (but do make sure that every page of the site has a unique title and meta description).
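To make the non-www to www redirect concrete, here is a small Python sketch of the decision such a rule makes. It is not the poster's mod_rewrite code, which the thread does not show; it only models the logic. CANONICAL_HOST and the function name are assumptions for illustration, and any port in the original URL is dropped for simplicity.

```python
from urllib.parse import urlsplit, urlunsplit

# Placeholder host name taken from the thread's example URLs.
CANONICAL_HOST = "www.mysite.com"


def canonical_redirect(url: str):
    """Return (301, target) if `url` uses the bare host, else None.

    Path and query string are preserved so that old links such as
    /index.shtml?merchantbob land on the same resource under www.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host == CANONICAL_HOST:
        return None  # already canonical, serve normally
    if "www." + host != CANONICAL_HOST:
        return None  # some unrelated host, not ours to redirect
    target = urlunsplit(
        (parts.scheme, CANONICAL_HOST, parts.path or "/", parts.query, "")
    )
    return 301, target
```

The point of the permanent (301) status, as opposed to a temporary redirect, is that it tells the crawler the www address is the one to keep.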

    The non-www pages will take a lot longer to drop out. If they are supplemental results, then they will take at least three years to disappear. That is a bug with Google.

    For the pages that no longer exist, the Removal Tool will "hide" the pages for 3 or 6 months and then they will be put back into the index even though they still do not exist. That is another Google Bug.

    If Google ever take an interest in fixing their bugs, then the usage of the 301 redirect and the robots.txt information is exactly what you need on your site to help them fix the problems.



    alonk
    Joined: Mar 15, 2006
    # Posts: 17

    Posted: 2006-Mar-16 00:05

    Thank you for clarifying that... only, does that mean I will keep being penalized for pseudo "duplicate" content like those old dead links or the non-www pages?

    Three years sounds scary.

    Also, I thought that <title> and <meta> tags were no longer part of Google's ranking algo... I'm hearing a lot of contradictory info.

    How different does each title tag have to be? 3, 4, 5 words... the whole thing?

    (I plan on going through and revamping the site this month.)

    Thanks for your expertise.



    g1smd
    Staff
    Joined: Jul 28, 2002
    # Posts: 10465

    Posted: 2006-Mar-16 00:08

    The supplemental results aren't harming you, they are just a pain in the neck to be showing old content in the results. Make sure you have a custom 404 error page to catch any visitors clicking those links. Give them a page of useful navigation to get them on their way to the right place.
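A minimal sketch of the idea, assuming a tiny standalone Python server rather than the poster's actual hosting setup: the custom error page carries helpful navigation, but the response status is still 404, not 200. KNOWN_PAGES, NOT_FOUND_BODY, respond and serve are all hypothetical names used only for this illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical site content; a real site would serve files instead.
KNOWN_PAGES = {"/": b"<h1>Home</h1>"}

# Custom error page: useful navigation for visitors who hit a dead link.
NOT_FOUND_BODY = (
    b"<h1>Page not found</h1>"
    b"<p>Try the <a href='/'>home page</a> instead.</p>"
)


def respond(request_path: str):
    """Return (status, body) for a request path, ignoring any query string."""
    path = request_path.split("?", 1)[0]
    if path in KNOWN_PAGES:
        return 200, KNOWN_PAGES[path]
    # The friendly page is sent with a real 404 status, so crawlers
    # can tell that the URL no longer exists.
    return 404, NOT_FOUND_BODY


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = respond(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000):
    """Run the demo server locally until interrupted."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Calling serve() and requesting a missing path should show the custom page in a browser while the status line reads 404.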

    The title tag and meta description are very important. They need to be different on every page.
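One way to find pages that share a title before rewriting them is a quick script like the Python sketch below. It assumes you already have each page's HTML in a dict keyed by URL (how you fetch them is up to you), and it treats titles as identical after collapsing whitespace and ignoring case. That comparison rule is an assumption of this sketch, not something stated in the thread. The same approach could be extended to meta descriptions.

```python
from collections import defaultdict
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def extract_title(html: str) -> str:
    """Return the page title with runs of whitespace collapsed."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.title.split())


def duplicate_titles(pages: dict) -> dict:
    """Map each lowercased title used by more than one page to its URLs."""
    by_title = defaultdict(list)
    for url, html in pages.items():
        by_title[extract_title(html).lower()].append(url)
    return {t: sorted(urls) for t, urls in by_title.items() if len(urls) > 1}
```

Any title that appears in the result is shared by at least two pages and is a candidate for rewriting.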



    alonk
    Joined: Mar 15, 2006
    # Posts: 17

    Posted: 2006-Mar-16 00:15

    "The supplemental results aren't harming you, they are just a pain in the neck to be showing old content in the results. "

    ...good. I can sleep tonight (as soon as I rewrite all my title tags.)

    I was having trouble with my custom redirect.shtml page returning a 200 to Google. Of course, my hosting service had no clue and would not help me. I had to revert to a plain "file not found" to avoid annoying the spiders (and giving the impression that I was doing automatic redirects to my home page).

    Right now, that's the least of my troubles... but thanks for the reminder.



    alonk
    Joined: Mar 15, 2006
    # Posts: 17

    Posted: 2006-Apr-30 14:06

    Amazingly, six weeks later, Google has yet to remove the dead links!

    (However, it refuses to index my current index page!)

    Instead of removing www.mysite.com/deadlink?referrer

    Google tried to remove:

    www.mysite.com/www.mysite.com/deadlink?referrer
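The thread does not say why the host name ended up doubled. One possible explanation, offered only as a guess, is that the URL was entered without the http:// prefix: under standard relative-URL resolution, a string with no scheme is treated as a path relative to the base. The Python sketch below shows that behaviour with the standard library's urljoin; the base URL is an assumption built from the thread's placeholder host.

```python
from urllib.parse import urljoin

# Assumed base URL, using the placeholder host from the thread.
BASE = "http://www.mysite.com/"


def resolve(link: str, base: str = BASE) -> str:
    """Resolve `link` against `base` the way a browser or crawler would."""
    return urljoin(base, link)


# No scheme: the whole string is read as a relative path.
without_scheme = resolve("www.mysite.com/deadlink?referrer")

# With a scheme: the link is absolute and is returned unchanged.
with_scheme = resolve("http://www.mysite.com/deadlink?referrer")
```

Here without_scheme comes out as http://www.mysite.com/www.mysite.com/deadlink?referrer, which matches the doubled form above, while with_scheme stays as the intended URL.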



    So there's a cache of a page from Nov 2004 but no cache of my index page from today!

    It's laughable when Google says, "dead links will be removed at the next crawl."

    Nov 2004 is a long time ago....





    g1smd
    Staff
    Joined: Jul 28, 2002
    # Posts: 10465

    Posted: 2006-Apr-30 21:48

    I see stuff going back to 2004 January all over the place.

    Google has no clue how to fix it. That much is very obvious.


    Yes, a Custom 404 Error Page is a Good Thing. Make it happen.



    alonk
    Joined: Mar 15, 2006
    # Posts: 17

    Posted: 2006-Apr-30 22:06

    Why can't Google teach their bots to do simple reasoning like:

    Googlebot: Let's pull up mysite.com/thislinkhasbeendeadforyears... hmm.. it looks like that link is giving me a 404. hmm... a page that can't be found? Should I repeatedly index it for the next ten years? No! I know! Maybe I should erase the url and cache from my datacenters! I'm a genius!

    ;-)

    Google acts like it has no control over its own datacenters. I realize that there are billions of pages, but a dead link is a dead link is a 404.





    g1smd
    Staff
    Joined: Jul 28, 2002
    # Posts: 10465

    Posted: 2006-Apr-30 22:10

    Yes, but to keep their index "big" they keep the URL as a Supplemental Result, and continue to show a two year old cache for that page.

    Nice work when the company has changed telephone number, address, and prices for all their products...


    © 1995  ·  iWeb, Inc  ·  DBA JimWorld Productions