Dynamic PHP rewrite problems

Posted By: jshites
Posted On: 2006-Mar-13 14:39

Hello all,

We have a five-year-old directory site. Over a year ago, we had all our dynamic PHP pages rewritten to show URLs like www.url.com/category/xxx instead of www.url.com/category/xxx?PHPSESSID=3456abc567

Currently, our programmers can't seem to give me the answers to the following:

1.) Even after the rewrites were completed, some category links from the home page still show the PHPSESSID= suffix when viewed in Internet Explorer. Can someone tell me why, and what to do to make sure they only appear as www.url.com/category/xxx? Sometimes they show properly and other times they don't.

2.) Any category name with more than one word currently has a "%" sign between the words:
[link]
What needs to be done to have an underscore "_" appear instead of the "%"?
Something like:
[link]

Any help would be greatly appreciated...


Posted By: g1smd (Staff)
Posted On: 2006-Mar-13 20:24

Do not use spaces or underscores in URLs.

Use only hyphens or dots to separate words.


If accessing a URL that contains a session ID still returns content and a "200" status, then that URL is still valid and will still be indexed.

You need that URL to return either a page with <meta name="robots" content="noindex"> included in it, a "301" redirect to the clean URL, or a "404" status.
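
A minimal sketch of the "301" option, written in Python purely for illustration (the site itself is PHP, so the same logic would have to be ported). It assumes the session parameter is named PHPSESSID, as in the example URL earlier in the thread. The helper names strip_session_id, has_session_id and response_for are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: the session parameter uses PHP's default name.
SESSION_PARAM = "PHPSESSID"


def has_session_id(url):
    """Return True if the URL's query string carries the session parameter."""
    query = urlsplit(url).query
    return any(key == SESSION_PARAM
               for key, _ in parse_qsl(query, keep_blank_values=True))


def strip_session_id(url):
    """Return the URL with the session parameter removed from its query string."""
    parts = urlsplit(url)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key != SESSION_PARAM]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), parts.fragment))


def response_for(url):
    """Return (status, redirect_target) for a requested URL.

    A URL that carries a session ID gets a 301 to its clean form;
    any other URL is served normally with a 200.
    """
    if has_session_id(url):
        return 301, strip_session_id(url)
    return 200, None
```

For example, response_for("http://www.url.com/category/xxx?PHPSESSID=3456abc567") would give a 301 pointing at http://www.url.com/category/xxx, while the clean URL itself is served as usual.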


If any of the pages that are still listed are shown as "Supplemental Results" then Google will take a very long time to update its data. Make sure that the server gives the correct HTTP and HTML response and then leave it alone. Google will update it at some time in the (very distant) future.


Posted By: Prowler (Staff)
Posted On: 2006-Mar-14 06:34

>>1.) Even after the rewrites were completed, some categories from the home page still yield the PHPSESSID= extensions when viewing in w/ Explorer...

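Unless we see the actual site, we can't tell offhand how your scripts work. But here is the general view: it is normal to see session IDs when you browse your pages in a browser such as IE. PHP can append the session ID to links whenever the browser has not yet sent back a session cookie (the session.use_trans_sid setting controls this), which would explain why the IDs show up on some visits and not others. The correct way to check whether the session IDs will affect search engine robots is to mimic a robot and crawl your pages.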
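
One rough way to mimic a robot, sketched in Python for illustration only: fetch a page without sending any cookies, then list the links that still carry a session ID. The "ExampleBot/1.0" user agent and the function names are made up for this sketch, and the marker assumes the parameter is called PHPSESSID.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def links_with_session_id(html, marker="PHPSESSID="):
    """Return the links in the HTML whose URL contains the session marker."""
    collector = LinkCollector()
    collector.feed(html)
    return [href for href in collector.links if marker in href]


def fetch_as_robot(url, user_agent="ExampleBot/1.0"):
    """Fetch a page the way a cookieless crawler would.

    urlopen keeps no cookie jar by default, so every request looks like
    a first visit, which is when session IDs tend to leak into links.
    """
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")
```

Running links_with_session_id(fetch_as_robot(home_page_url)) against the home page should return an empty list once the session handling is fixed.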

Check your server log files to see actual robots crawling your pages. If they seem to choke on certain pages and do not crawl past a stage, then you know where to start probing.
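
A small Python sketch of that log check, for illustration. It assumes the server writes the Apache "combined" log format and that the robot can be recognized by a token such as "Googlebot" in its user agent string; both are assumptions to adjust for your own server. The name robot_hits is hypothetical.

```python
import re

# Assumption: Apache "combined" log format
# (host ident user [time] "request" status bytes "referer" "user-agent").
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def robot_hits(lines, robot_token="Googlebot"):
    """Return (path, status) for every request made by the given robot."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match and robot_token in match.group("agent"):
            hits.append((match.group("path"), int(match.group("status"))))
    return hits
```

Paths in the result that contain PHPSESSID=, or a crawl that always stops at the same page, show where to start probing.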

>>2.) Any category listed w/ more than one word, currently has an "% sign" in between the two words of the category

A single line of code can eliminate this. Replace "%20" (the URL-encoded form of a space) with "-", so that every space between words becomes a hyphen. This alone is not a major issue for search engines, though; what matters more is how your scripts are designed in the first place.
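
The substitution could look roughly like this, shown in Python for illustration. The function name category_slug and the category "Real Estate" are made-up examples, not taken from the site.

```python
import re
from urllib.parse import unquote


def category_slug(name):
    """Turn a category name into a hyphen-separated URL segment.

    Percent-encoded spaces ("%20") are decoded first, then each run of
    whitespace between words is replaced with a single hyphen.
    """
    decoded = unquote(name).strip()
    return re.sub(r"\s+", "-", decoded)
```

With this, "Real%20Estate" and "Real Estate" both become "Real-Estate". Whatever maps the URL back to a category would also need to accept the hyphenated form.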


Posted By: dirty_shame
Posted On: 2006-Mar-14 10:25

Something to check for: see if your programmers have used urlencode or rawurlencode in their link scripting. These functions percent-encode spaces and other special characters (rawurlencode turns a space into "%20"), which would produce exactly what you have reported here.


Posted By: lizardz
Posted On: 2006-Mar-15 02:08

<<< Currently, our programmers can't seem to give me the answers to the following: >>>

If they can't answer those questions, it's time to get new programmers. If I were you, I'd be worrying about my site.

The session IDs may or may not be caused by incorrect session handling; it's hard to say without seeing the code.

If you are paying these people, then you should give some serious thought to hiring somebody competent instead. That would be a better use of your money.

[ Message was edited by: lizardz 03/14/2006 10:03 pm ]