| Member |
Message |
highman
Joined: Oct 16, 2000
# Posts: 559
|
Posted: 2001-Jun-08 07:44
This is not a witch hunt... period.

What Bruce was / is doing is for him to live with, ethical or not. Bruce has said it was a programming error. That it may be; I am not technical enough to decipher talk of modules and screen scraping.

We are in an industry only just holding an appearance of being legitimate. There are an awful lot of people out there who view our industry as some sort of black art. The first site in this thread was the worst of the worst kind of page jacking; it was stopped... quickly, and that is good for our industry. The second (Bruce) was possibly due to a programming error (as I said, I do not know enough to decipher the technical stuff). In my opinion, it damages the image of our industry, because it involves such a 'perceived authority'. It throws the view of our industry right back into the black arts area of shady deals, fast redirects and disguised content. Forget the fact that there was borrowed content; it all comes down to one word... deception.

Personally, Bruce, I do not need apologies; it was not my content. If you want to use such techniques, that is up to you... those who live by the sword... It does concern me when other people's content is borrowed.

My main concern, as I have said, is the damage to the industry we are working so hard to promote as an effective and legitimate marketing opportunity. I hope everyone who reads this thread 'takes on board' how much damage can be done so easily to the industry. On the bright side, hopefully we have a new regular member in the forums.....  Mike
|
 |
detlev
Joined: Apr 15, 1999
# Posts: 48
|
Posted: 2001-Jun-08 10:51
I will try to boil down the technical stuff so more can understand it.

The whole thing is automatic, so he hit "go!" and walked away, probably only checking that it was doing its thing. The program then went on its way and may have generated several hundred or thousand pages in a matter of minutes or hours.

The program is written in Perl. It screen scrapes a SERP (Search Engine Results Page) like any position checker and then fetches the placed pages from the SERP. It records the content from the placed sites and creates a new HTML document, then it publishes the HTML in Bruce's /Promotion/ directory.

Perl operators for strings (text):
"eq" means: equal to
"ne" means: not equal to

To look for content, Bruce instructed his program to fetch placed pages if the domain name in the link is equal to bruceclay.com - or so he thought! Out of habit from writing so much code, he used the wrong operator, "ne", by mistake, instructing the spider to fetch placed pages NOT EQUAL TO bruceclay.com. That was his "programming error". Similar mistakes with these two operators are described in Programming Perl, 2nd Edition, page 528, under the heading: Universal Blunders.

Bruce's "programming error" did not halt his program, and it would compile with this error perfectly. He failed to check the output to avoid what happened. With all these circumstances, some naturally suspect foul play. I won't make that judgement here.

There are 3 points to make.

The first: Bruce is our colleague and deserves the initial belief that what he says is true. If so, this was a grievous, rookie mistake, and it is hard to understand how it came from Bruce.

Second, if true, it appears he was overconfident in his code writing skill, ignoring the rules every Perl code jockey knows: Perl is one of the more dangerous languages to use, affectionately called the "duct tape of the Web" for its better uses. It carries warnings of outcomes of exactly this sort from Larry Wall and other Perl programmers. There was no testing environment other than the public Web. The output was not tested and the results were published immediately to the Web. No one looked at the HTML, either at the time or for several weeks thereafter. This is hard to understand on something as important as generating new content for the site. It's plausible but a bit outlandish.

Third, and important to note, Bruce has contended that code-generated material, on the order of several hundred or even thousands of documents, is not dubious but exceptionally OK, even ethical. Not so. Search engines seeing randomized sentences can recognize them as machine-generated, and they would likely consider them spam if the intent was solely to improve positions. Therefore, the whole experiment, even from his own content, would ultimately have been an experiment in what could be considered spam. This is true whether you cloak or not. If search engines find gibberish, it's usually tossed, and why not? In my judgement, these pages would have been thrown out the moment a Spam Fighter's eyes came across them.

Bruce has taken the offending directory down. At the next refresh or sooner, the positions will drop away. Despite Bruce's doubt that any scored, some did: #13 at Google for: stefan karzauninkats. You might argue that is an arbitrary query, but it certainly isn't for Stefan. There are some more inflammatory samples from colleagues of ours.

The important thing is Bruce has apologized and taken appropriate actions. He ought to perhaps quit explaining things, actually, a reverse of what some here might want, but it may be better for him personally. Bruce could use a little time out to think about changing attitude and altering processes. Show some leadership by taking responsibility. How else can one salvage oneself in the SEO community?

This turn of events should be used in a positive way, as an example of what not to do and how far not to go.

*cheers* -detlev

[This message has been edited by detlev (edited 06-08-2001).]
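The operator mix-up described in this post can be sketched roughly as follows. This is only an illustration in Python, not the original Perl program: Perl's "eq" and "ne" correspond to Python's == and != here, and the function names, the sample links, and the check_output guard are all assumptions made up for the example.

```python
from urllib.parse import urlparse

OWN_DOMAIN = "bruceclay.com"  # the domain named in the thread


def host_of(url: str) -> str:
    """Return the lowercase host of a URL, without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def pages_to_fetch_intended(serp_links):
    # Intended filter: keep only links on our own domain (Perl: eq).
    return [u for u in serp_links if host_of(u) == OWN_DOMAIN]


def pages_to_fetch_buggy(serp_links):
    # The slip: "not equal" (Perl: ne) keeps everyone else's pages instead.
    return [u for u in serp_links if host_of(u) != OWN_DOMAIN]


def check_output(fetched):
    """Hypothetical pre-publish check: refuse any page that is not our own."""
    foreign = [u for u in fetched if host_of(u) != OWN_DOMAIN]
    if foreign:
        raise ValueError(f"refusing to publish: {len(foreign)} foreign page(s)")
    return fetched


# Hypothetical SERP links, made up for illustration.
SAMPLE_LINKS = [
    "http://www.bruceclay.com/web_rank.htm",
    "http://www.example.com/some-page.html",
    "http://example.org/other.html",
]
```

Both filters run without any error or warning, which is the point made above about the program compiling "perfectly". A check like check_output on the fetched list, run before anything is published, is one way the inverted comparison could have been caught.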
|
 |
RodB
Joined: Eons Ago
# Posts: 1435
|
Posted: 2001-Jun-08 14:47
Thank you for the explanation, Detlev. Once more, please keep this thread impersonal. We are all learning something important here.
|
 |
bruceclay
Joined: Jun 06, 2001
# Posts: 12
|
Posted: 2001-Jun-08 14:57
Unless I am way off base, the issue with my error is resolved. But I would like to recap in a way that might help us all, following the three items mentioned in Detlev's post:

Items 1 and 2: For the record, the code was tested as it was developed. But then, in an effort to clean it up and improve performance, I resequenced and combined some of the code in what I thought was a brain-dead simple “this is obviously going to work” manner. I guess I was the brain-dead component there, and when you write as much exceptionally complex code as I write you sometimes forget the simple things can go wrong. Lesson learned. I still offer a great inventory of tools that give accurate results.

Item 3: On the topic of ethics… I do well because I try things. I am not a follower. I admit to experimenting, and to knowing firsthand what works and what doesn’t. The jury is still out on a great many of my experiments.

Static sites just need to be tuned. My site is pretty well tuned using some specific techniques and I know that this is the single best way to optimize any web site. But not all sites are static. The jury has spoken on this one, and we all know the verdict: content wins.

Dynamic sites require what I call “shadow pages”: a static HTML copy of the dynamic page, as resolved by the server and browser, that is in turn SEO tuned just like any static HTML page. I install my envelope redirection code and transfer to the original (appropriate) page in the dynamic site. This, I think, is a common practice, thus redirection is not a problem, as tuned static HTML is my preferred optimization method. This is code derived from a page in the site, altered for the search engines, and then submitted and ranked: pretty much standard SEO work.

Now it is not a quantum leap to understand that a single doorway page is similar, in that it is constructed static HTML that transfers to another page in a site. And in constructing this page the author takes the topics, keywords, and some content from the site. The content cannot be verbatim, so it is often resequenced or paraphrased sentences.

Herein lies my test: if a single shadow or doorway is okay, at what point are there too many, and can it effectively be automated? In fact, if doorways are ineffective, is it because there is not enough critical mass (number of pages on a topic) to be thought of as an expert? If there were 40 pages for each keyword, and they were all different, would this be given better ranking than a site with only 5 doorway pages for that keyword? I intend to know.

Likewise, the page itself may only need proper sentence structure that uses keywords close to each other and at a specific frequency to be of minimal value, perhaps hovering at a ranking of the 500th position. But a thousand of them may have a substantial positive impact on ranking. I intend to know that as well. [By the way, this should show there is no reason to grab pages from anyone other than myself, or the experiment isn’t controlled.]

I know that generating and submitting a thousand pages to an engine would quickly overload the engine with pages and pages of redirection code. I understand the negative aspect of this practice from the vantage point of the engines. So I don’t submit even a single generated page; I let the spiders find them. The search engines decide if a page should get in or not. But I need to know that there is no benefit to this process before I can dismiss it from my arsenal, or I am not representing the best interest of my clients.

My tool can generate 40 pages in 5 seconds at 12 density points for any keyword. I obviously can generate thousands of pages per minute. And that is why this tool will never be added to my toolset. But I want to have an answer. If it is ineffective then I retire it. But if I get a jump in rankings because some search engines consider my site so content heavy that it must be better than all others, then it is a valid and effective tool, clearly sanctioned by the search engines themselves by virtue of their allowing it. And if only some like it, that is a use for a robots.txt file.

The ultimate test of good and bad is the existing de facto search engine sanctioning of this process. My clients depend on my ability to win fairly, BUT “fairly” is defined by the rules set by the engines. The day the generated pages don’t work is the day this test is over. Besides, who else has the technology and guts to answer these questions? And who here wouldn’t like to know?
[This message has been edited by bruceclay (edited 06-19-2001).]
|
 |
detlev
Joined: Apr 15, 1999
# Posts: 48
|
Posted: 2001-Jun-08 16:51
Nope. Not testing and not machine-generated code but spam defines what I think is unethical. If I point at something and label it spam, you may also assume I think it is unethical.

I stated that the practice of machine-generating code from SERP placed sites is a dubious practice in and of itself. Dubious is a different word, with a different meaning.

What passes a machine check and enters the index is not necessarily accepted. It still may be scrutinized and dropped by a Spam-Fighter. I would drop spam, wouldn't anyone? A robots.txt might have been employed for the directory but was not, because these pages were intended to be picked up.

In testing, one should listen carefully to the results to determine such important factors as human review. That may be what was missing here. If that check had been made, the "programming error" would have been spotted right away and fixed. With a robots.txt, it would have been a safer (not completely safe) environment in which to play.

Either way, the pages didn't really pass a human review test because they were gibberish. Bruce resorted to redirecting these pages so no user would be subjected to them. Hence, a test in what is ultimately spam. Sentences were incomplete, being taken from menus and the like.

I do not condone machine-generated code from SERPs, whether it's your own content or not. It's a dubious practice and usually results in gibberish that constitutes spam. Try not to catch yourself ever questioning, ignoring or re-phrasing the Search Engines' philosophical reason for being.

-d

[This message has been edited by detlev (edited 06-08-2001).]
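As a rough illustration of the robots.txt idea mentioned in this post, the sketch below uses Python's standard urllib.robotparser to show how a rule for the /Promotion/ directory would be read by a compliant crawler. The rule text, the "ExampleBot" agent name, and the page1.htm URL are assumptions for the example; nothing in the thread shows what file, if any, was on the site.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: keep all compliant crawlers out of /Promotion/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /Promotion/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_crawl(parser: RobotFileParser, url: str, agent: str = "ExampleBot") -> bool:
    """Return True if the given (hypothetical) crawler may fetch the URL."""
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    # page1.htm is a made-up file name inside the test directory.
    print(may_crawl(parser, "http://www.bruceclay.com/Promotion/page1.htm"))
    print(may_crawl(parser, "http://www.bruceclay.com/web_rank.htm"))
```

robots.txt is only advisory: it keeps out crawlers that choose to honor it, which fits the "safer (not completely safe)" wording above.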
|
 |
bruceclay
Joined: Jun 06, 2001
# Posts: 12
|
Posted: 2001-Jun-10 17:01
Detlev has pointed out that his statement is specifically that this is dubious, not that it is unethical. I accept that clarification. Thank you.

In giving it some thought, I agree… it is dubious. I agree that it is something that, done to excess, cannot in any way benefit the search engines. I agree that it is likely to be the target of countermeasures by the search engine spiders, since a byproduct of this process would be to generate thousands of pages that are essentially clutter from the search engine point of view.

What I do not yet know is whether some engines won’t today grant exceptional ranking to sites with such pages because they are designed to be indistinguishable from human generated pages (at least when the tool is completed). Dubious yet effective? Regardless, futures research would indicate that the benefit will be short lived even if it were to be successful today. Thus it is clearly of dubious value to anybody.

I believe that to be a leader in any industry involves aggressive research, pushing the envelope, validating assumptions, and fighting the fight, win or lose. A pro basketball player pushes, shoves, elbows, and sometimes fouls, but they are winners because they work just within the rules of the game, are not afraid to take shots, and have the ability to do what others cannot.

I have helped so many others through my tools and insight because I have done the research and I have been fighting the fight. I have killed off many techniques because I have tried them and dismissed them as ineffective. And I publish my knowledge for free, even offering power tools for next to nothing.

I know that editing content properly will result in higher rankings. But if I am to continue to be a leader then I need to know the effect of other technologies and techniques instead of assuming that things will not change. And if we have learned anything from SEO history it is that what worked yesterday often will not work today.
I apologize for drawing this thread off topic… what about comparing http://www.freelandtech.com/webindex/webindex_methodology_content.jsp to http://www.bruceclay.com/web_rank.htm -- looks like pagejacking to me.
[This message has been edited by bruceclay (edited 06-19-2001).]
|
 |
JuniorHarris
Joined: Dec 18, 2000
# Posts: 1276
|
Posted: 2001-Jun-10 22:53
[This message has been edited by JuniorHarris (edited 09-06-2001).]
|
 |
Kal
Joined: Aug 13, 2000
# Posts: 226
|
Posted: 2001-Jun-11 01:14
Looks like page-jacking to me too, Bruce. Given your material is covered by copyright, have you contacted them?
|
 |
MakeMeTop
Joined: Jul 05, 2000
# Posts: 1714
|
Posted: 2001-Jun-11 04:31
I echo JuniorHarris, Bruce. Your willingness to share information has helped many, many people (including me). The fact that this has been rewarded by others page-jacking your work (as appears to have happened in this case) seems to confirm that page-jacking is a pretty despicable way to reward someone's efforts!

I assume that for many top sites this sort of thing is a continual source of annoyance. What do you advise sites to do in a case of page-jacking?
|
 |
Jim
Joined: Eons Ago
# Posts: 5442
|
Posted: 2001-Jul-01 06:15
Bruce, kudos for handling this problem publicly. It is no less than what I would have expected from you, having seen you around the web for so long.

This thread is pure vintage SEF. Flexing e-muscle to keep the Net heading in the right direction has been a big part of the value of SEF. Look through some of the really old threads and see how active SEF members have always been in this regard. This thread has a great start on teaching how to strike back at page jackers. It is likely that a great many people will read this thread and not fall into the trap of doing things the "easy" way.
|
 |
I am illiterite
Joined: Jul 11, 2001
# Posts: 82
|
Posted: 2001-Jul-11 07:58
Great thread.

I don't know the details of what went on, nor do I want to. I am willing to accept the page-jacking apology and the explanation that it was a code glitch. But another, totally different point that is enveloped inside this thread is the idea that generating thousands of machine-generated pages is not spam if a respected member of the SEO industry does it (as opposed to some newbie). This was done for traffic, not research, but traffic. That is not only a dubious tactic, but it is unethical and it is indeed spam. If it looks like spam, smells like spam, and feels like spam, then that is what it is.

I have heard others treat newbies hard without any forgiveness when they submitted thousands of spam pages to the SE's, but I see a hidden message in this thread that "might makes right", that a "big gun" of the SEO world can submit a thousand spam pages and it is not called 'spam' but 'research'.
|
 |
Hasenfefer
Joined: Dec 04, 2000
# Posts: 287
|
Posted: 2001-Jul-11 13:59
"I have heard others treat newbies hard without any forgiveness when they submitted thousands of spam pages to the SE's, but I see a hidden message in this thread that "might makes right", that a "big gun" of the SEO world can submit a thousand spam pages and it is not called 'spam' but 'reasearch'."Indeed.
|
 |
gothumpin
Joined: Feb 19, 2001
# Posts: 57
|
Posted: 2001-Jul-11 14:48
I could not agree with I am illiterite and Hasenfefer more. Spam is Spam is Spam. I have as much respect for spam and its creators as I do for lawyers, soap scum and thieves. I used to think that you ran things a little differently, Bruce. You're no different than any of the other people who fill the internet up with garbage.
|
 |