Google Misses With Both Barrels And Duplicate Content Kills

Google released a massive search algorithm change yesterday in reaction to growing complaints about their search quality. The change leads to a blanket demotion of all pages on a site; in effect, Google is admitting that they no longer trust their algo and are trying a brute force repair. I’ve been complaining about content farms like DemandMedia with their tens of millions of pages for a long time, but I guess I should have been careful what I asked for. First, in Google’s own words:

Many of the changes we make are so subtle that very few people notice them. But in the last day or so we launched a pretty big algorithmic improvement to our ranking—a change that noticeably impacts 11.8% of our queries—and we wanted to let people know what’s going on. This update is designed to reduce rankings for low-quality sites—sites which are low-value add for users, copy content from other websites or sites that are just not very useful. At the same time, it will provide better rankings for high-quality sites—sites with original content and information such as research, in-depth reports, thoughtful analysis and so on.

They also claimed that the new results correlated highly with what they have seen from the one week trial of the Chrome extension that allows users to block content farms, although that data wasn’t actually used to make the changes.

While the algo change has done a decent job depressing the results of some content farms (not all, and not eHow), it has also knocked the stuffing out of genuine content driven websites, like book publishers, if their content has been scraped and copied all over the place. Based on yesterday’s data, it looks to me like this website (FonerBooks) and my older Daileyint.com site have taken about a 35% hit to Google traffic, and that is spread across every page and search result, not limited to a particular subject. So it’s a site wide penalty for having been the victim of years of scraping, plagiarism, and barely disguised rewrites. It’s particularly frustrating to see pages that link my site as the sole source of their information now appearing first in search results.

While I feel like the guy who was walking across the street when Google dropped a 1000 pound bomb to take out a cockroach, I’m sure that there will be less garbage in the search results now. It will also be harder for people to find true resources, because the endless stealing and reworking of those resources has apparently made it impossible for Google to tell who is the original source. And unfortunately, it appears that Google didn’t even kill the cockroach as eHow is rising in some searches!

I believe that a site-wide variation of the duplicate content penalty is the issue because I’ve seen it before over the years, albeit on a page-by-page basis. It’s nothing Google would ever admit to, but it’s something for which I took an infringer to Federal Court, in the process winning back the search position of what was then my most popular page when the duplicate content was removed. As Google modestly says:

We can’t make a major improvement without affecting rankings for many sites. It has to be that some sites will go up and some will go down

As long as it really is an improvement, I suppose I should be proud to take one for the team. In my case, it looks like the average drop in the rankings for any phrase is three or four slots. Of course, if they phased the change in through the course of Thursday, it could turn out to be worse. If you’ve got a legitimate website and you see the same thing happen, don’t worry that it’s something you did. It’s just a side effect of being too popular with the wrong sort of people, those who duplicated your content in various ways and didn’t link back to the source.

The ultimate irony is that eHow has more NOFOLLOWED links to my website than I have pages. So maybe it’s just guilt by one-way association.

Update: I understated the traffic loss because I was looking at all visitors, rather than just the U.S. where the algo change was rolled out. Looking at U.S. only, the loss is over 50%, and it’s clear that sites that get ripped-off are the ones taking it on the chin. So the duplicate content penalty, applied on a site-wide basis, really looks like the issue. Too bad Google can’t tell the creators from the thieves.

18 comments to Google Misses With Both Barrels And Duplicate Content Kills

  • Bryan

    Morris,

    Wow, this is big. Crazy big. I have a hard time believing that Google can’t at least figure out somewhat, who is the originator and who is the thief. Can’t they look at the date the content was originally introduced on the web, e.g. the first cached page they created for a given set of content? Also, it seems like they could implement a logic system in which they evaluate the totality of a site against another site to see which one has more content from other areas on the net.

    I also suspect that your 50% loss in traffic will not be permanent. Such a dramatic loss to a legit site is probably beyond the parameters of Google’s allowable margin of error, and while individual complaints from web masters probably won’t do anything, I suspect they will self correct at some point in the near future.

    My high traffic pages seem relatively unaffected over the last few days, so perhaps your pages were hand picked or at least picked as part of a campaign to target this type of problem. I wonder about your other readers — how many of their pages are affected?

    Something I find completely ridiculous is that eHow actually seemed to rise recently for a search phrase in which my site is #1, and now they are #2! This particular search phrase where I am #1, for this page of mine, it is my highest traffic page that I own, and it is – if I do say so myself – a darned good article with high quality original content. eHow’s rip off version of it, while not a copy of mine, is pathetically anemic in terms of quality content. If I had to rate it I would say it is about 10% as useful as my page. So while I am #1 and they are #2, I can’t help but wonder how Google allows their crap so high in the rankings. In fact it would seem contradictory to Google’s principles that a site gains rankings not just by having authority in general but having authority in a SPECIFIC topic and hence is given higher ranking in that category. eHow because it is spread so thin doesn’t have authority in anything and consequently shouldn’t be given high ranking in anything.

    This is quite unbelievable and so far to me represents perhaps one of the first major actions Google has taken that makes me think of them as an average and relatively inept technology company and not one that is “a cut above the rest.” In any case I’m sure we’ll be able to gather more data and analyze this further in the coming weeks.

    Bryan

  • Bryan,

    You’re right about eHow. I’ve been doing some searches that show eHow actually GOING UP in results, in some cases replacing my #1 results which were pushed down to the second page. Maybe I should have put Adsense on that site:-)

    Morris

  • Bryan

    eHow seems precisely opposite of the type of sites Google likes to reward… half baked land grabs by profit hungry low hanging fruit seekers. It is just one door away from spam, or perhaps shares a room with spam. Perhaps, though, Google sees it as the Wikipedia of how-to’s? That’s actually an interesting comparison… in many ways it is a Wikipedia of sorts. Maybe Google feels what eHow lacks in quality it makes up for in consistency of writing style and page format? Still, hard to believe that Google doesn’t notice eHow rising in the rankings. If I didn’t know better I’d almost think that Google is in bed with eHow, and maybe they are in an indirect way… maybe a higher percentage of eHow pages have Adsense on them than independent publisher pages. Actually that has to be the case, it’s the core of eHow’s business model whereas many quality web publishers don’t even know or care what Adsense is.

    I smell evil……

  • Bryan,

    Maybe Google is just penalizing sites that link Amazon:-)

    I don’t expect Google to fix the problem if they are happy with the overall results.

    The duplicate content issue with Google was always something they denied and for which they probably couldn’t find a solution. I don’t think they could keep snapshots of the web forever and always run comparisons. They count on page rank and their basic algo to determine quality, but penalty filters trump everything.

    Hopefully, it won’t affect a lot of publishers. Keep in mind that some of my most popular web pages are also in books, which are also ripped off, and appearing all over. Who knows, maybe Google is looking at library copies or even Google Books and assuming that I stole all of my content from there?

    Morris

  • [...] the rankings of lower quality content. We noticed Morris Rosenthal, owner of Foner Books, had some grievances, expressed on his blog. We reached out to hear more of his story, and he was happy to share quite a [...]

  • Yet again the mighty Google does an update and destroys the rankings and no doubt the financial incomes of thousands of legitimate web businesses. They do it all in the name of relevancy and quality search results… What BS! The problem with Google is that with the launch of Adwords in October of 2000, Google became an advertising platform and that’s what Google’s core business is today: advertising.

    I think maybe Bryan hit the nail on the head: “If I didn’t know better I’d almost think that Google is in bed with eHow, and maybe they are in an indirect way… maybe a higher percentage of eHow pages have Adsense on them than independent publisher pages…”. I wonder how much ad revenue Google gets from eHow running Adsense on its pages?

    With recent changes in Google’s local search offering, now with paid options, many ‘organic’ search results don’t even show above the fold.

    And everybody does the Google dance…

  • Makes one wonder if Google doesn’t use dates + times when evaluating sources of original content.

  • Stephen,

    I’ve made money with Adsense, the publishing side of Adwords, and recommended it to other publishers.

    http://www.fonerbooks.com/2010/01/advertising-revenue-for-book-publishers.html

    And I think most of the article farms that Google dumped used Adsense as well as eHow.

    I really suspect it’s down to two issues: duplicate content (I get ripped off all the time by “quality” sites, eHow doesn’t), and Google’s new promotion of an aesthetic. They may not see it as an aesthetic, but I object to using the word quality to describe the mathematical formulation of the confidence human testers show random pages.

    Morris

  • Steven,

    A couple problems. First, my sites that were hit are ten years old and fifteen years old. The fifteen year old site is older than Google, and people stole as much in the 90′s as they do today. Second, to put this strategy in practice for the years they have been around, they would have had to have started collecting and dating every page on the web since they opened for business. No reason to believe they’ve done this, it would take a tremendous amount of storage, and then, there is such a thing as legitimate copying (just not from me:-)

    So I think they’ve generally counted on PageRank to determine who really owns content, at least that was my experience back in 2005 when a higher ranked site that stole my page got my original dropped from the index.

    Morris

  • Matt

    The problem is that Google identifies a problem (which might not really be much of a problem, really) and then fixes that problem.

    Then someone points out that the ‘fix’ actually caused other problems elsewhere in the algo, so then they crash in to ‘fix’ that problem, which then causes another problem… and so it goes.

  • Matt,

    They are in love with their algo. Google had a serious problem with returning content farm garbage. They turned to human testers for the fix, but they used them the wrong way. Google can certainly identify potential content farms through their algo. They should have had the human testers try to get answers from those content farms, and if they were as bad as most of us think they are, banned them from the index.

    But they want to do it all with math, and not understanding the math they are using doesn’t strike them as a problem. Makes me wonder about PhD programs for the last 10 years or so.

    Morris

  • [...] As bad as that sounds, it is actually even worse than that. Today Google Alerts showed our brand being mentioned on a group-piracy website built around a subscription model of selling 3rd party content without permission! As annoying as that feels, of course there are going to be some dirtbags on the way that you have to deal with from time to time. But now that the content farm update has went through, some of the original content producers are no longer ranking for their own titles, whereas piracy sites that stole their content are now the canonical top ranked sources! Google never used to put piracy sites on the first page of results for my books, this is a new feature on their part, and I think it goes a long way to show that their problem is cultural rather than technical. Google seems to have reached the conclusion that since many of their users are looking for pirated eBooks, quality search results means providing them with the best directory of copyright infringements available. And since Google streamlined their DMCA process with online forms, I couldn’t discover a method of telling them to remove a result like this from their search results, though I tried anyway. … I feel like the guy who was walking across the street when Google dropped a 1000 pound bomb to take out a cockroach – Morris Rosenthal [...]

  • [...] Much has been written and said about Google’s Panda – the algorithm upgrade in Feb. 2011 was designed to provide better relevant search results by filtering out content farms and other Web sites of questionable value. But Panda has its critics. Publisher and author Morris Rosenthal depends heavily on the Internet for his survival, and Panda has hurt him badly prompting him to air his criticisms. [...]

  • Laptop Guy

    It looks like Google is fighting good publishers. :(
    Your site is clean and has very unique content. Nevertheless, according to Alexa your traffic has been going down.
    Same with my laptop repair sites.
    Something is wrong with Google. I think big changes are coming.

  • Laptop Guy,

    I’d say I’m down 65% to 70% on Google visitors at this point. What bothers me more is how bad their results have gotten for the sorts of searches my site used to draw traffic on. It’s all forums and high-end article farms now. But I’m onto other things so it’s not my worry.

    Morris
