Saturday, October 4, 2008

Duplicate Content and Search Engines

Duplicate content is something that every serious webmaster tries to avoid. Basically, duplicate content means that a specific piece of content, like an article, exists in several locations on the Internet. Search engines often penalize websites for displaying duplicate content, as duplicate content usually provides less value for Internet users. Original content, on the other hand, is considered very valuable.

However, it seems that detecting duplicate content does not always work as well as it should. From time to time, I hear bloggers reporting that articles on their sites have been marked as duplicate content because scraper websites have copied those articles to their own domains. If the duplicate content filter were working correctly, the scraper sites would be marked as sites with duplicate content, not the other way around.
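
As a rough illustration of what detecting duplicate content can involve, here is a minimal Python sketch that compares two texts by their overlapping three-word sequences ("shingles") and computes the Jaccard similarity of the two sets. This is only one common textbook approach, not a description of how any particular search engine works; the function names `shingles`, `jaccard`, and `similarity`, and the sample texts, are my own assumptions for the example.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word runs) in text."""
    # Lowercase and keep only word-like tokens, so punctuation and
    # capitalization changes do not hide a copy.
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Very short texts: treat the whole text as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity(text_a, text_b, k=3):
    """Similarity between 0.0 (nothing shared) and 1.0 (same shingle sets)."""
    return jaccard(shingles(text_a, k), shingles(text_b, k))


# Hypothetical sample texts, made up for this example.
original = "Duplicate content is something that every serious webmaster is trying to avoid."
scraped = "duplicate content is something that every serious webmaster is trying to avoid"
unrelated = "Search engines rank pages using many different signals."

print(similarity(original, scraped))    # copy with cosmetic changes only
print(similarity(original, unrelated))  # different text
```

Note that a score like this only says that two pages are nearly the same. It says nothing about which page had the article first, and that is exactly the part that seems to go wrong when the original blog gets flagged instead of the scraper.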

Generally speaking, scraper sites are automated websites which pull content from other websites using web scraping. A lot of those sites are made for Google AdSense, but it is important to understand that many of them violate copyright law, and often they do not even give credit to the original author. The webmasters of such sites do not put any work into writing articles, yet they might be ranked quite high in search engine results.

Some speculate that many search engines fail to determine which site published the content first. Websites which copy/paste content from other sites, but are indexed earlier, might actually be treated as legitimate sites. While this is alarming, let’s hope that the major search engines will repair those content filters, as we do not want our content simply copied/pasted to other websites. I hope that this article has reminded all of us that original content is something we should strive for, and that copy/pasting content from other websites should be avoided at all costs.



