Did I Just Say That Duplicate Content Is Ok?
Wait, what? Did I just say that duplicate content is ok? And that Google doesn’t penalize sites for duplicate content?
I’ve written about this in the past (Top 3 Ways to Get Punished by Google) and yes… I’ve fallen for it. It’s so easy to say, “You will be penalized for duplicate content.” But the fact is you won’t. Well, not exactly.
Yes, I know it sounds like a lot of doublespeak.
There is a strong myth around this topic mostly because so many people don’t understand what Google really does with duplicate content.
Let Me Break The Duplicate Content Myth Down For You…
- Duplicate content doesn’t cause your site to be penalized.
- Of course, Google knows that webmasters and site owners are trying to diversify their exposure in search results. However, that doesn’t mean you will show up for the same thing repeatedly.
- What really happens is that Google has structured its algorithm to look at clusters of content. If you have several pages that look like duplicates, Google picks the one it considers best and shows only that one. The catch is that it may not be the one you think is the “best one.”
- With groups or clusters of content, Google also does webmasters a solid and combines their strength. So, don’t create duplicate content and then block search engines from the duplicate pages. That will hurt you. A better way to manage this is to use Pillar Content instead.
- Google will also attempt to find the original source of the content. In the past, sites that scraped the internet for content used to do well. Now they are hard to find, since they are not the original source of the content.
Ok, That’s a Good List, But What Does That Really Mean?
Google will look at your content, and if it finds duplicates, it will either pick one of the pages to show or, worst case, say, “Hey, I don’t like any of these,” and not show your content at all. But this is not a penalty.
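If you want to see what a “cluster” might look like on your own site, here is a rough sketch. To be clear, this is not how Google does it (their method isn’t public). It is only a toy self-audit that groups pages whose text is identical once case and spacing are ignored. The function names and the example.com URLs are made up for illustration.

```python
import hashlib
import re
from collections import defaultdict


def fingerprint(text: str) -> str:
    """Hash the text after lowercasing and collapsing whitespace."""
    normalized = re.sub(r"\s+", " ", text.strip().lower())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def cluster_duplicates(pages: dict[str, str]) -> list[list[str]]:
    """Group URLs whose body text is identical after normalization."""
    groups = defaultdict(list)
    for url, text in pages.items():
        groups[fingerprint(text)].append(url)
    # Only groups with more than one URL count as duplicate clusters.
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Hypothetical pages: the first two differ only in case and spacing.
pages = {
    "https://example.com/widgets": "Blue widgets for sale.",
    "https://example.com/widgets?ref=email": "Blue  widgets for SALE.",
    "https://example.com/about": "About our company.",
}
clusters = cluster_duplicates(pages)
```

Here the two widget URLs land in one cluster and the about page stands alone. A real audit would need fuzzier matching, since near-duplicates rarely match character for character.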
What Causes Duplicate Content Anyway?
- Moving from a non-secure site to a secure site, so both HTTP and HTTPS versions can be reached
- Improper configuration of www and non-www
- Using Session IDs in URLs
- Trailing slashes vs no trailing slashes
- Index pages (e.g. domain.com/index.html and domain.com/)
- Parameters and faceted navigation
- Alternate page versions such as m. or AMP pages or print
- Pagination, such as in blogs.
- Country/language versions
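To make a few of the causes above concrete, here is a small sketch showing how several URLs that look different can all be the same page. The rules (prefer HTTPS, drop www, drop index files, drop trailing slashes, drop session parameters) are my assumptions for the example, and the parameter names in `TRACKING_PARAMS` are hypothetical. Your site’s preferred form may be different.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names that do not change the page content.
TRACKING_PARAMS = {"sessionid", "sid", "utm_source", "utm_medium", "utm_campaign"}
INDEX_FILES = ("index.html", "index.php")


def normalize(url: str) -> str:
    """Reduce a URL to one preferred form (https, no www, no index file)."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path
    for name in INDEX_FILES:
        if path.endswith("/" + name):
            path = path[: -len(name)]  # keep the slash before the file name
    if len(path) > 1 and path.endswith("/"):
        path = path.rstrip("/")  # no trailing slash except on the root
    if not path:
        path = "/"
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k.lower() not in TRACKING_PARAMS]
    return urlunsplit(("https", host, path, urlencode(kept), ""))


variants = [
    "http://www.example.com/",
    "http://example.com",
    "https://example.com/index.html",
    "https://example.com/?sessionid=abc123",
]
unique = {normalize(u) for u in variants}
```

All four variants collapse to a single URL, which is exactly why a search engine sees them as duplicates of each other.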
What Can You Do to Fix The Issue?
Well, that depends on your particular situation, right? But here are a few ideas.
- Do nothing and let Google handle it. Um, not the best choice.
- Rewrite your pages so they aren’t duplicate. Then define your content clusters and let Google know which content is Pillar content and what is supporting content.
- Canonical Tags: These tags point back to the content that is your original or preferred version. Self-referencing canonical tags can also help against scrapers, since a page copied wholesale keeps pointing back to your original.
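- rel="prev" and rel="next": Used for pagination. Note that Google said in 2019 that it no longer uses these as an indexing signal, so don’t rely on them alone.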
- Set up server-side rules that force one version of a page to be used, e.g. with or without a trailing slash, www or no www, or redirecting domain.com/index.html to domain.com/
- Hreflang (rel="alternate"): This is used to identify and consolidate different language versions of the same content. You can find out more from Google itself. I’ll also be writing about this in the future.
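For the canonical and hreflang options above, the tags themselves are simple. Here is a minimal sketch that builds them for the head of a page. The helper names and the example.com URLs are made up, and your CMS or SEO plugin probably has its own way of adding these.

```python
from html import escape


def canonical_tag(url: str) -> str:
    """Build the <link> tag that names the preferred URL for a page."""
    return f'<link rel="canonical" href="{escape(url, quote=True)}">'


def hreflang_tags(alternates: dict[str, str]) -> list[str]:
    """Build one rel="alternate" tag per language version of a page."""
    return [
        f'<link rel="alternate" hreflang="{escape(lang, quote=True)}" '
        f'href="{escape(url, quote=True)}">'
        for lang, url in alternates.items()
    ]


# Hypothetical page with an English and a German version.
canonical = canonical_tag("https://example.com/widgets")
alternates = hreflang_tags({
    "en": "https://example.com/en/widgets",
    "de": "https://example.com/de/widgets",
})
```

Every variant of the page (with tracking parameters, print version, and so on) would carry the same canonical tag, and each language version would list all of its alternates, including itself.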