How to Identify and Remedy Duplicate Content Issues on Your Website

Kieran Flanagan

It's easy to be fooled into thinking SEO is just about link building. There are so many posts covering which links are good or bad that we sometimes forget about the huge gains we can make by simply fixing problems with our own sites.

One of the biggest culprits for lost traffic and rankings is duplicate content. Luckily, you have control over your own site, so you have the power to fix it.

What Is Duplicate Content?

Duplicate content exists when search engines have indexed more than one version of a page. When multiple versions of a page are indexed, it’s difficult for search engines to decide which one to show for a relevant search query.

Search engines aim to provide users with the best experience possible, which means they will rarely show duplicate pieces of content. Instead, they will be forced to choose what version they feel is the best fit for that query. 

Causes of Duplicate Content

Three of the biggest offenders for causing duplicate content are:

1) URL Parameters

URLs often contain additional parameters, either because visits are being tracked (marketing campaign IDs, analytics IDs) or because the CMS a website uses adds its own custom parameters.

For example, the following URLs could all lead to the same page:

http://www.example.com/page1

http://www.example.com/page1?source=organic

http://www.example.com/page1?campaignid=3532
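To make the parameter problem concrete, here is a minimal Python sketch that strips tracking parameters so the three URLs above collapse into one. The parameter names in `TRACKING_PARAMS` are assumptions taken from the examples; your own analytics or CMS setup will likely use different ones.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these parameters only track visits and never change page content.
TRACKING_PARAMS = {"source", "campaignid"}


def strip_tracking_params(url: str) -> str:
    """Return the URL with known tracking parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


urls = [
    "http://www.example.com/page1",
    "http://www.example.com/page1?source=organic",
    "http://www.example.com/page1?campaignid=3532",
]

# All three variants reduce to the same underlying page.
unique_pages = {strip_tracking_params(url) for url in urls}
print(unique_pages)
```

Parameters that are not in the set are kept, so a URL whose query string actually selects different content is left alone.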

2) Printer-Friendly Pages

A web page will often offer a printer-friendly version of itself. Because that version lives at its own URL, it can create duplicate content. For example, the following URLs would serve the same content:

http://www.example.com/page1

http://www.example.com/printer/page1

3) Session IDs

Sites may often want to track a user's session across their website. For example, sites can offer personalized features based upon who that user is and their past interactions with the site, or an ecommerce store may remember what that person added to their shopping cart on their last visit.

When session IDs are appended to the URL, each visit can produce a different URL for the same page, which creates duplicate versions of it. For example, the following URLs would lead to the same page:

http://www.example.com/page1

http://www.example.com/page1?sessionid=12455
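If you have a list of crawled URLs, a rough way to spot session-ID duplicates is to group the URLs by what remains once the session parameter is removed. The sketch below assumes the parameter is called `sessionid`, as in the example; `group_duplicates` is a hypothetical helper, not part of any tool mentioned in this post.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: the session parameter name matches the example URL above.
SESSION_PARAMS = {"sessionid"}


def without_session_id(url: str) -> str:
    """Return the URL with any session ID parameter removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


def group_duplicates(urls):
    """Map each base URL to its variants, keeping only bases with more than one."""
    groups = defaultdict(list)
    for url in urls:
        groups[without_session_id(url)].append(url)
    return {base: variants for base, variants in groups.items() if len(variants) > 1}


crawled = [
    "http://www.example.com/page1",
    "http://www.example.com/page1?sessionid=12455",
    "http://www.example.com/page2",
]
print(group_duplicates(crawled))
```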

Duplicate Content Problems

The biggest issues caused by duplicate content are:

  • Search engines don’t know which version of the page to index.
  • Search engines don’t know which page should receive the link authority, or whether it should be divided across multiple versions.
  • Search engines don’t know which version of the page to rank for a relevant search query.

This can result in web pages losing both rankings and organic traffic.

Finding Duplicate Content

There are two tools you can use to find duplicate content problems on your site: Google Webmaster Tools (since renamed Google Search Console) and Screaming Frog.

1) Google Webmaster Tools

Using Google Webmaster Tools, you can easily find pages with duplicate titles and duplicate meta descriptions. Simply click on “HTML Improvements” under “Search Appearance.”

Clicking on one of these links will show you what pages have duplicate meta descriptions and page titles. 

2) Screaming Frog

You can download the Screaming Frog web crawler and use it to crawl up to 500 URLs for free. This application lets you do a lot of different things, including finding duplicate content problems.

Page Titles/Meta Descriptions 

You can find duplicate page titles or meta descriptions by clicking on the “Page Titles” or “Meta Description” tab and filtering for “Duplicate.”
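The same check can be sketched in a few lines of Python if you have exported a crawl. The `pages` records and their field names (`url`, `title`, `description`) are made up for illustration and do not reflect the export format of any particular tool.

```python
from collections import defaultdict


def find_duplicates(pages, field):
    """Map each repeated value of `field` to the URLs that share it."""
    seen = defaultdict(list)
    for page in pages:
        # Normalise case and whitespace so trivial differences do not hide duplicates.
        value = page[field].strip().lower()
        if value:
            seen[value].append(page["url"])
    return {value: urls for value, urls in seen.items() if len(urls) > 1}


# Hypothetical crawl data.
pages = [
    {"url": "http://www.example.com/page1", "title": "Page One", "description": "First page."},
    {"url": "http://www.example.com/printer/page1", "title": "Page One", "description": "First page."},
    {"url": "http://www.example.com/page2", "title": "Page Two", "description": "Second page."},
]

print(find_duplicates(pages, "title"))
print(find_duplicates(pages, "description"))
```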

URLs

You can also find pages that have multiple URL versions by clicking on the “URL” tab and filtering for “Duplicate.”

For a complete guide to the different things you can do with Screaming Frog, check out this post from Seer Interactive.

Fixing Duplicate Content

Duplicate content is a problem that can hurt both your organic traffic and your search rankings, but it’s something that you can easily fix. The three quickest ways to address duplicate content problems are:

1) Canonical Tag 

Using the canonical tag, you can tell search engines which version of a page you want returned for relevant search queries. The canonical tag is a link element with rel="canonical" placed in the <head> section of a web page.

The canonical tag is the best approach when you want to have multiple versions of a page available to users. If you're using the HubSpot COS, this will be taken care of automatically, so no manual labor required.
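To check which canonical URL a page declares, you can parse its HTML. Below is a minimal sketch using Python's standard-library `html.parser`; the sample page and its `href` are invented for illustration, and `find_canonical` is a hypothetical helper.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Record the href of the first <link rel="canonical"> tag encountered."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html: str):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical printer-friendly page pointing back to the main version.
page = """<html><head>
<title>Page 1</title>
<link rel="canonical" href="http://www.example.com/page1">
</head><body></body></html>"""

print(find_canonical(page))
```

Running this against each duplicate URL should show them all pointing to the one version you want ranked.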

2) 301 Redirect

A 301 redirect permanently sends visitors and search engines from a legacy URL to a new one. It tells Google to pass all the link authority from the legacy pages to the new URL and to rank that URL for relevant search queries.

The 301 redirect is the best option when you don’t have any need for multiple versions of a page to be available.
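In practice, redirects are usually configured in your web server or CMS, but the logic is simple enough to sketch with Python's built-in `http.server`. The paths in `REDIRECTS` are hypothetical, and this is an illustration of the idea rather than a production setup.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical mapping of legacy paths to the URL that should replace them.
REDIRECTS = {
    "/old-page1": "/page1",
    "/page1-v2": "/page1",
}


def resolve(path: str):
    """Return (status, location): 301 with a target for legacy paths, else 200."""
    target = REDIRECTS.get(path)
    if target is not None:
        return 301, target
    return 200, None


class RedirectHandler(BaseHTTPRequestHandler):
    """Answer legacy paths with a permanent redirect to the new URL."""

    def do_GET(self):
        status, location = resolve(self.path)
        self.send_response(status)
        if location:
            # The Location header tells browsers and crawlers where the page now lives.
            self.send_header("Location", location)
        self.end_headers()


print(resolve("/old-page1"))
print(resolve("/page1"))
```

To try it locally, you could serve the handler with `HTTPServer(("localhost", 8000), RedirectHandler).serve_forever()` after importing `HTTPServer` from `http.server`.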

3) Meta Tags 

You can use meta tags to tell search engines not to index a particular page.

<html>
<head>
<title>…</title>
<!-- Ask search engines not to index this page or follow its links -->
<meta name="robots" content="noindex, nofollow">
</head>

Meta tags work best when you want that page to be available to the user but not indexed, e.g. terms and conditions. 
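To confirm that a page actually carries the directive, you can look for the robots meta tag in its HTML. This is a minimal sketch using Python's standard-library `html.parser`; `is_noindex` is a hypothetical helper and the sample pages are invented.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma separated, e.g. "noindex, nofollow".
            for directive in content.split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def is_noindex(html: str) -> bool:
    """Return True if the page asks search engines not to index it."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


terms_page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
normal_page = "<html><head><title>Page 1</title></head></html>"

print(is_noindex(terms_page))
print(is_noindex(normal_page))
```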

Duplicate content is a real problem for sites, but one that can be easily solved using the advice above. If you want to learn more about duplicate content, watch this video series from the SEO experts at Dejan SEO on how you can fix it for your site.

Topics: Technical SEO