Avoiding Duplicate Content Issues when Migrating to HTTPS

If you’re not careful, you could create a headache for your SEO team.

We like to say that installing SSL is a fairly easy process. And it is. But there are some mistakes you can make that prove costly. Typically we toss out the example of bad key pinning. And that can be a problem. But one of the biggest issues sites face when they install SSL is their migration to HTTPS.

Remember, you’re not really creating an entirely new website, but if you do things incorrectly Google will mistakenly think you have and, at least in the interim, it’s going to dock you for having duplicate content. That’s because you’re going to be serving your entire site over a different protocol. HTTP and HTTPS are different: one encrypts the connection between client and server, and the other doesn’t.

Google sees these two URLs:

  • https://example.com
  • http://example.com

As two different pages with duplicate content. They are technically two different pages, too. So, how do we avoid this problem?

How do I avoid Google seeing my http and https pages as duplicate content?

You need to use 301 redirects on all of your HTTP pages to point to their HTTPS counterparts. This is an excellent time to remind you that the best practice is to enable SSL on every page of your site. All pages should be served over HTTPS. Having your visitors jump from a secure connection to a non-secure one and then back is not ideal. It puts extra load on your server, because each new handshake is comparatively expensive, and every unencrypted request gives an attacker something to intercept or tamper with.
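To make the one-to-one mapping concrete, here is a minimal Python sketch of the redirect decision. The function name https_redirect is hypothetical, and the sketch assumes URLs without an explicit port; in practice this rule usually lives in your web server or CDN configuration rather than in application code.

```python
from urllib.parse import urlsplit, urlunsplit


def https_redirect(url):
    """Return (301, https_url) for an http:// URL, or None if no redirect is needed.

    Hypothetical helper for illustration; assumes the URL carries no explicit port.
    """
    parts = urlsplit(url)
    if parts.scheme.lower() != "http":
        return None
    # Keep host, path, and query unchanged so every HTTP page
    # points at its own HTTPS counterpart, not at the homepage.
    target = urlunsplit(("https", parts.netloc, parts.path, parts.query, parts.fragment))
    return 301, target


print(https_redirect("http://example.com/blog/post?id=7"))
```

The point of preserving the path and query is that a blanket redirect to the homepage would lose the page-level mapping that the 301s are supposed to communicate.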

Your competitors can use your misconfiguration against you

That’s right, some servers will still respond over HTTPS even when you haven’t installed a certificate for your site, for example by answering with a default or mismatched certificate. As we discussed, Google views that as duplicate content. So, hypothetically, if a competitor links to your HTTP site using HTTPS URLs, Google could start indexing that content as a duplicate.

Then there are servers that won’t respond to HTTPS requests at all when HTTPS isn’t set up and no redirects are in place. In that case the same tactic, linking to your HTTP site with HTTPS links, produces an error message, “Site can’t be reached,” and that’s also going to be harmful.

WWW or non-WWW?

You need to make a choice when you’re migrating – frankly you should have probably already made it, but definitely during migration – as to whether you want to serve your website with or without the WWW. That’s because to Google:

  • https://example.com
  • https://www.example.com

Are two different pages. WWW is actually considered a subdomain, and though most SSL certificates will cover both the WWW and non-WWW variations, search engines and browsers still treat them as separate hosts. So pick one and redirect from the other, lest you confuse Google.
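Combining the protocol choice with the host choice, all four variants of a page should end up at a single URL. A rough Python sketch follows; it assumes the site picked the WWW variant, and CANONICAL_HOST, canonical_url, and redirect_for are illustrative names, not part of any library.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption for this sketch: the site chose the WWW variant.
CANONICAL_HOST = "www.example.com"


def canonical_url(url, canonical_host=CANONICAL_HOST):
    """Map any http/https, www/non-www variant of a page to one canonical URL."""
    parts = urlsplit(url)
    bare = canonical_host[4:] if canonical_host.startswith("www.") else canonical_host
    host = (parts.hostname or "").lower()
    if host not in (bare, "www." + bare):
        return url  # not one of our hosts, leave it alone
    return urlunsplit(("https", canonical_host, parts.path, parts.query, parts.fragment))


def redirect_for(url):
    """Return (301, target) if the URL is not already canonical, else None."""
    target = canonical_url(url)
    return None if target == url else (301, target)


for variant in ("http://example.com/pricing", "https://example.com/pricing",
                "http://www.example.com/pricing", "https://www.example.com/pricing"):
    print(variant, "->", redirect_for(variant))
```

Computing the final target in one step, rather than chaining an HTTP-to-HTTPS redirect into a separate WWW redirect, keeps each request to a single hop.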

Some tips for protecting against Duplicate Content

Here are some suggestions to help you avoid duplicate content errors when you’re migrating to HTTPS:

  • Canonical Tags – Even with redirects, marking your intended page as canonical will help tell Google which page to display in its search results.
  • Test your server – How does your server respond to requests for secure and insecure links? You may need to add more 301s to compensate.
  • Audit your URLs – Use a tool (there are both free and paid ones) to review your URLs for any duplicate content errors.
  • Check for 404s – This is just good hygiene. Use Google Search Console to find and fix any 404 errors your site is producing.
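As a starting point for the first two tips, the Python sketch below lists the four URL variants worth requesting when you test your server, and pulls the canonical link out of a page’s HTML so you can confirm it names the HTTPS version. The names url_variants, CanonicalFinder, and find_canonical are hypothetical, the sample page is made up, and actually fetching each variant is left to whatever HTTP client or crawler you already use.

```python
from html.parser import HTMLParser


def url_variants(bare_host, path="/"):
    """List the http/https and www/non-www variants of one page to test."""
    return [f"{scheme}://{prefix}{bare_host}{path}"
            for scheme in ("http", "https")
            for prefix in ("", "www.")]


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Made-up page content for illustration only.
page = '<html><head><link rel="canonical" href="https://example.com/pricing"></head></html>'
print(url_variants("example.com", "/pricing"))
print(find_canonical(page))
```

If a variant returns a 200 instead of a 301, or a page’s canonical link still starts with http://, that is a likely source of duplicate content to fix.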

We hope this helps, and if you have any comments or questions, leave them in the comments section.

Author

Patrick Nohe

Patrick started his career as a beat reporter and columnist for the Miami Herald before moving into the cybersecurity industry a few years ago. Patrick covers encryption, hashing, browser UI/UX and general cyber security in a way that’s relatable for everyone.