As search bots crawl your website, they begin indexing pages based on their topic and relevance to that topic. Once indexed, your page is eligible to rank on the SERPs. Here are a few factors that can help your pages get indexed.
Indexability Checklist
- Unblock search bots from accessing pages.
- Remove duplicate content.
- Audit your redirects.
- Check the mobile-responsiveness of your site.
- Fix HTTP errors.
1. Unblock search bots from accessing pages.
You’ll likely take care of this step when addressing crawlability, but it’s worth mentioning here. You want to ensure that bots are sent to your preferred pages and that they can access them freely. You have a few tools at your disposal to do this. Google’s robots.txt tester will give you a list of pages that are disallowed, and you can use the Google Search Console’s Inspect tool to determine why a given page is blocked.
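As a rough local complement to those tools, the sketch below uses Python's standard `urllib.robotparser` to list which URLs a set of robots.txt rules would block for a given crawler. The sample rules, the URLs, and the `blocked_urls` helper are illustrative assumptions, not part of any Google tool, and Google's own parser may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt rules disallow for user_agent."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network request is needed here.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical rules and URLs, for illustration only.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

SAMPLE_URLS = [
    "https://example.com/",
    "https://example.com/private/report.html",
]

blocked = blocked_urls(SAMPLE_ROBOTS, SAMPLE_URLS)
print(blocked)
```

To check a live site, you could read its robots.txt file yourself and pass the text to the same helper.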
2. Remove duplicate content.
Duplicate content confuses search bots and negatively impacts your indexability. Remember to use canonical URLs to establish your preferred pages.
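To spot-check which canonical URL a page declares, a minimal sketch with Python's standard `html.parser` might look like the following. The `CanonicalFinder` class, the `find_canonical` helper, and the sample markup are hypothetical names and data chosen for illustration.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, so split before checking.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html):
    """Return the declared canonical URL, or None if the page has none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical page markup, for illustration only.
SAMPLE_HTML = """
<html><head>
<link rel="canonical" href="https://example.com/shoes">
</head><body></body></html>
"""

canonical = find_canonical(SAMPLE_HTML)
print(canonical)
```

Running this over each URL variant of a page (for example, with and without tracking parameters) shows whether they all point to the same preferred page.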
3. Audit your redirects.
Verify that all of your redirects are set up properly. Redirect loops, broken URLs, or — worse — improper redirects can cause issues when your site is being indexed. To avoid this, audit all of your redirects regularly.
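One way to audit redirects is to export them as source-to-destination pairs and trace each one. The sketch below assumes such a mapping is already available as a Python dictionary; the `trace_redirects` helper, its status labels, and the sample paths are hypothetical.

```python
def trace_redirects(start, redirects):
    """Follow start through the redirect map and return (path, status).

    status is "loop" if a URL repeats, "chain" if more than one hop is
    needed, and "ok" otherwise.
    """
    path = [start]
    seen = {start}
    current = start
    while current in redirects:
        current = redirects[current]
        path.append(current)
        if current in seen:
            return path, "loop"
        seen.add(current)
    hops = len(path) - 1
    status = "chain" if hops > 1 else "ok"
    return path, status


# Hypothetical redirect map, for illustration only.
SAMPLE_REDIRECTS = {
    "/a": "/b",
    "/b": "/c",
    "/x": "/y",
    "/y": "/x",
    "/old": "/new",
}

for source in ("/a", "/x", "/old"):
    print(source, trace_redirects(source, SAMPLE_REDIRECTS))
```

Anything reported as a chain can usually be collapsed into a single redirect to the final destination, and anything reported as a loop needs to be fixed before the page can be reached at all.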
4. Check the mobile-responsiveness of your site.
If your website is not mobile-friendly by now, then you’re far behind where you need to be. As early as 2016, Google started indexing mobile sites first, prioritizing the mobile experience over desktop. Today, that indexing is enabled by default. To keep up with this important trend, you can use Google's mobile-friendly test to check where your website needs to improve.
5. Fix HTTP errors.
HTTP stands for HyperText Transfer Protocol, but you probably don’t care about that. What you do care about is when HTTP returns errors to your users or to search engines, and how to fix them.
HTTP errors can impede the work of search bots by blocking them from important content on your site. It is, therefore, incredibly important to address these errors quickly and thoroughly.
Since every HTTP status code is unique and requires a specific resolution, the list below has a brief explanation of each, and you can use the links provided to learn more about each one or how to resolve it.
- 301 Permanent Redirects are used to permanently send traffic from one URL to another. Your CMS will allow you to set up these redirects, but too many of these can slow down your site and degrade your user experience as each additional redirect adds to page load time. Aim for zero redirect chains, if possible, as too many will cause search engines to give up crawling that page.
- 302 Temporary Redirect is a way to temporarily redirect traffic from a URL to a different webpage. While this status code will automatically send users to the new webpage, the cached title tag, URL, and description will remain consistent with the origin URL. If the temporary redirect stays in place long enough, though, it will eventually be treated as a permanent redirect and those elements will pass to the destination URL.
- 403 Forbidden Messages mean that the content a user has requested is restricted based on access permissions or due to a server misconfiguration.
- 404 Error Pages tell users that the page they have requested doesn’t exist, either because it’s been removed or they typed the wrong URL. It’s always a good idea to create 404 pages that are on-brand and engaging to keep visitors on your site (click the link above to see some good examples).
- 405 Method Not Allowed means that your web server recognized the request method (such as GET or POST) but does not allow it for the requested page, resulting in an error message.
- 500 Internal Server Error is a general error message that means your web server is experiencing issues delivering your site to the requesting party.
- 502 Bad Gateway Error is related to a miscommunication, or an invalid response, between website servers.
- 503 Service Unavailable tells you that while your server is functioning properly, it is temporarily unable to fulfill the request, typically because it is overloaded or down for maintenance.
- 504 Gateway Timeout means a server did not receive a timely response from your web server to access the requested information.
Whatever the reason for these errors, it’s important to address them to keep both users and search engines happy, and to keep both coming back to your site.
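When triaging a crawl report, it can help to group the status codes above by type before deciding what to fix first. The sketch below uses Python's standard `http.HTTPStatus` for the official reason phrases; the `classify` and `summarize` helpers and the sample crawl results are hypothetical.

```python
from http import HTTPStatus


def classify(code):
    """Group a status code by its leading digit."""
    if 300 <= code < 400:
        return "redirect"
    if 400 <= code < 500:
        return "client_error"
    if 500 <= code < 600:
        return "server_error"
    return "other"


def summarize(crawl_results):
    """Map each group name to a list of (url, code, reason phrase) tuples."""
    summary = {}
    for url, code in crawl_results.items():
        phrase = HTTPStatus(code).phrase
        summary.setdefault(classify(code), []).append((url, code, phrase))
    return summary


# Hypothetical crawl results, for illustration only.
SAMPLE_CRAWL = {
    "/old-pricing": 301,
    "/members": 403,
    "/missing": 404,
    "/checkout": 503,
}

report = summarize(SAMPLE_CRAWL)
for group, entries in report.items():
    print(group, entries)
```

Server errors generally deserve attention first, since they can block bots from whole sections of a site rather than from a single page.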
Even if your site has been crawled and indexed, accessibility issues that block users and bots will impact your SEO. That said, we need to move on to the next stage of your technical SEO audit — renderability.