1. Create an XML sitemap.
Remember that site structure we went over? That belongs in something called an XML Sitemap that helps search bots understand and crawl your web pages. You can think of it as a map for your website. You’ll submit your sitemap to Google Search Console and Bing Webmaster Tools once it’s complete. Remember to keep your sitemap up-to-date as you add and remove web pages.
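If you'd like to see what a sitemap file actually contains, here is a minimal sketch in Python. It assumes you already have a list of page URLs with last-modified dates; the `build_sitemap` helper and the sample dates are hypothetical, and most CMS platforms will generate this file for you.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML text for an iterable of (url, last_modified) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, last_modified in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "lastmod").text = last_modified
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical pages and dates, for illustration only.
sitemap_xml = build_sitemap([
    ("https://www.bestdogcare.com/blog/how-to-groom-your-dog", "2024-01-15"),
    ("https://www.bestdogcare.com/products/grooming-brush", "2024-02-01"),
])
print(sitemap_xml)
```

Regenerating the file whenever pages are added or removed is one way to keep it up to date.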
2. Maximize your crawl budget.
Your crawl budget refers to the number of pages and resources on your site that search bots will crawl in a given period.
Because crawl budget isn’t infinite, make sure you’re prioritizing your most important pages for crawling.
Here are a few tips to ensure that you’re maximizing your crawl budget:
- Remove or canonicalize duplicate pages.
- Fix or redirect any broken links.
- Make sure your CSS and JavaScript files are crawlable.
- Check your crawl stats regularly and watch for sudden dips or increases.
- Make sure any bot or page you’ve disallowed from crawling is meant to be blocked.
- Keep your sitemap updated and submit it to the appropriate webmaster tools.
- Prune your site of unnecessary or outdated content.
- Watch out for dynamically generated URLs, which can make the number of pages on your site skyrocket.
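To make the duplicate and dynamic URL tips more concrete, here is a rough sketch of how you might spot URLs that differ only by tracking or session parameters. The `IGNORED_PARAMS` list and both helper functions are assumptions for illustration; the parameters that matter will depend on your own site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters that change tracking, not page content.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}


def normalize_url(url):
    """Drop ignored parameters, sort the rest, and trim trailing slashes."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query)
        if key not in IGNORED_PARAMS
    )
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), path, urlencode(kept), "")
    )


def group_duplicates(urls):
    """Map each normalized URL to the raw URLs that collapse into it."""
    groups = {}
    for url in urls:
        groups.setdefault(normalize_url(url), []).append(url)
    # Only groups with more than one member are duplicate candidates.
    return {key: found for key, found in groups.items() if len(found) > 1}
```

Each group this returns is a candidate for a canonical tag or a redirect pointing at the normalized URL.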
3. Optimize your site architecture.
Your website has multiple pages. Those pages need to be organized in a way that allows search engines to easily find and crawl them. That’s where your site structure — often referred to as your website’s information architecture — comes in.
In the same way that a building is based on architectural design, your site architecture is how you organize the pages on your site.
Related pages are grouped together; for example, your blog homepage links to individual blog posts, which each link to their respective author pages. This structure helps search bots understand the relationship between your pages.
Your site architecture should also shape, and be shaped by, the importance of individual pages. The closer Page A is to your homepage, the more pages link to Page A, and the more link equity those pages have, the more importance search engines will give to Page A.
For example, a link from your homepage to Page A demonstrates more significance than a link from a blog post. The more links to Page A, the more “significant” that page becomes to search engines.
Conceptually, a site architecture could look something like this, where the About, Product, News, etc. pages are positioned at the top of the hierarchy of page importance.
Make sure the most important pages to your business are at the top of the hierarchy with the greatest number of (relevant!) internal links.
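One way to check whether your most important pages really have the most internal links is to count inbound links per page. The sketch below assumes you have exported your internal links as a simple mapping of page to linked pages; the `LINKS` data and the `inbound_link_counts` helper are hypothetical.

```python
from collections import Counter

# Hypothetical internal link graph: page -> pages it links to.
LINKS = {
    "/": ["/about", "/products", "/blog"],
    "/blog": ["/blog/how-to-groom-your-dog", "/products"],
    "/blog/how-to-groom-your-dog": ["/products/grooming-brush", "/products"],
    "/products": ["/products/grooming-brush"],
}


def inbound_link_counts(links):
    """Count how many distinct pages link to each page."""
    counts = Counter()
    for source, targets in links.items():
        for target in set(targets):
            if target != source:  # ignore self-links
                counts[target] += 1
    return counts


counts = inbound_link_counts(LINKS)
print(counts.most_common())
```

This only counts links; it doesn't weigh them by link equity, so treat it as a first pass rather than a ranking signal.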
4. Set a URL structure.
URL structure refers to how you structure your URLs, which could be determined by your site architecture. I’ll explain the connection in a moment. First, let’s clarify that URLs can have subdomains, like blog.hubspot.com, and/or subdirectories (also called subfolders), like hubspot.com/blog, that indicate where the URL leads.
As an example, a blog post titled How to Groom Your Dog would fall under a blog subdomain or subdirectory. The URL might be www.bestdogcare.com/blog/how-to-groom-your-dog, whereas a product page on that same site would be www.bestdogcare.com/products/grooming-brush.
Whether you use subdomains or subdirectories or “products” versus “store” in your URL is entirely up to you. The beauty of creating your own website is that you can create the rules. What’s important is that those rules follow a unified structure, meaning that you shouldn’t switch between blog.yourwebsite.com and yourwebsite.com/blogs on different pages. Create a roadmap, apply it to your URL naming structure, and stick to it.
Here are a few more tips about how to write your URLs:
- Use lowercase characters.
- Use dashes to separate words.
- Make them short and descriptive.
- Avoid using unnecessary characters or words (including prepositions).
- Include your target keywords.
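As a rough illustration of these tips, here is a small sketch that turns a page title into a URL slug. The `slugify` helper and the `STOP_WORDS` list are hypothetical; dropping words is optional here because some teams prefer to keep slugs closer to the title.

```python
import re

# Hypothetical list of filler words; adjust it to your own style guide.
STOP_WORDS = {"a", "an", "the", "to", "of", "in", "on", "for", "with", "and"}


def slugify(title, drop_words=()):
    """Lowercase the title, keep ASCII letters and digits, join with dashes."""
    # Note: this simple pattern discards non-ASCII characters.
    words = re.findall(r"[a-z0-9]+", title.lower())
    # Fall back to all words if dropping would leave nothing.
    kept = [word for word in words if word not in drop_words] or words
    return "-".join(kept)


print(slugify("How to Groom Your Dog"))
print(slugify("How to Groom Your Dog", drop_words=STOP_WORDS))
```

Whichever variant you choose, apply it consistently across the site.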
Once you have your URL structure buttoned up, you’ll submit a list of URLs of your important pages to search engines in the form of an XML sitemap. Doing so gives search bots additional context about your site so they don’t have to figure it out as they crawl.
5. Utilize robots.txt.
When a web robot crawls your site, it will first check /robots.txt, the file that implements the Robots Exclusion Protocol. This protocol can allow or disallow specific web robots from crawling your site, including specific sections or even individual pages. If you’d like to prevent bots from indexing a page, you’ll use a noindex robots meta tag. Let’s discuss both of these scenarios.
You may want to block certain bots from crawling your site altogether. Unfortunately, there are some bots out there with malicious intent: bots that will scrape your content or spam your community forums. If you notice this bad behavior, you’ll use your robots.txt to tell them to stay out of your website. In this scenario, you can think of robots.txt as your force field from bad bots on the internet (at least for bots that honor the protocol).
Regarding indexing, search bots crawl your site to gather clues and find keywords so they can match your web pages with relevant search queries. But, as we discussed earlier, you have a crawl budget that you don’t want to spend on unnecessary data. So, you may want to exclude pages that don’t help search bots understand what your website is about, for example, a Thank You page from an offer or a login page.
No matter what, your robots.txt file will be unique depending on what you’d like to accomplish.
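To see both scenarios side by side, here is a sketch that checks a sample robots.txt with Python's standard `urllib.robotparser` module. The bot name `BadBot`, the blocked paths, and the `allowed` helper are made up for illustration; your own rules will differ.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one bot entirely, and keep all other
# bots away from low-value pages.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /thank-you
Disallow: /login
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path):
    """Return True if the rules above let this user agent crawl the path."""
    return parser.can_fetch(agent, path)


print(allowed("BadBot", "/blog/"))         # blocked everywhere
print(allowed("Googlebot", "/blog/"))      # free to crawl
print(allowed("Googlebot", "/thank-you"))  # excluded page
```

Running a check like this before you publish a new robots.txt is a cheap way to confirm that you haven't blocked a page you meant to keep crawlable.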
6. Add breadcrumb menus.
Remember the old fable Hansel and Gretel where two children dropped breadcrumbs on the ground to find their way back home? Well, they were on to something.
Breadcrumbs are exactly what they sound like: a trail that guides users back to the start of their journey on your website. It’s a menu of pages that tells users how their current page relates to the rest of the site.
And they aren’t just for website visitors; search bots use them, too.
Breadcrumbs should do two things: 1) be visible to users so they can easily navigate your web pages without using the Back button, and 2) include structured data markup that gives accurate context to search bots that are crawling your site.
Not sure how to add structured data to your breadcrumbs? Use this guide for BreadcrumbList.
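For a sense of what that markup looks like, here is a sketch that builds schema.org BreadcrumbList data as JSON-LD. The `breadcrumb_jsonld` helper and the sample trail are hypothetical; check the output against the search engine's own documentation before relying on it.

```python
import json


def breadcrumb_jsonld(trail):
    """Build a BreadcrumbList script tag from ordered (name, url) pairs."""
    items = [
        {"@type": "ListItem", "position": position, "name": name, "item": url}
        for position, (name, url) in enumerate(trail, start=1)
    ]
    data = {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": items,
    }
    return (
        '<script type="application/ld+json">\n'
        + json.dumps(data, indent=2)
        + "\n</script>"
    )


# Hypothetical trail for a blog post.
markup = breadcrumb_jsonld([
    ("Home", "https://www.bestdogcare.com/"),
    ("Blog", "https://www.bestdogcare.com/blog"),
    ("How to Groom Your Dog", "https://www.bestdogcare.com/blog/how-to-groom-your-dog"),
])
print(markup)
```

The resulting script tag would sit in the page's HTML alongside the visible breadcrumb menu.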
7. Use pagination.
Remember when teachers would require you to number the pages on your research paper? That’s called pagination. In the world of technical SEO, pagination has a slightly different role but you can still think of it as a form of organization.
Pagination uses code to tell search engines when pages with distinct URLs are related to each other. For instance, you may have a content series that you break up into chapters or multiple webpages. If you want to make it easy for search bots to discover and crawl these pages, then you’ll use pagination.
The way it works is pretty simple. You’ll go to the <head> of page one of the series and use rel="next" to tell the search bot which page to crawl second. Then, on page two, you’ll use rel="prev" to indicate the prior page and rel="next" to indicate the subsequent page, and so on.
It looks like this…
On page one:
<link rel="next" href="https://www.website.com/page-two" />
On page two:
<link rel="prev" href="https://www.website.com/page-one" />
<link rel="next" href="https://www.website.com/page-three" />
Note that pagination is useful for crawl discovery, but is no longer supported by Google to batch index pages as it once was.
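If your series has many pages, you probably won't want to write these tags by hand. Here is a sketch of how a template helper could produce them; the `pagination_links` function is hypothetical, and it assumes you have the series URLs in reading order.

```python
from html import escape


def pagination_links(page_urls, index):
    """Return the prev/next link tags for the page at position `index`."""
    tags = []
    if index > 0:  # every page except the first has a previous page
        previous_url = escape(page_urls[index - 1], quote=True)
        tags.append(f'<link rel="prev" href="{previous_url}" />')
    if index < len(page_urls) - 1:  # every page except the last has a next page
        next_url = escape(page_urls[index + 1], quote=True)
        tags.append(f'<link rel="next" href="{next_url}" />')
    return tags


series = [
    "https://www.website.com/page-one",
    "https://www.website.com/page-two",
    "https://www.website.com/page-three",
]
for position in range(len(series)):
    print(position + 1, pagination_links(series, position))
```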
8. Check your SEO log files.
You can think of log files like a journal entry. Web servers (the journaler) record and store log data about every action they take on your site in log files (the journal). The data recorded includes the time and date of the request, the content requested, and the requesting IP address. You can also identify the user agent, which is the uniquely identifiable software (like a search bot, for example) that makes the request on a user’s behalf.
But what does this have to do with SEO?
Well, search bots leave a trail in the form of log files when they crawl your site. You can determine if, when, and what was crawled by checking the log files and filtering by the user agent and search engine.
This information is useful to you because you can determine how your crawl budget is spent and which barriers to indexing or access a bot is experiencing. To access your log files, you can either ask a developer or use a log file analyzer, like Screaming Frog.
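If you'd rather look at the raw data yourself, here is a sketch that filters log lines by user agent. It assumes your server writes the common "combined" log format; the sample lines, IP addresses, and the `bot_hits` helper are invented for illustration, and your server's format may differ.

```python
import re
from collections import Counter

# Pattern for the "combined" access log format (an assumption about your server).
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

# Made-up sample lines using documentation-only IP addresses.
SAMPLE_LOG = [
    '192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET /blog/how-to-groom-your-dog HTTP/1.1" 200 5123 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.10 - - [10/Oct/2023:13:56:02 +0000] "GET /old-page HTTP/1.1" 404 312 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [10/Oct/2023:13:57:11 +0000] "GET /products/grooming-brush HTTP/1.1" 200 8120 "https://www.bestdogcare.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]


def bot_hits(lines, bot_name="Googlebot"):
    """Count (path, status) pairs requested by user agents containing bot_name."""
    hits = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match and bot_name.lower() in match["agent"].lower():
            hits[(match["path"], int(match["status"]))] += 1
    return hits


hits = bot_hits(SAMPLE_LOG)
print(hits)
```

Keep in mind that a user agent string can be spoofed, so a match on the name alone doesn't prove the request came from the real search bot.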
Just because a search bot can crawl your site doesn’t necessarily mean that it can index all of your pages. Let’s take a look at the next layer of your technical SEO audit — indexability.
Indexability Checklist
As search bots crawl your website, they begin indexing pages based on their topic and relevance to that topic. Once indexed, your page is eligible to rank on the SERPs. Here are a few factors that can help your pages get indexed.
- Unblock search bots from accessing pages.
- Remove duplicate content.
- Audit your redirects.
- Check the mobile-responsiveness of your site.
- Fix HTTP errors.
1. Unblock search bots from accessing pages.
You’ll likely take care of this step when addressing crawlability, but it’s worth mentioning here. You want to ensure that bots are sent to your preferred pages and that they can access them freely. You have a few tools at your disposal to do this. Google’s robots.txt tester will give you a list of pages that are disallowed and you can use the Google Search Console’s Inspect tool to determine the cause of blocked pages.
2. Remove duplicate content.
Duplicate content confuses search bots and negatively impacts your indexability. Remember to use canonical URLs to establish your preferred pages.
3. Audit your redirects.
Verify that all of your redirects are set up properly. Redirect loops, broken URLs, or — worse — improper redirects can cause issues when your site is being indexed. To avoid this, audit all of your redirects regularly.
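As a rough idea of what a redirect audit checks, here is a sketch that walks a redirect map and flags loops and long chains. The `follow_redirects` helper, the `max_hops` limit, and the sample map are all hypothetical; a real audit would read redirects from your server or CMS.

```python
def follow_redirects(start, redirects, max_hops=10):
    """Follow a URL through a redirect map.

    Returns (path_taken, status), where status is "ok", "loop", or "too_long".
    """
    path = [start]
    seen = {start}
    current = start
    while current in redirects:
        current = redirects[current]
        path.append(current)
        if current in seen:
            return path, "loop"  # we came back to a URL already visited
        seen.add(current)
        if len(path) - 1 > max_hops:
            return path, "too_long"
    return path, "ok"


# Hypothetical redirect map: source URL -> destination URL.
REDIRECTS = {
    "/old": "/interim",
    "/interim": "/new",
    "/a": "/b",
    "/b": "/a",
}

for url in ("/old", "/a", "/new"):
    print(url, follow_redirects(url, REDIRECTS))
```

Any result with more than one hop is a chain you could shorten by pointing the first URL straight at the final destination.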
4. Check the mobile-responsiveness of your site.
If your website is not mobile-friendly by now, then you’re far behind where you need to be. As early as 2016, Google started indexing mobile sites first, prioritizing the mobile experience over desktop. Today, that indexing is enabled by default. To keep up with this important trend, you can use Google's mobile-friendly test to check where your website needs to improve.
5. Fix HTTP errors.
HTTP stands for HyperText Transfer Protocol, but you probably don’t care about that. What you do care about is when HTTP returns errors to your users or to search engines, and how to fix them.
HTTP errors can impede the work of search bots by blocking them from important content on your site. It is, therefore, incredibly important to address these errors quickly and thoroughly.
Since every HTTP error is unique and requires a specific resolution, the section below has a brief explanation of each, and you’ll use the links provided to learn more about them or how to resolve them.
- 301 Permanent Redirects are used to permanently send traffic from one URL to another. Your CMS will allow you to set up these redirects, but too many of these can slow down your site and degrade your user experience as each additional redirect adds to page load time. Aim for zero redirect chains, if possible, as too many will cause search engines to give up crawling that page.
- 302 Temporary Redirect is a way to temporarily redirect traffic from a URL to a different webpage. While this status code will automatically send users to the new webpage, the cached title tag, URL, and description will remain consistent with the origin URL. If the temporary redirect stays in place long enough, though, it will eventually be treated as a permanent redirect and those elements will pass to the destination URL.
- 403 Forbidden Messages mean that the content a user has requested is restricted based on access permissions or due to a server misconfiguration.
- 404 Error Pages tell users that the page they have requested doesn’t exist, either because it’s been removed or they typed the wrong URL. It’s always a good idea to create 404 pages that are on-brand and engaging to keep visitors on your site (click the link above to see some good examples).
- 405 Method Not Allowed means that your website server recognized and still blocked the access method, resulting in an error message.
- 500 Internal Server Error is a general error message that means your web server is experiencing issues delivering your site to the requesting party.
- 502 Bad Gateway Error is related to miscommunication, or an invalid response, between website servers.
- 503 Service Unavailable tells you that while your server is functioning properly, it is unable to fulfill the request.
- 504 Gateway Timeout means a server did not receive a timely response from your web server to access the requested information.
Whatever the reason for these errors, it’s important to address them to keep both users and search engines happy, and to keep both coming back to your site.
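Once you have a list of URLs and the status codes they return, a quick way to prioritize fixes is to group them by status class. The sketch below assumes a crawler has already given you (URL, status code) pairs; the `triage` helper and the sample results are hypothetical.

```python
from http import HTTPStatus


def triage(crawl_results):
    """Group crawled URLs into redirects, client errors, and server errors."""
    buckets = {"redirect": [], "client_error": [], "server_error": []}
    for url, code in crawl_results:
        try:
            phrase = HTTPStatus(code).phrase
        except ValueError:  # non-standard status code
            phrase = "Unknown"
        label = f"{url} ({code} {phrase})"
        if 300 <= code < 400:
            buckets["redirect"].append(label)
        elif 400 <= code < 500:
            buckets["client_error"].append(label)
        elif 500 <= code < 600:
            buckets["server_error"].append(label)
    return buckets


# Hypothetical crawl output.
report = triage([("/old", 301), ("/missing", 404), ("/checkout", 503), ("/", 200)])
print(report)
```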
Even if your site has been crawled and indexed, accessibility issues that block users and bots will impact your SEO. That said, we need to move on to the next stage of your technical SEO audit — renderability.
Renderability Checklist
Before we dive into this topic, it’s important to note the difference between SEO accessibility and web accessibility. The latter revolves around making your web pages easy to navigate for users with disabilities or impairments, like blindness or dyslexia, for example. Many elements of online accessibility overlap with SEO best practices. However, an SEO accessibility audit does not account for everything you’d need to do to make your site more accessible to visitors who are disabled.
We’re going to focus on SEO accessibility, or rendering, in this section, but keep web accessibility top of mind as you develop and maintain your site.
An accessible site is based on ease of rendering. Below are the website elements to review for your renderability audit.
Server Performance
As you learned above, server timeouts and errors will cause HTTP errors that hinder users and bots from accessing your site. If you notice that your server is experiencing issues, use the resources provided above to troubleshoot and resolve them. Failure to do so in a timely manner can result in search engines removing your web page from their index as it is a poor experience to show a broken page to a user.
HTTP Status
Similar to server performance, HTTP errors will prevent access to your webpages. You can use a web crawler, like Screaming Frog, Botify, or DeepCrawl, to perform a comprehensive error audit of your site.
Load Time and Page Size
If your page takes too long to load, the bounce rate is not the only problem you have to worry about. A delay in page load time can result in a server error that will block bots from your webpages or have them crawl partially loaded versions that are missing important sections of content. Depending on how much crawl demand there is for a given resource, bots will spend an equivalent amount of resources to attempt to load, render, and index pages. Either way, you should do everything in your control to decrease your page load time.
JavaScript Rendering
Google admittedly has a difficult time processing JavaScript (JS) and, therefore, recommends employing pre-rendered content to improve accessibility. Google also has a host of resources to help you understand how search bots access JS on your site and how to improve search-related issues.
Orphan Pages
Every page on your site should be linked from at least one other page (preferably more, depending on how important the page is). When a page has no internal links pointing to it, it’s called an orphan page. Like an article with no introduction, these pages lack the context that bots need to understand how they should be indexed.
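One simple way to find orphan pages is to compare the full list of pages you know about (from your sitemap, for instance) with the pages that receive at least one internal link. The sketch below assumes you have both as plain Python data; the `find_orphans` helper and the sample site are hypothetical.

```python
def find_orphans(all_pages, links, homepage="/"):
    """Return pages that no other page links to (the homepage is exempt)."""
    linked = {
        target
        for source, targets in links.items()
        for target in targets
        if target != source  # a self-link doesn't count
    }
    return sorted(
        page for page in all_pages if page not in linked and page != homepage
    )


# Hypothetical site: every known page, plus page -> pages it links to.
ALL_PAGES = ["/", "/about", "/blog", "/blog/how-to-groom-your-dog", "/landing/old-offer"]
SITE_LINKS = {
    "/": ["/about", "/blog"],
    "/blog": ["/blog/how-to-groom-your-dog"],
    "/landing/old-offer": ["/"],
}

print(find_orphans(ALL_PAGES, SITE_LINKS))
```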
Page Depth
Page depth refers to how many layers down a page exists in your site structure, i.e. how many clicks away from your homepage it is. It’s best to keep your site architecture as shallow as possible while still maintaining an intuitive hierarchy. Sometimes a multi-layered site is inevitable; in that case, you’ll want to prioritize a well-organized site over shallowness.
Regardless of how many layers are in your site structure, keep important pages, like your product and contact pages, no more than three clicks deep. A structure that buries your product pages so deep in your site that users and bots need to play detective to find them is less accessible and provides a poor experience.
For example, a website URL like this that guides your target audience to your product page is an example of a poorly planned site structure: www.yourwebsite.com/products-features/features-by-industry/airlines-case-studies/airlines-products.
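To measure click depth rather than guess at it, you can run a breadth-first search from the homepage over your internal links. The sketch below assumes a link map where each page in the example URL links only to the next level down; the `click_depths` and `pages_deeper_than` helpers are hypothetical.

```python
from collections import deque


def click_depths(links, homepage="/"):
    """Return the minimum number of clicks from the homepage to each page."""
    depths = {homepage: 0}
    queue = deque([homepage])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest route
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def pages_deeper_than(depths, limit=3):
    """List pages that sit more than `limit` clicks from the homepage."""
    return sorted(page for page, depth in depths.items() if depth > limit)


# Hypothetical link map matching the deeply nested URL above.
DEEP_LINKS = {
    "/": ["/products-features", "/contact"],
    "/products-features": ["/products-features/features-by-industry"],
    "/products-features/features-by-industry": [
        "/products-features/features-by-industry/airlines-case-studies"
    ],
    "/products-features/features-by-industry/airlines-case-studies": [
        "/products-features/features-by-industry/airlines-case-studies/airlines-products"
    ],
}

depths = click_depths(DEEP_LINKS)
print(pages_deeper_than(depths))
```

Adding a direct link from a higher-level page is usually enough to pull a buried page back within the three-click guideline.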
Redirect Chains
When you decide to redirect traffic from one page to another, you’re paying a price. That price is crawl efficiency. Redirects can slow down crawling, increase page load time, and render your site inaccessible if those redirects aren’t set up properly. For all of these reasons, try to keep redirects to a minimum.
Once you've addressed accessibility issues, you can move on to how your pages rank in the SERPs.
Rankability Checklist
Now we move to the more topical elements that you’re probably already aware of — how to improve ranking from a technical SEO standpoint. Getting your pages to rank involves some of the on-page and off-page elements that we mentioned before but from a technical lens.
Remember that all of these elements work together to create an SEO-friendly site. So, we’d be remiss to leave out all the contributing factors. Let’s dive into it.
Internal and External Linking
Links help search bots understand where a page fits in the grand scheme of a query and give context for how to rank that page. Links guide search bots (and users) to related content and transfer page importance. Overall, linking improves crawling, indexing, and your ability to rank.
Backlink Quality
Backlinks (links from other sites back to your own) provide a vote of confidence for your site. They tell search bots that External Website A believes your page is high-quality and worth crawling. As these votes add up, search bots notice and treat your site as more credible. Sounds like a great deal, right? However, as with most great things, there’s a caveat. The quality of those backlinks matters a lot.
Links from low-quality sites can actually hurt your rankings. There are many ways to get quality backlinks to your site, like outreach to relevant publications, claiming unlinked mentions, and providing helpful content that other sites want to link to.
Content Clusters
We at HubSpot have not been shy about our love for content clusters or how they contribute to organic growth. Content clusters link related content so search bots can easily find, crawl, and index all of the pages you own on a particular topic. They act as a self-promotion tool to show search engines how much you know about a topic, so they are more likely to rank your site as an authority for any related search query.
Your rankability is the main determinant in organic traffic growth because studies show that searchers are more likely to click on the top three search results on SERPs. But how do you ensure that yours is the result that gets clicked?
Let’s round this out with the final piece to the organic traffic pyramid: clickability.
Clickability Checklist
While click-through rate (CTR) has everything to do with searcher behavior, there are things you can do to improve your clickability on the SERPs. While meta descriptions and page titles with keywords do impact CTR, we’re going to focus on the technical elements because that’s why you’re here.
- Use structured data.
- Win SERP features.
- Optimize for Featured Snippets.
- Consider Google Discover.
Ranking and click-through rate go hand-in-hand because, let’s be honest, searchers want immediate answers. The more your result stands out on the SERP, the more likely you’ll get the click. Let’s go over a few ways to improve your clickability.
1. Use structured data.
Structured data employs a specific vocabulary called schema to categorize and label elements on your webpage for search bots. The schema makes it crystal clear what each element is, how it relates to your site, and how to interpret it. Basically, structured data tells bots, “This is a video,” “This is a product,” or “This is a recipe,” leaving no room for interpretation.
To be clear, using structured data is not a “clickability factor” (if there even is such a thing), but it does help organize your content in a way that makes it easy for search bots to understand, index, and potentially rank your pages.
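To show what "This is a product" looks like to a bot, here is a sketch that labels a page with a schema.org type using JSON-LD. The `jsonld_script` helper and the product details are hypothetical, and the properties each search feature expects are defined by the search engine, not by this example.

```python
import json


def jsonld_script(schema_type, **properties):
    """Wrap schema.org properties for one item in a JSON-LD script tag."""
    data = {"@context": "https://schema.org", "@type": schema_type, **properties}
    return (
        '<script type="application/ld+json">\n'
        + json.dumps(data, indent=2)
        + "\n</script>"
    )


# Hypothetical product details, for illustration only.
product_markup = jsonld_script(
    "Product",
    name="Grooming Brush",
    description="A brush for grooming your dog at home.",
    url="https://www.bestdogcare.com/products/grooming-brush",
)
print(product_markup)
```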
2. Win SERP features.
SERP features, otherwise known as rich results, are a double-edged sword. If you win them and get the click-through, you’re golden. If not, your organic results are pushed down the page beneath sponsored ads, text answer boxes, video carousels, and the like.
Rich results are those elements that don’t follow the page title, URL, meta description format of other search results. For example, the image below shows two SERP features — a video carousel and “People Also Ask” box — above the first organic result.
While you can still get clicks from appearing in the top organic results, your chances are greatly improved with rich results.
How do you increase your chances of earning rich results? Write useful content and use structured data. The easier it is for search bots to understand the elements of your site, the better your chances of getting a rich result.
Structured data is useful for getting these (and other search gallery elements) from your site to the top of the SERPs, thereby increasing the probability of a click-through:
- Articles
- Videos
- Reviews
- Events
- How-Tos
- FAQs (“People Also Ask” boxes)
- Images
- Local Business Listings
- Products
- Sitelinks
3. Optimize for Featured Snippets.
One unicorn SERP feature that has nothing to do with schema markup is Featured Snippets, those boxes above the search results that provide concise answers to search queries.
Featured Snippets are intended to get searchers the answers to their queries as quickly as possible. According to Google, providing the best answer to the searcher’s query is the only way to win a snippet. However, HubSpot’s research revealed a few additional ways to optimize your content for featured snippets.
4. Consider Google Discover.
Google Discover is a relatively new algorithmic listing of content by category specifically for mobile users. It’s no secret that Google has been doubling down on the mobile experience; with over 50% of searches coming from mobile, it’s no surprise either. The tool allows users to build a library of content by selecting categories of interest (think: gardening, music, or politics).
At HubSpot, we believe topic clustering can increase the likelihood of Google Discover inclusion and are actively monitoring our Google Discover traffic in Google Search Console to determine the validity of that hypothesis. We recommend that you also invest some time in researching this new feature. The payoff is a highly engaged user base that has basically hand-selected the content you’ve worked hard to create.
The Perfect Trio
Technical SEO, on-page SEO, and off-page SEO work together to unlock the door to organic traffic. While on-page and off-page techniques are often the first to be deployed, technical SEO plays a critical role in getting your site to the top of the search results and your content in front of your ideal audience. Use these technical tactics to round out your SEO strategy and watch the results unfold.