For any site to succeed in SEO, it is vital that search engine bots can crawl and index it efficiently. Google, for example, allocates each site a crawl budget: a limit on how many pages it will crawl per visit. It is therefore imperative to put measures in place that ensure each crawl is spent on high-quality content rather than on duplicate content, overly complicated URLs and low-quality pages.

There are many tools on the market (such as Screaming Frog, Webbee and Moz) that behave like a search engine bot as they crawl your website. The software crawls the pages of the site looking for a vast array of elements, for example:

  • Errors – Client errors such as broken links (4XX), server errors (5XX) and no responses.
  • Redirects – Permanent or temporary redirects (3XX responses).
  • Blocked URLs.
  • External Links – All external links and their status codes.
  • Protocol – Whether the URLs are secure (HTTPS) or insecure (HTTP).
  • Page Titles – Missing, duplicate, over 65 characters, short, pixel width truncation, same as h1, or multiple.
  • Meta Description – Missing, duplicate, over 156 characters, short, pixel width truncation or multiple.
  • File Size – Size of URLs & images.
  • Response Time.
  • H1 & H2 – Missing, duplicate, over 70 characters, multiple.
  • Meta Robots – Index, noindex, follow, nofollow, noarchive, nosnippet, noodp, noydir etc.
  • Meta Refresh – Including target page and time delay.
  • Canonical link element & canonical HTTP headers.
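
As an illustration of the kind of checks these tools run, here is a minimal Python sketch (using only the standard library's html.parser) that inspects a single page for a few of the elements listed above: title length, meta description and H1 count. The thresholds mirror the figures in the list; the class and function names are our own, not from any of the tools mentioned.

```python
from html.parser import HTMLParser

class PageAuditor(HTMLParser):
    """Collects the on-page elements an SEO crawler typically inspects."""
    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta_description = None
        self.h1_count = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "h1":
            self.h1_count += 1
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.meta_description = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

def audit(html):
    """Return a list of issues, mirroring a crawl report's findings."""
    page = PageAuditor()
    page.feed(html)
    issues = []
    if not page.title:
        issues.append("missing title")
    elif len(page.title) > 65:          # truncation threshold from the list above
        issues.append("title over 65 characters")
    if page.meta_description is None:
        issues.append("missing meta description")
    elif len(page.meta_description) > 156:
        issues.append("meta description over 156 characters")
    if page.h1_count == 0:
        issues.append("missing h1")
    elif page.h1_count > 1:
        issues.append("multiple h1 tags")
    return issues
```

A real spider would also follow links, record status codes and measure response times; the sketch only shows the per-page element checks.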

A crawl report from an SEO spider usually reveals a number of issues. In this article we discuss the most common ones and provide recommendations for resolving each.

4xx Page Not Found

This is one of the most common critical errors that can occur on your website, and it is an issue for both SEO and UX (user experience). Search engines may rank your site lower if it has too many 404 errors, and users will get frustrated if they cannot find the content they are searching for.

4xx errors occur when a page cannot be found by the search engine. This can happen for a number of reasons, but the main causes tend to be that the page no longer exists or that there is an error in a link on your website.  (Example below from AirBnB).


It is important to consider any external links that point to the 404 URL in question. You may be missing out on valuable link equity* and potentially wasting your marketing budget if you have paid for the external link.

* Link equity is very important as this is a search engine ranking factor based on certain links passing on value and page authority to another page. This value is highly dependent on factors such as the page authority of the linking URL, topical relevance and HTTP status, among others. Link equity is one of many elements which search engines use to determine a page’s ranking in SERPs (search engine results pages).

The Solution:

We recommend asking a developer to update each page returning a 4xx error with a redirect pointing to a relevant (fully functional) URL. If there are a lot of 4xx errors on your site, it is best to prioritise the work by starting with high-authority pages and pages with a lower crawl depth.

There are also things you can do yourself before referring this issue to a developer: you can remove the broken URL from the page in question. Within your crawl report, the referring URL shows where the error URL was found, so this is the page from which you need to remove the link if the target page no longer exists. If the page still exists, it is recommended that you remove and re-add the link in question, as it may have been added incorrectly.

If neither of the above applies, there may be a problem with the 404 URL page itself. Sometimes a page has been deactivated or set to draft; if this is the case when the page should be live, reactivating the page should resolve the 404 error.
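
The prioritisation above can be sketched in a few lines. Assume a crawl report exported as a simple URL-to-status-code mapping, plus per-URL authority and crawl-depth figures (the field names here are hypothetical, not from any particular tool):

```python
def broken_pages(crawl_results):
    """Filter a {url: status_code} crawl export down to client errors (4xx)."""
    return {url: code for url, code in crawl_results.items() if 400 <= code < 500}

def prioritise_fixes(errors):
    """Order broken URLs so shallow, high-authority pages are fixed first.

    `errors` is a list of dicts with hypothetical keys taken from a crawl
    export: 'url', 'authority' (0-100) and 'crawl_depth' (clicks from home).
    """
    return sorted(errors, key=lambda e: (e["crawl_depth"], -e["authority"]))
```

Sorting by depth first, then authority, reflects the advice above: pages close to the homepage are crawled most often, so broken links there do the most damage.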

Redirect to 4xx

When a redirect points to a 4xx error, the redirect in place leads to a page that cannot be found. You have two options when resolving this particular issue: you can fix the 404 page (which can be done using your company's own resources, provided they have access to your website's CMS), or you can have a developer fix the 301 redirect, replacing its target with another relevant, working URL.

The Solution:

If the referring URL is on your own site, your first port of call is to make sure that there are no internal links on the page which are broken.  If the referring URL is external, you can contact the linking site’s owner and request that the link is fixed/ updated on their end.

Slow Load Time

Faster pages rank better in SERPs (Search Engine Results Pages) and have a much better conversion rate.

Pagespeed (not to be confused with site speed) is measured using a small sample of page views on your site. This section discusses page speed in terms of ‘page load time’, the time it takes to load the entire page, rather than ‘time to first byte’, which only measures the time it takes to receive the first byte of data from the web server.

Google has a handy tool which can be used to analyse pagespeed - Google PageSpeed Insights - whose score incorporates data from the Chrome User Experience Report, including the First Contentful Paint (FCP) and DOMContentLoaded (DCL) metrics.

Having a slower pagespeed means that search engines can crawl fewer pages per visit to your website (within their allocated ‘crawl budget’), which could negatively affect the indexing of your site.

Pagespeed is VERY IMPORTANT when it comes to UX (user experience) as pages which take longer to load have higher bounce rates and users tend to spend less time on the page which negatively affects conversions.

The Solution:

There are a number of things you can do to increase the speed of your page including:

  • Compress files
  • Minify CSS, JavaScript and HTML
  • Reduce redirects
  • Remove render-blocking JavaScript
  • Leverage browser caching
  • Improve server response time
  • Use a content delivery network (CDN)
  • Optimise code
  • Optimise images

The majority of the list above would need to be referred to a developer, but most in-house teams should be able to optimise the imagery on your site. It is good practice to optimise images before uploading them to your website, as uploading large images can slow it down MASSIVELY.

To optimise images, you need to ensure that image files are as small as they can be without affecting image quality. Usually this means setting the image to a resolution of 72 PPI, using the sRGB colour space, and reducing the file size for faster page loads.

It should be noted that most mobile devices currently display at 72 pixels per inch. With mobile traffic rising YOY across the board, this should be kept at the forefront of your mind when optimising your content.

There is no one-size-fits-all image format: PNG files tend to be best for graphics, whereas JPEGs are best for photographs.

We recommend compressing any images currently on your website that exceed 72 PPI and creating a content guide that can be referred to by anyone in your organisation who updates the site.
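
The arithmetic behind "as small as possible without affecting quality" is simple: multiply the physical display size by the display density. A minimal sketch, assuming the 72 PPI figure discussed above:

```python
def target_pixels(display_width_in, display_height_in, ppi=72):
    """Pixel dimensions needed to fill a given physical display size
    at the assumed screen density (72 PPI, per the discussion above)."""
    return round(display_width_in * ppi), round(display_height_in * ppi)

def is_oversized(actual_w, actual_h, display_width_in, display_height_in, ppi=72):
    """True if an uploaded image carries more pixels than its display
    slot can ever show - wasted bytes that slow the page down."""
    need_w, need_h = target_pixels(display_width_in, display_height_in, ppi)
    return actual_w > need_w or actual_h > need_h
```

So an image displayed in an 8 × 6 inch slot only ever needs 576 × 432 pixels at 72 PPI; a 3000-pixel-wide camera original uploaded straight to the CMS is pure page weight (high-DPI "retina" screens justify roughly double these figures, a refinement the sketch ignores).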

Missing H1 Tag

When your website is clicked on within SERPs, the searcher expects to see a headline that closely matches the page they visit. Adding an H1 tag can reduce your bounce rate and improve your site's ranking.

So, why is it important to reduce your bounce rate? Well, firstly, a bounce occurs when someone visits your site, views a single page and leaves. When the bounce rate is high, it means that your site's content does not match visitors' expectations of the landing page, so they leave to look at another website.

The Solution:

When a page has a high bounce rate, it is important to review the content against the H1 tag to ensure that the tag is topical. The key to successfully reducing your bounce rate is to ensure that your landing page matches the search result so the user finds what they are looking for; this can increase dwell time, the number of pages visited and conversions.

Use at least one H1 tag that is topically relevant on each page to help search engines quickly crawl and index your site.

It is also important to note that you should check the referred page to ensure that it is live and working correctly.  It is quite common for 4xx errors to be due to a page accidentally being inactive / set to draft.

Duplicate Titles

It is perfectly normal for title tags to contain similar content, such as brand names or keywords that apply to multiple pages, but every title should be unique. It is important to bear in mind who your primary searcher is and how they will search for content on your page when building your titles.

Duplicate titles lower the quality of the UX on your site, so titles should be unique to ensure the optimal user experience and to avoid auto-generated replacement text, which may not encourage as many conversions as a custom-written tag.

The Solution:

Simply rewriting or removing the duplicate title tags is the easiest way to resolve this error.
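
Finding the duplicates in the first place is straightforward to automate. A small sketch, assuming your crawl report gives you a URL-to-title mapping:

```python
from collections import defaultdict

def duplicate_titles(pages):
    """Group URLs by title tag and return any title used on more than
    one page. `pages` maps URL -> title, as exported from a crawl report."""
    by_title = defaultdict(list)
    for url, title in pages.items():
        # Normalise case and whitespace so near-identical titles collide.
        by_title[title.strip().lower()].append(url)
    return {title: sorted(urls) for title, urls in by_title.items()
            if len(urls) > 1}
```

Each entry in the result is one rewrite job: pick the page the title genuinely belongs to and write fresh titles for the rest.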

Duplicate Content

Duplicate content may not sound like a big deal, but it can confuse search engines and affect how your site is indexed and ranked. The consequence can be a decrease in traffic or, in the worst case, your page being filtered out of the SERPs altogether.

Duplicate content makes it difficult for search engines to know which page to index and which to prioritise in their rankings.

The Solution:

This can be resolved in a few different ways when simply changing the content is not an option (although that would be the quickest and easiest route). Firstly, you could have a developer add 301 redirects pointing duplicate pages to the one you want people to visit, or add the rel=canonical tag* pointing to your canonical (most authoritative) page. Alternatively, you could have a developer use the Parameter Handling tool in Google Search Console to prevent Google from crawling duplicate content.

* Canonicalisation is the practice of declaring the true, dominant URL for content when several URLs serve the same content. For example, a single piece of content may have multiple uses, with a page or section of content appearing in multiple locations on a website or across multiple websites. This can result in search engines thinking there is duplicate content and lowering your content in search rankings.
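
A rough way to surface exact duplicates from a crawl is to hash each page's text and group URLs that collide. A simplified sketch (real duplicate detection also catches near-duplicates, which straight hashing will not):

```python
import hashlib
from collections import defaultdict

def duplicate_content_groups(pages):
    """Group URLs serving byte-identical body text - the simplest form
    of duplicate content. `pages` maps URL -> extracted page text."""
    groups = defaultdict(list)
    for url, text in pages.items():
        digest = hashlib.sha256(text.encode()).hexdigest()
        groups[digest].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]
```

Each group is then a candidate for one of the fixes above: pick a canonical URL and point the rest at it with rel=canonical or a 301.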

Missing Meta Description

A meta description is an HTML and XHTML element that describes your page to search engines and to users viewing search results. While the importance of metadata has diminished in the past few years, the attribute still plays a significant role in SEO because of its usability benefits and its importance in encouraging user actions (click-throughs).

Search engines and social platforms will choose a meta description for you if one is not present on your page, and more often than not it will be less than optimal and of little interest to users. It is recommended that proactive steps are taken to ensure that all pages have a meta description.

The Solution:

Each page should have a meta description that includes keywords and compelling text that encourages searchers to click. The optimal length is 55 - 300 characters. If your meta description is too long or does not relate to what the searcher is looking for, it can directly impact your ability to drive traffic to your website from organic search results.
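
The length guidance is easy to automate as part of a pre-publish check. A minimal sketch of a validator using the 55 - 300 character range discussed in this article:

```python
def check_meta_description(description):
    """Validate a meta description against the 55-300 character
    guideline discussed above. Returns None when it passes,
    otherwise a short description of the problem."""
    if not description or not description.strip():
        return "missing"
    length = len(description.strip())
    if length < 55:
        return "too short"
    if length > 300:
        return "too long (will be truncated in SERPs)"
    return None
```

Character count is only the mechanical half, of course; uniqueness, keywords and a compelling call-to-action still need a human eye.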

URL Too Long

Your website’s URLs describe the site or page to both visitors and search engines, so they need to be relevant to the page content, compelling and accurate in order to rank well.

The Solution:

We would recommend keeping URLs within the maximum length (75 characters) and, where possible, placing content on the same subdomain to preserve domain authority. Please note that any changed URLs must be given redirects so that visits using the old URL are not lost or directed to a 4xx error.

The optimal format is short, lowercase and hyphen-separated, for example: https://www.example.com/category/page-name
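
These guidelines can be checked automatically before a URL goes live. A sketch using the 75-character limit above plus two common readability conventions (lowercase, hyphens rather than underscores); the rules are illustrative conventions, not a formal standard:

```python
def check_url(url, max_length=75):
    """Flag URLs that breach the length guideline above or use
    case/characters that make them harder to read and share."""
    issues = []
    if len(url) > max_length:
        issues.append("over %d characters" % max_length)
    path = url.split("://", 1)[-1]       # ignore the scheme when checking case
    if path != path.lower():
        issues.append("contains uppercase characters")
    if "_" in path:
        issues.append("uses underscores instead of hyphens")
    return issues
```

An empty list means the URL passes; anything returned is worth fixing (with a redirect from the old URL, as noted above).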

Temporary Redirect

Temporary redirects should be just that - TEMPORARY!  Using HTTP header refreshes or 302 or 307 redirects will cause search engines not to pass link equity on to the target pages.

The Solution:

Replace temporary redirects with permanent 301 redirects - this type of redirect passes 90-99% of link equity (ranking power) to the target page. In most instances a 301 redirect strategy is the best option for implementing redirects on your website.

So what is a 301 redirect?  Well, in simple terms, this means that it is a permanent redirect from one URL to another.

A permanent redirect should be used to send searchers (and search engines) to a different URL than the one they initially typed into their browser or selected from SERPs.

This type of redirect can also be used to link various URLs so that they are all ranked based on the domain authority of their inbound links.

Description too Short

The optimal length for meta descriptions is between 55 - 300 characters. Any shorter than this and you risk lowering your click-through rate by insufficiently describing the page and its contents.

The meta description is very important because it acts as organic ad text. When your page ranks for a keyword, search engines often use the meta description as the page summary, which makes your meta description just as important as your ad text within your digital marketing strategy. Compelling meta descriptions can increase your CTR in organic search, which means you should not treat them as an afterthought; they need to be one of your top considerations.

The Solution:

We recommend updating meta descriptions to ensure that each is within the region of 55 - 300 characters, is unique, accurately summarises the page, contains keywords, and is written in a compelling tone with clear calls-to-action to encourage a higher CTR.

Title Too Long

When titles are too long, they do not display correctly in SERPs (or on social media), limiting your ability to entice users to visit your site. Google only displays roughly the first 60 characters of a title, so it is recommended that all titles are between 10 and 60 characters. It is also possible to adjust the characters used to keep the title under 570 pixels - for example by using fewer “W”s and more “i”s or “l”s.

The Solution:

The title display maxes out at 600 pixels, after which the remaining characters are replaced with “...”. Working to a 570-pixel limit errs on the side of caution and should ensure your titles are displayed in their entirety.
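
Pixel width can be estimated rather than guessed. The per-character widths below are illustrative assumptions (real SERP rendering depends on Google's font and device), but the approach - sum approximate character widths and compare against the 570px safety threshold - is the one described above:

```python
# Rough character classes; the widths are assumed values, not measured
# font metrics from Google's actual SERP typeface.
WIDE = set("WMm@")
NARROW = set("iljtfI.,'| ")

def estimate_title_pixels(title, wide=15, narrow=5, default=10):
    """Estimate a title's rendered width in pixels from per-class
    character widths (all three widths are illustrative assumptions)."""
    return sum(wide if c in WIDE else narrow if c in NARROW else default
               for c in title)

def fits_serp(title, limit=570):
    """True if the estimated width stays under the 570px safety threshold."""
    return estimate_title_pixels(title) <= limit
```

Even a crude estimator like this catches the obvious offenders (all-caps, W-heavy titles) long before they are truncated in live results.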

Description Too Long

When it comes to meta descriptions, the optimal length is 55 - 300 characters. When your meta description exceeds the recommended length, it may get cut off by search engines (after 300 characters your description is displayed as “...”) and your click-through rate (CTR) could suffer as a result.

This is an issue not only for SEO but for UX (user experience) as well. Searchers who find your site via search engines will be less likely to click if your descriptions do not fully display, as the cut-off may hide important information about your page and you could lose valuable leads.

The Solution:

It is recommended that meta descriptions are reduced to within the region of 55 - 300 characters. Other important factors to consider when writing meta descriptions are:

  1. Is it unique, and does it accurately summarise the page?
  2. Does it contain keywords? If not, it should.
  3. Is it written in a compelling tone with clear calls-to-action? If so, it will encourage a higher CTR.

Title Too Short

Really short titles (under 10 characters) are unlikely to sufficiently describe the page to the searcher and will not include keywords for SEO. From a user perspective, they most likely will not sufficiently describe the page, which may lower your CTR.

The optimal length for titles is between 10 - 60 characters, so keep this in mind when naming your pages. Page titles that use keywords are best, and they also need to accurately describe your page.

The Solution:

The recommended length of a title is between 10 and 60 characters. This best practice ensures that customers see the full title and avoids search engines automatically generating a title for your page. If a search engine creates a title for your page, chances are it will not be compelling enough to encourage click-throughs.

Redirect Chain

Redirect chains are typically caused when multiple redirect rules are in place - for example, redirecting ‘www’ to a non-www URL and a non-secure (http:) page to a secure (https:) page. With every redirect hop you lose link equity*, so it is important to resolve these chains to protect it.

* Link equity, as noted earlier, is a search engine ranking factor based on certain links passing value and page authority from one page to another.

The Solution:

Identify redirect chains that can be rewritten into a single rule. Particular care should be taken when dealing with chains mixing 301 and 302 redirects in any combination: with 302 redirects in the mix, the 301's ability to pass on link equity may well be disrupted.
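
The fix above - collapsing each chain into a single hop - can be sketched over a plain source-to-target redirect map. The loop guard also protects against accidental redirect cycles:

```python
def collapse_redirects(rules):
    """Rewrite a {source: target} redirect map so every source points
    directly at its final destination, removing intermediate hops."""
    flat = {}
    for src in rules:
        seen, target = {src}, rules[src]
        # Follow the chain until we reach a URL that is not itself
        # redirected (or we detect a cycle and stop).
        while target in rules and target not in seen:
            seen.add(target)
            target = rules[target]
        flat[src] = target
    return flat
```

So a chain like http → https → https-www becomes two independent single-hop rules, each pointing straight at the final URL, and no hop loses equity along the way.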

Need help with your SEO? Contact Us