How to Find Soft 404 Errors on Your Website

A soft 404 is a type of error where a web server returns a 200 OK status code (indicating that the request succeeded), even though the delivered page doesn’t contain the expected content, and a 404 Not Found status would have been the appropriate response.

Think of a page with no or very little content, a page with an error message, or a search results page without any results – that’s what a soft 404 error looks like to you in the browser, despite the server sending a status code 200 in the HTTP response headers as if there were no problem.

Why Are Soft 404 Errors Problematic?

Soft 404 errors create a bad user experience, just like regular 404 errors. Clicking on a link, waiting for the page to load, and then not finding the expected content is frustrating and can damage the website’s credibility.

Soft 404s can also hurt the site’s search rankings. Users who land on one tend to leave quickly, and engagement metrics such as bounce rate and time on page are commonly regarded as signals of how relevant and valuable the content is. In addition, soft 404s consume crawl resources: search engines keep crawling unimportant pages instead of important ones, which can reduce crawl frequency for the pages that matter, decrease indexation, and ultimately lower the website’s search visibility.

How to Detect Soft 404 Errors

Detecting soft 404 errors is tricky. You can’t trust the HTTP status code returned by the server but have to examine the page content. Standard link checkers don’t do this and therefore fail to identify soft 404s.

Our link checking solution, on the other hand, can rely on a large database of content patterns to automatically identify different kinds of soft 404s on a website. Starting with the Professional plan, detected soft errors are reported under the “Soft errors” tab in the sidebar.

[Image: Soft Errors in Dr. Link Check]
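To illustrate what examining the page content can mean in practice, here is a minimal Python sketch of a content-based check. It is not how Dr. Link Check works internally; the phrase list, the length threshold, and the function names are assumptions chosen for illustration, and a heuristic this simple will produce both false positives and false negatives.

```python
import re

# Hypothetical phrases that often appear on error pages; a real checker
# would need a much larger, site-specific list.
ERROR_PHRASES = (
    "page not found",
    "no results found",
    "nothing found",
    "no longer available",
)

MIN_TEXT_LENGTH = 200  # assumed threshold for "very little content"


def visible_text(html: str) -> str:
    """Crudely strip scripts, styles, and tags to approximate visible text."""
    html = re.sub(r"(?is)<(script|style)\b.*?</\1>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", html)
    return re.sub(r"\s+", " ", text).strip()


def looks_like_soft_404(status: int, html: str) -> bool:
    """Return True if a 200 response carries error-like or near-empty content."""
    if status != 200:
        return False  # a real error status is not a *soft* error
    text = visible_text(html).lower()
    if len(text) < MIN_TEXT_LENGTH:
        return True
    return any(phrase in text for phrase in ERROR_PHRASES)
```

Pages flagged this way are candidates for a manual look, not confirmed errors.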

An alternative (and free) way to identify at least some of the soft 404 errors is to check out the site’s “Indexing → Pages” report in Google Search Console. This report lists crawl errors, including soft 404s, that Google encountered when indexing your pages.

[Image: Google Search Console: Soft 404]

Another resource worth a look is your website’s analytics data. Look for pages with particularly high bounce rates or low time-on-page values, as these can be indicators of soft 404 errors.
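If your analytics tool lets you export per-page figures, a short script can shortlist the suspicious pages. The Python sketch below assumes a hypothetical export with a bounce rate (as a fraction) and an average time on page (in seconds) per URL; the `PageStats` type and both thresholds are illustrative assumptions, not recommended values.

```python
from dataclasses import dataclass


@dataclass
class PageStats:
    url: str
    bounce_rate: float        # fraction of sessions, 0.0 to 1.0
    avg_time_on_page: float   # seconds


def soft_404_candidates(pages, max_bounce=0.9, min_seconds=5.0):
    """Return pages whose engagement is poor enough to merit a manual look."""
    flagged = [p for p in pages
               if p.bounce_rate >= max_bounce or p.avg_time_on_page <= min_seconds]
    # Worst bounce rate first, so the most suspicious pages are reviewed first.
    return sorted(flagged, key=lambda p: p.bounce_rate, reverse=True)
```

Poor engagement has many possible causes, so treat the result as a list of pages to inspect by hand.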

Last but not least, verify that your server actually sends a 404 status code if a non-existent resource is requested:

  • Open a new browser window or tab.
  • Enter a URL with your website’s domain that you are certain should result in a 404 Not Found error (such as https://www.example.com/this-page-does-not-exist).
  • Open the browser’s developer tools (Control + Shift + I on Windows or Linux, Command + Option + I on macOS).
  • Select the “Network” tab and press Control + R (or Command + R on macOS) to reload the page.
  • Check which code was returned for the page in the “Status” column.

[Image: Chrome DevTools: 404]

If the request was not redirected to a different URL and the server responded with code 200, you have stumbled upon a soft 404 error.
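The same check can be scripted. The following Python sketch uses the standard library’s `http.client`, which does not follow redirects, to request a random path and interpret the status code. The function names and the probe path are assumptions made for this example, and the random path is merely presumed not to exist on your site.

```python
import http.client
import uuid
from urllib.parse import urlsplit


def classify_probe(status: int) -> str:
    """Interpret the status code returned for a URL that should not exist."""
    if status in (404, 410):
        return "ok"            # the server reports missing pages correctly
    if 300 <= status < 400:
        return "redirect"      # inspect the redirect target separately
    if 200 <= status < 300:
        return "soft 404"      # success code for a non-existent page
    return "other"


def probe_missing_page(base_url: str, timeout: float = 10.0) -> tuple[int, str]:
    """Request a random, presumably non-existent path without following redirects."""
    parts = urlsplit(base_url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        # A random path is assumed not to exist on the site.
        conn.request("GET", f"/soft-404-probe-{uuid.uuid4().hex}")
        status = conn.getresponse().status
    finally:
        conn.close()
    return status, classify_probe(status)
```

Calling `probe_missing_page("https://www.example.com")` returns the raw status code together with its classification. A "redirect" result deserves a follow-up look at where the redirect leads.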

What Causes Soft 404 Errors?

Soft 404 errors are frequently the result of an incorrect server configuration or a programming error. Here are two real-life examples:

A website hosted on an Apache web server had a line similar to this in its .htaccess file to configure a custom 404 error page:

    ErrorDocument 404 https://www.example.com/404.html

Instead of serving the content of the 404.html file directly, the server redirected to the URL https://www.example.com/404.html and returned the 404.html file with a 200 OK status. Changing the line to

    ErrorDocument 404 /404.html

fixed the issue.
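A misconfiguration like this is easy to overlook, so it can help to scan the configuration file for it. The Python sketch below flags `ErrorDocument` directives whose target is a full URL rather than a local path. The function name is an assumption, and the parsing is deliberately simple: it handles one directive per line and skips commented-out lines.

```python
import re

# Matches e.g. "ErrorDocument 404 https://www.example.com/404.html"
_ERROR_DOCUMENT = re.compile(r"^\s*ErrorDocument\s+(\d{3})\s+(\S.*?)\s*$", re.IGNORECASE)


def redirecting_error_documents(htaccess_text: str) -> list[tuple[int, str]]:
    """List (status, target) pairs whose target is a full URL rather than a local path."""
    findings = []
    for line in htaccess_text.splitlines():
        match = _ERROR_DOCUMENT.match(line)
        if not match:
            continue  # comments and unrelated directives are skipped
        status, target = int(match.group(1)), match.group(2).strip('"')
        if re.match(r"https?://", target, re.IGNORECASE):
            findings.append((status, target))
    return findings
```

An empty result means no directive of this kind was found. It does not prove that the error pages are served with the right status code.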

In a different case, a website had a custom “404 Not Found” page with the following PHP code at the top:

    <?php header("Status: 200 OK"); ?>

This line caused 200 OK to be sent instead of the correct 404 code. Sending a 404 status instead, for example with PHP’s http_response_code(404), avoids the problem.

Sometimes soft 404s are remnants of a changed website structure or removed content. Products that are no longer available may leave behind empty search result pages, and moved blog posts may leave behind empty categories. In situations like these, it can be a good idea to simply remove the empty pages or the links pointing to them.

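If that’s not possible or practical, you can keep the pages out of search results by including a meta robots tag with the parameter “noindex” (<meta name="robots" content="noindex">) in your pages’ HTML code, or stop search engines from crawling them by adding a disallow rule to your site’s robots.txt file. Note that a disallow rule only blocks crawling: a crawler that is not allowed to fetch a page never sees its noindex tag, so avoid combining the two for the same URL.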
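Once such rules are in place, it is worth verifying that they apply to the intended pages. The Python sketch below uses only the standard library: `html.parser` to look for a robots meta tag and `urllib.robotparser` to evaluate a robots.txt rule. The helper names `has_noindex` and `is_disallowed` are assumptions made for this example.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class _RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(d.strip().lower() for d in content.split(","))


def has_noindex(html: str) -> bool:
    """Return True if the page carries a robots meta tag with "noindex"."""
    parser = _RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def is_disallowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt content forbids fetching the URL."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return not rules.can_fetch(user_agent, url)
```

Both helpers work on text you have already downloaded, so they can be run against a saved copy of a page or of the robots.txt file.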

Conclusion

Soft 404 errors can significantly impact a website’s user experience and search engine visibility. Website owners can identify these errors through the use of tools such as Dr. Link Check and Google Search Console and by carefully examining the website’s analytics. Resolving soft 404s may involve reviewing the server’s configuration files and delving into the website’s source code.

