Broken Link Hijacking: How Broken Links Can Be a Security Risk

Links are the very foundation of the web. They connect web resources with each other and make it possible for visitors to navigate between pages and allow pages to reference images and other content.

Unfortunately, unlike diamonds, links are not forever. They have a tendency to break over time. Companies go out of business, servers are shut down, blog posts get deleted, domains expire… the web is dynamic, and there are lots of reasons why a link that works today might stop working tomorrow.

At best, a broken link is merely annoying and results in a poor user experience. At worst, it can pose a security threat to anyone visiting the website.

Imagine what could happen if Google shuts down their Analytics service and later lets the google-analytics.com domain expire. There would be millions of websites left with obsolete script code that attempts to load and run code from https://www.google-analytics.com/analytics.js. A third-party could snatch up the expired domain and serve malicious JavaScript code under this URL. This is one form of an attack called Broken Link Hijacking.

Broken Link Hijacking is an exploit in which an attacker gains control over the target of a broken link.

Typical candidates for link hijacking include:

  • Links to expired or parked domains that are available to register or purchase.
  • Links to deleted accounts on social media or blogging platforms that can be reclaimed.
  • Links to subdomains that are no longer in use and are vulnerable to a Subdomain Takeover.

Depending on how the hijacked link is embedded into the website’s code, there are different ways to exploit the vulnerability, with varying levels of risks.

Exploit Examples

Script Links

If you have embedded an external script into your website (using code like this: <script src="https://example.com/script.js"></script>) and the link’s domain name gets taken over, an attacker can inject arbitrary code into the site.

You might ask what harm could come from some extra JavaScript code. The answer is plenty. Here are a few examples of how an attacker could exploit this vulnerability:

  • Capture passwords, bank details, or other sensitive information the visitor enters on the site and send that info to another server.
  • Steal the session cookie to gain access to the visitor’s account.
  • Redirect the visitor to another website.
  • Look for and exploit any security vulnerabilities in the visitor’s browser or the browser’s plugins.
  • Use the visitor’s computer to mine cryptocurrencies (crypto-jacking).
  • Place ads on the website to generate money (ad-jacking).

The possibility to execute attacker-supplied code basically makes this a Stored Cross-Site Scripting (XSS) vulnerability, which Bugcrowd classifies as a P2 (high risk) issue.

Image and Style Sheet Links

A hijacked link to an image (<img src="https://example.com/image.jpg">) or style sheet (<link href="https://example.com/styles.css" rel="stylesheet">) is not as bad as a hijacked script link, but can still have serious security implications:

  • An attacker can use a hijacked image link to display offensive content meant to harm the website’s reputation, and can lead to penalties issued by the hosting provider or even law enforcement.
  • A hijacked link to a CSS file gives an attacker even more control over the website’s layout, including the ability to add and replace images (background: url("https://example.net/hacked.gif")) and to inject text (body::before { content: "HACKED!" }).

Attacks like these are often referred to as defacement or content spoofing and typically fall into Bugcrowd’s P4 (low risk) category.

It’s also worth noting that each request made to an attacker-controlled external server leaks information about both the website and the visitor. The attacker is able to track who visits the site (IP address, browser user-agent, referring website) and how often.

Page Links

When you link to an external page from your site (<a href="https://example.com/">Link</a>), this link can be seen as a recommendation. You are indicating that the content of the page is relevant and worth a visit, otherwise you wouldn’t have included the link as part of your own content.

Gaining access to the target of the link allows an attacker to exploit the trust that your visitors give you and your recommendation in order to:

  • Trick the visitor into entering account credentials or other sensitive information (phishing).
  • Trick the visitor into installing a malicious browser extension or app (drive-by download)
  • Spread misinformation or inappropriate content.

This is basically an impersonation attack. The attacker pretends that the linked website is legitimate and from a trusted source. Bugcrowd rates Impersonation via Broken Link Hijacking as P4 (low risk).

Prevention Strategies

Subresource Integrity

Subresource Integrity (SRI) allows you to ensure that linked scripts and style sheets are only loaded if they haven’t changed since the page was published. This is accomplished by computing a cryptographic hash of the content and adding it to the <script> or <link> element via the integrity attribute (as a base64-encoded string):

<script src="https://example.com/script.js" integrity="sha384-/u6/iE9tq+bsqNaONz1r5IjNql63ZOiVKQM2/+n/lpaG8qnTYumou93257LhRV8t" crossorigin="anonymous"><script>

Before executing a script or applying a style sheet, the browser compares the requested resource to the expected integrity hash value and discards it if the hashes don’t match.

Content Security Policy

By adding a Content-Security-Policy HTTP header to your server’s responses, you can restrict which domains resources can be loaded from:

Content-Security-Policy: default-src 'self' example.net *.example.org

In this example, resources (such as scripts, style sheets, images, etc.) may only be requested from the site’s own origin (self, excluding subdomains), example.net (excluding subdomains) and example.org (including subdomains). Requests to other origins are blocked by the browser.

A Content Security Policy doesn’t help when one of your trusted domains gets hijacked, but it does make sure that you don’t accidently embed resources from unexpected sources, whether that’s due to a simple typo or an obsolete link on an old and long-forgotten page.

Link Checks

Broken links happen. And when they do, it’s always better to know sooner than later, before an attacker might exploit the issue. Our link checker, Dr. Link Check, allows you to schedule regular scans of your website and notifies you of new link problems by email. Our crawler not only looks for typical issues like 404s, timeouts, and server errors, but also checks if links lead to parked domains.

Quite often, redirects are an early indicator that a link might break soon. When a website is redesigned and restructured, redirects are used to map the old URL structure to the new one. This typically works fine for the first redesign, but with each new restructure, the redirect chains get longer and longer, with more potential breaking points. It’s therefore advisable to keep a close eye on redirected links and update them if necessary.

In order to identify redirects on your website, run a scan in Dr. Link Check and click on one of the items in the Redirects section of the Overview report to see the details.

Redirects

Conclusion

A broken external link doesn’t just disrupt the visitor experience; it can also have serious security implications. An attacker might be able to hijack the broken link and gain control over the link’s target. In the worst case, this can lead to an account takeover and the theft of sensitive data.

Using modern browser security features like Subresource Integrity and Content Security Policy you can mitigate these risks. Regular crawls with a broken link checker help you identify broken links early and reduce the attack surface.


8 Things to Monitor on Your Website

Finishing the initial version of your site is only the first step on your journey as a website owner. Now that you have a website, tracking its vital statistics is crucial for success.

It’s easy to overlook the trivial things that negatively impact your website’s professionalism, security, Google rankings, and ultimately the revenue you make from it.

Luckily, there are a variety of tools and services that take the grunt work out of managing a website. When you maintain these eight essential elements of any successful website, your site’s odds of being successful will increase tremendously.

8 Things to Monitor on Your Website

1. Traffic Statistics

If your website is suddenly getting a lot more (or fewer) visitors, then it’s good to know when it started happening and why so that you can either capitalize on its newfound popularity or refresh your SEO strategy. It’s also useful to see which devices your users browse your site with, which sites they came from, and where they’re located.

User analytics packages (e.g., Google Analytics or Cloudflare Web Analytics) allow you to track a variety of statistics about the people who use your website, and what they do on it. Google Analytics can even send you an email alert if certain conditions are met (e.g., a sudden spike in traffic).

If you’re getting a lot of hits to your homepage but only a few purchases, then you can also see how much time users are spending on your site, and how many of them make it to each step in the process of buying something.

2. Broken Links

Links you make to other sites could suddenly stop working at any time if a website that you linked to is revamped, or a domain is sold to someone else. As bad as 404 errors are for your professionalism, the worst situation is when a website changes hands and redirects to something like a phishing site, or a parked domain full of ads.

To avoid having to manually check every link all the time, Dr. Link Check makes sure that all the links on your entire site (including images and external stylesheets) load correctly, have valid SSL certificates, aren’t on a blacklist of malware and phishing sites, and haven’t been parked.

After crawling every link across your whole website, Dr. Link Check prepares a searchable report and lets you download the data as a CSV file to do your own analytics.

3. Uptime and Performance

A website can’t be successful if it’s down, so services like Uptime Robot and Pingdom check your website’s status every few minutes to make sure it hasn’t encountered an outage. As soon as it does, these services will alert you via an SMS message, email, or various other contact methods, so you can get it working again as quickly as possible.

Uptime Robot can also check protocols other than HTTP/S and generate status pages. Pingdom includes a full performance monitoring and analytics solution, as well. Each will load your site from multiple locations to determine if an outage is only affecting people in a certain geographical area.

4. Security Vulnerabilities

Nothing erodes user trust and confidence quite like a security breach, so it’s of the utmost importance to avoid them entirely. Even if you follow good development practices and keep all your software up to date, it’s still possible to mess up somewhere, leaving an opportunity for a hacker to sabotage your business.

While automated tools aren’t a perfect substitute for a professional security audit, tools including Website Vulnerability Scanner, Mozilla Observatory, and WP-Scan (if you have a WordPress-based site) can help pinpoint configuration errors, XSS and SQL injection bugs, and outdated server software to keep your customers’ data secure.

Whether you simply have a few outdated plugins, or you forgot to sanitize user input in a hardly used form, an automated check can be the quickest way to find security bugs before hackers take advantage of them.

5. Search Rankings

Your site’s place in search results for any given search term is always changing. Therefore, it’s important to be notified if you suddenly slip off the first page of results for a specific query.

SERPWatcher and RankTrackr are services that check your site’s position in search results on a daily basis and send you a message when it suddenly changes. Additionally, both offer dashboards that display all the different keywords that lead to your site, and where your site has ranked for those searches over time.

Many of these services can also track interactions from social media sites and widgets, so you can completely understand how your users find your website.

6. Domain and Certificate Expiration

Forgetting to renew your SSL certificate is just as bad as your site going down unexpectedly, but with the added consequence that many users may lose trust in your site’s security. Worse, not renewing your domain on time could allow someone to buy it and use it for something else entirely.

To avoid potential customers being turned away by “your connection is not secure” errors, set a calendar reminder and use a service like CertsMonitor to make absolutely certain that you renew on time. Many registrars and certificate merchants offer auto-renew, as well, so you can truly “set it and forget it”.

7. SEO Issues on Your Pages

Did you forget to add a title to a page? Did you miss ALT attributes on a few images? No robots.txt? Search engines will drop your page’s position in the rankings if you don’t fix issues like these.

SEOptimer and RavenTools crawl your site and find every instance of SEO mistakes on every page. Implementing the suggestions from either tool can significantly boost your rankings on Google and other engines. Google itself also offers tools to identify issues and assess your site’s speed and mobile compatibility.

8. Backlinks

The PageRank algorithm deep within the heart of Google ranks sites based on the number and quality of links that point to them. The idea is that high-quality websites will be linked from many other well-ranking sites. When your website is linked from a reputable blog or goes viral on a social media platform, you’ll notice that your site is displayed more prominently in search results.

To get notified whenever you get linked from both good and bad sites, Monitor Backlinks will tell you when new links begin pointing to your site. It can also give insight into which websites would give you the most beneficial backlinks.

Conclusion

It’s easy to forget to monitor some vital sign on your website, leading to a significant loss of business. Therefore, using a service to address each of the areas that need to be monitored will allow you to focus on your business, instead of the boring tasks required to keep your website up and running.

From SEO concerns to security, uptime, and even link rot, you can count on these monitoring services to alert you when something goes wrong.


How to Find and Replace Old FTP Links

FTP, short for File Transfer Protocol, is an old standard for transferring files from one computer to another. The protocol was first proposed in 1971, long before the advent of the modern TCP/IP-based internet. In spite of its age, FTP is still commonly used, and hundreds of thousands of websites link to files stored on FTP servers (using URLs that start with ftp://).

Until recently, that wasn’t an issue. All major browsers had built-in support for FTP and were able to handle ftp:// links. This situation is changing. The developers of Chrome disabled FTP in version 88 (released on January 19, 2021), and it’s likely that other browsers will follow suit.

The rationale behind this decision is that FTP in its original form is an insecure protocol that doesn’t support encryption. This is understandable, but practically, it breaks existing ftp:// links for the majority of users.

If you want to make sure that your website is free of ftp:// links, follow the steps below.

1. Start a check at drlinkcheck.com

Go to https://www.drlinkcheck.com/, enter the URL of your website, and press the Start Check button.

Start Link Checker

2. Wait for the crawl to complete

Wait until the check is complete and the website is fully crawled. The number of found ftp:// links is displayed in the Link Schemes section. If there are no ftp: items in the list, the crawler didn’t locate any ftp:// links on your site and you are all good and can skip the rest of the post.

Number of ftp:// links

3. Access the list of FTP links

Click the ftp: item under Link Schemes to get to the list of ftp:// links and review each item in the list. If you hover over a link and hit the Details button, you can see which pages contain the link (under Linked from). A click on Source will show you the exact location in the HTML source code.

FTP link report

4. Replace or remove FTP links on your website

Now it’s time to decide what to do with the found links. Here are a few options:

  • Try replacing “ftp” with “https” or “http” in the URL. Many websites serve the same files via both FTP and HTTP(S).
  • Enclose the filename in quotes ("example-file-20200830.pdf") and enter it in Google. You may find an alternative https:// URL you can link to.
  • Use an FTP client (like WinSCP or Cyberduck) to download the file from the FTP server and re-upload it to your web server. This is only a valid option if you have permission to share the file publicly.
  • Add a note to the page that explains how to download the file using a dedicated FTP client.
  • Simply remove the link or even the entire page if it’s outdated and no longer relevant.

Wrapping it up

Forty years after its introduction, FTP is slowly being phased out as a protocol for serving files on the internet. With major browsers dropping support for FTP, now is a good time to clean up your website and get rid of all FTP links. You surely don’t want your website to appear outdated and broken.


How to Find All Nofollow/Dofollow Links on a Website

Links are one of the most important factors in how search engines determine the ranking of a website. If you place a link from your website to another, search engines consider this a vote for the relevance and quality of the linked content. Just like a recommendation to friends or family in real life, a link is something that you put your good name behind.

If you don’t want to endorse a link, adding rel="nofollow" tells search engines to ignore it when ranking pages. This makes sense for sites that you want your visitors to warn about (like a scam or a hack), user-generated links that you haven’t reviewed yet (like in the comments section of a blog), or links in ads that you were paid for placing on your website.

Considering the importance of marking links as nofollow (or not), it’s a good idea to periodically check if all your outbound links are correctly qualified. With a small website of only a few pages, you may be able to do this by hand, but larger sites require an automated solution. This is where Dr. Link Check comes into play. Our service is not only a great broken link checker, but can also give you an inventory of all the links on your website.

Step 1: Start a check at drlinkcheck.com

Head to the Dr. Link Check homepage, enter the address of your website, and click on Start Check.

Start Link Check

Step 2: Click on "Nofollow"

While crawling your website, Dr. Link Check displays the number of found Dofollow and Nofollow links under Dofollow/Nofollow on the Overview page. Click on Nofollow to list the found links.

Link Check Overview: Nofollow/Dofollow

Step 3: Add "Direction: Outbound" filter item

As you are probably only interested in links pointing to other websites, limit the report to outbound links by clicking the Add button in the Filter bar on the top and selecting Direction from the menu.

Add filter: Direction

Step 4: Add "Link type: <a href>" filter item

The list of links might also contain links found in image (<img src=...>) or script (<script src=...>) tags. If you want to restrict the report to normal page links, click again on Add and select Link type from the filter menu.

Add filter: Link type

With these two filters, you've now identified all outbound nofollow links on your website. If you want to see all dofollow links instead, simply click on Nofollow in the filter bar and select Dofollow from the menu.


Introducing the HtmlElement Rule Property

If you have a paid subscription (Standard plan and above), you can create rules for which links to follow and which links to ignore:

HtmlElement Rule Property

While previously a rule could only be based on the URL of a link (example: Url ENDSWITH ".gif"), it’s now also possible to include or exclude links based on where they are found in the HTML code. As an example, the following exclude rule makes our crawler ignore all <a> tags found inside HTML elements that have class="footer":

HtmlElement = ".footer a"

If you have worked with CSS before, the syntax should already be familiar to you. The table below lists the supported ways of selecting HTML elements:

Selector Description
.class Selects all elements that have the specified class
#id Selects an element based on the value of its ID attribute
element Selects all elements that have the specified tag name
[attr] Selects all elements that have an attribute with the specified name
[attr=value] Selects all elements that have an attribute with the specified name and value
[attr~=value] Selects all elements that have an attribute with the specified name and a value containing the specified word (which is delimited by spaces)
[attr|=value] Selects all elements that have an attribute with the specified name and a value equal to the specified string or prefixed with that string followed by a hyphen (-)
[attr^=value] Selects all elements that have an attribute with the specified name and a value beginning with the specified string
[attr$=value] Selects all elements that have an attribute with the specified name and a value ending with the specified string
[attr*=value] Selects all elements that have an attribute with the specified name and a value containing the specified string
* Selects all elements
A B Selects all elements selected by B that are inside elements selected by A
A > B Selects all elements selected by B where the parent is an element selected by A
A ~ B Selects all elements selected by B that follow an element selected by A (with the same parent)
A + B Selects all elements selected by B that immediately follow an element selected by A (with the same parent)
A, B Selects all elements selected by A and B

Equipped with this knowledge, it’s possible to construct quite powerful rules:

Example Matched HTML elements
HtmlElement = "#search > a, #filter > a" <a> tags directly under elements with the IDs search or filter
HtmlElement = "a[rel~=nofollow]" <a> tags with a rel="nofollow" attribute
HtmlElement = "img[src$=.png]" <img> tags with an src attribute value ending in .png
HtmlElement = ".comments *" Everything inside elements with a comments class
HtmlElement = "head > link[rel=alternate][hreflang=en-us]" All <link rel="alternate" hreflang="en-us"> elements directly inside the <head> tag

This feature has been in beta for a while and has proven to be quite a valuable addition. We hope you will find it as useful as we do. If you run into any problems or have a suggestion, please drop us a note.


40 of the Most Beautiful 404 Pages on the Web

As the owner of Dr. Link Check, a broken link checker app, I have a love-hate relationship with 404 pages. On the one hand, I do everything I can to help our customers find and fix broken links and keep website visitors from running into 404 errors. On the other hand, I truly enjoy a unique and creative 404 page that sets a positive tone and lightens the mood.

While there are still too many websites that use the plain and dull default pages that come with their web servers, some custom 404 pages are gorgeous masterpieces. These hidden error pages are often the places where web designers can show their creativity and express themselves outside the bounds of what was specified by corporate.

For this article, I modified our crawler to search the web for the most beautiful 404 pages. The crawler went through a total of 500,000 websites, reducing the number of candidates to a few thousand, which I then manually narrowed down to the 40 pages presented below.

I hope these designs will give you the inspiration you’ll need to create your own distinctive 404 page. Crafting a pretty error page will not only be fun, but also makes sense from a customer conversion point of view. An appealing 404 page design can take away some of the bitterness of the broken link that a user just encountered. There’s a good chance the user will be pleasantly surprised and stay on your site rather than immediately hitting the back button and never visiting you again.

Now, enough with the chit-chat and on to why you’re actually here: the 404 pages.

1. flippingbook.com – The lost page

flippingbook.com 404 page

2. travel-wise.com – Trip from Madagascar

travel-wise.com 404 page

3. freshservice.com – Ventured too far

freshservice.com 404 page

4. managewp.com – Not the droid you’re looking for

managewp.com 404 page

5. fpsjob.com – Off the map

fpsjob.com 404 page

6. stremio.com – The cat with the wool

stremio.com 404 page

7. gumgum.com – It’s just a mouse

gumgum.com 404 page

8. wm.com – The recycled page

wm.com 404 page

9. insurancecanopy.com – Just like the Yeti

insurancecanopy.com 404 page

10. gamehag.com – Sim sala bim

gamehag.com 404 page

11. zyro.com – Lost in space (1)

zyro.com 404 page

12. inap.com – Lost in space (2)

inap.com 404 page

13. morningscore.io – Lost in space (3)

morningscore.io 404 page

14. sunscrapers.com – Lost in space (4)

sunscrapers.com 404 page

15. foobot.io – That stinks

foobot.io 404 page

16. cognitiveseo.com – Relaxing on the beach

cognitiveseo.com 404 page

17. smartrecruiters.com – Gone too far

smartrecruiters.com 404 page

18. tutorialzine.com – Catman to the rescue

tutorialzine.com 404 page

19. groundhogg.io – Not all who wander are lost

groundhogg.io 404 page

20. freshdesk.com – The secret land of four-oh-four

freshdesk.com 404 page

21. triphistoric.com – This page is history

triphistoric.com 404 page

22. mychooice.com – Hmm, where could it be

mychooice.com 404 page

23. snowys.com.au – Gone walkabout

snowys.com.au 404 page

24. kkday.com – The broken bridge

kkday.com 404 page

25. avast.com – IT department at work

avast.com 404 page

26. nexcess.net – The missing page

nexcess.net 404 page

27. dewaweb.com – Lost in the clouds

dewaweb.com 404 page

28. magicpress.net – Not the end of the world

magicpress.net 404 page

29. adbadger.com – You lost your way

adbadger.com 404 page

30. ventraip.com.au – Dinosaur food

ventraip.com.au 404 page

31. freepik.com – Outside of the universe

freepik.com 404 page

32. troopmessenger.com – Stranded on an island

troopmessenger.com 404 page

33. fork-cms.com – Lost at sea

fork-cms.com 404 page

34. readme.com – Dropped ice cream

readme.com 404 page

35. hopper.com – Lost in the stratosphere

hopper.com 404 page

36. godream.com – Gone with the wind

godream.com 404 page

37. freelancercareers.org – The confused robot

freelancercareers.org 404 page

38. ncino.com – At least the view here is nice

ncino.com

39. syndicode.com – The sad cat

syndicode.com 404 page

40. gdgsoft.com – Trapped in ice

gdgsoft.com 404 page

Wrapping It Up

Compiling this list of 404 pages was a fun thing for me to do. I hope you enjoyed the article and found some inspiration for your own website projects. If you already have a unique 404 page that you’re proud of, or have designed one based on this article, please message me via Twitter @wulfsoft and tell me about it. I’d love to see your creations.


Ältere Posts