About the link crawler

Mabl includes a default link crawler plan ("Check all pages for broken links and errors") that contains one test: "Visit all linked pages within the app" for each application in your workspace. The plan is configured to run against all of your application environments every week. Running the default link crawler plan does not count towards the monthly quota.

If you opted to include credentials for your application when you added it, the link crawler will use those credentials to log in to your application automatically each time it runs a test.

You can change the configuration of the default link crawler plan(s) as you would any other plan.


Mabl visits all linked pages within the domain by default

With no configuration, mabl will visit all linked pages within your domain. That means that if you configure an environment with the URL https://app.example.com, mabl will visit not only pages within https://app.example.com (such as https://app.example.com/page1) but also any other pages on example.com that your app links to. Often, this results in mabl crawling documentation sites (https://docs.example.com) and marketing sites (https://www.example.com).
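The default scope described above can be illustrated with a small Python sketch. This is not mabl's implementation; the function names are hypothetical, and the domain comparison is a deliberate simplification that only looks at the last two labels of the host name.

```python
from urllib.parse import urlsplit


def naive_site_domain(url: str) -> str:
    """Return the last two labels of the host, e.g. 'example.com'.

    Simplifying assumption: multi-label public suffixes such as
    'co.uk' are not handled. Returns '' when the URL has no host.
    """
    host = (urlsplit(url).hostname or "").lower()
    return ".".join(host.split(".")[-2:])


def in_default_scope(base_url: str, link_url: str) -> bool:
    """True if link_url shares the base URL's domain (hypothetical helper)."""
    base = naive_site_domain(base_url)
    return base != "" and base == naive_site_domain(link_url)
```

Under these assumptions, with a base URL of https://app.example.com, both https://app.example.com/page1 and https://docs.example.com/guide are in scope, while a link to a different domain is not.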

When running the link crawler, mabl begins by loading the page at the base URL of your environment. After collecting information about that page (including a screenshot, load time, JavaScript error information, and more), mabl checks the first hyperlink on that page. If the hyperlink is within the domain of your base URL, mabl visits that page and collects the same information. mabl then visits the second link on the original page, then the third, and so forth. mabl continues in this manner until it has visited all of the linked pages in your application, and it generates a broken link report for every link that does not return the expected 2xx response.
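The visiting order described above can be sketched in Python as a breadth-first traversal. Reading the description as breadth-first is an assumption, and so is everything named here: crawl_order, the links_on_page dictionary (a stand-in for loading a page and reading its links), and the in_scope predicate. The sketch uses an in-memory map instead of real network requests so that it stays self-contained.

```python
from collections import deque
from typing import Callable, Dict, List


def crawl_order(base_url: str,
                links_on_page: Dict[str, List[str]],
                in_scope: Callable[[str], bool]) -> List[str]:
    """Return pages in the order a breadth-first crawl would visit them.

    links_on_page is a hypothetical stand-in for the browser: it maps
    a URL to the URLs linked from that page, in document order.
    """
    visited = [base_url]
    seen = {base_url}
    queue = deque([base_url])
    while queue:
        page = queue.popleft()
        for link in links_on_page.get(page, []):
            # Skip links already queued and links outside the domain.
            if link in seen or not in_scope(link):
                continue
            seen.add(link)
            visited.append(link)
            queue.append(link)
    return visited
```

For example, if the base page links to /a, an external site, and /b, and /a links to /c, this sketch visits the base page, /a, /b, and then /c, and never loads the external site.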


Broken links

mabl will report links as broken for the following reasons:

  • The landing page has a 404 error
  • The page did not fully load while mabl was crawling it
  • The page returned a non-2xx response, such as a 503 or a 4xx status
  • The URL was already checked and marked as "broken" in the same run
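The reasons above can be expressed as a small decision function. The following Python sketch is illustrative only: the function name, its parameters, and the returned messages are hypothetical, and status is assumed to be the final HTTP status observed for the link.

```python
from typing import Optional, Set


def broken_reason(url: str, status: int, fully_loaded: bool,
                  known_broken: Set[str]) -> Optional[str]:
    """Return why a link counts as broken, or None if it looks healthy.

    All names here are hypothetical; the checks mirror the list above.
    """
    if url in known_broken:
        return "already marked broken in this run"
    if status == 404:
        return "404 not found"
    if not 200 <= status < 300:
        return f"non-2xx status {status}"
    if not fully_loaded:
        return "page did not fully load"
    return None
```

A caller would add each URL that yields a reason to known_broken so that later occurrences of the same URL in the run are reported without being checked again.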


What counts as a link?

The link crawler searches for links by checking the webpage for <a> HTML elements, also known as anchor elements. Other HTML elements that are intended to perform page navigation on click are not considered links in link crawler tests.
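To make the distinction concrete, the following Python sketch collects only the href values of <a> elements using the standard library's html.parser module. It is not how mabl parses pages, and AnchorCollector and extract_links are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser
from typing import List


class AnchorCollector(HTMLParser):
    """Collect href values from <a> elements only (hypothetical helper)."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        # Buttons or other elements that navigate on click are ignored.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def extract_links(html: str) -> List[str]:
    collector = AnchorCollector()
    collector.feed(html)
    return collector.hrefs
```

In this sketch, a button that navigates through an onclick handler contributes nothing to the result, and neither does an <a> element without an href attribute.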

Links are first validated with a GET or HEAD request to confirm that they return a non-error HTTP status code. A valid link is then loaded into the browser for further analysis and crawling only if it returns one of the following content types:

  • "text/html"
  • "application/xhtml+xml"
  • "application/xml"
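A content-type check along these lines might look like the following Python sketch. The function and constant names are hypothetical, and stripping parameters such as "; charset=utf-8" before comparing is an assumption about how the header would be matched.

```python
CRAWLABLE_TYPES = {"text/html", "application/xhtml+xml", "application/xml"}


def is_crawlable(content_type_header: str) -> bool:
    """True if the response's media type is one of the types listed above.

    Assumption: parameters after ';' (for example the charset) are ignored
    and the comparison is case-insensitive.
    """
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    return media_type in CRAWLABLE_TYPES
```

With this check, a link to a PDF or an image could still pass the status validation, but it would not be opened in the browser or crawled further.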

Using a custom login flow

By default, the mabl link crawler relies on an auto-login flow that attempts to log in to your app automatically before it begins to crawl links. In some cases the auto-login flow is not sufficient to complete authentication, such as when a CAPTCHA is present or when the login form has more than an email field and a password field.

In these cases, a custom login flow is needed to properly crawl the app under test. To add one, duplicate the existing link crawler test, then view the steps list of the duplicate. An option at the top lets you select a custom login flow. This can be any flow in your app; just make sure to select one that logs in properly.

This new link crawler can then be added to any plan, or run manually.