Burp Scanner offers numerous settings that control how scans behave during the crawl phase. You can select these settings when you create or edit scan configurations in Burp Suite Professional or Burp Suite Enterprise Edition.
These settings enable you to tune Burp Scanner's behavior during the crawl phase, to reflect the objectives of the audit and the nature of the target application.
The following settings are available:
Burp Scanner skips the unauthenticated crawl phase if you have provided one or more application logins for it to use. It uses only your provided logins and does not attempt to self-register users or trigger login failures. This can reduce the overall crawl time.
If you don't provide any application logins, the crawler automatically performs an unauthenticated crawl instead.
Specify the maximum number of navigational transitions (clicking links and submitting forms) that the crawler can make from the start URL(s).
Modern applications tend to build navigation into every response, for example in menus and page footers. As such, it is normally possible to reach the vast majority of an application's content and functionality within a small number of hops from the start URL. Fully covering multi-stage processes (such as viewing an item, adding it to a shopping cart, and checking out) requires more hops.
Some applications contain extremely long navigational sequences that don't lead to interesting functionality. For example, a shopping application might have a huge number of product categories, sub-categories, and view filters. To a crawler, this can appear as a very deep nested tree of links, all returning different content. However, there are clearly diminishing returns to crawling deeply into a navigational structure such as this. It's sensible to limit the maximum link depth to a smaller number.
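To make the idea of a link-depth limit concrete, here is a simplified sketch (not Burp Scanner's actual implementation) of a breadth-first crawler that stops making navigational transitions beyond a maximum number of hops. The toy in-memory site graph is hypothetical:

```python
from collections import deque

# Toy site graph standing in for an application's navigation
# (hypothetical data; a real crawler would fetch pages over HTTP).
SITE = {
    "/": ["/products", "/about"],
    "/products": ["/products/cat1", "/products/cat2"],
    "/products/cat1": ["/products/cat1/sub1"],
    "/products/cat1/sub1": ["/products/cat1/sub1/item1"],
    "/products/cat2": [],
    "/about": [],
    "/products/cat1/sub1/item1": [],
}

def crawl(start: str, max_depth: int) -> set[str]:
    """Breadth-first crawl that stops following links beyond max_depth hops."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth == max_depth:
            continue  # don't make another navigational transition from here
        for link in SITE.get(url, []):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return seen
```

With `max_depth=2`, the crawler reaches the category pages but never descends into the deeply nested sub-category tree, which is exactly the trade-off the setting controls.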
Real-world applications differ hugely in the way they organize content and navigation, the volatility of their responses, and the extent and complexity of the application state involved.
At one extreme, a largely stateless application may use a unique, stable URL for each function and return the same content on every visit. At the other, a heavily stateful application might use volatile URLs or responses that depend on the user's earlier actions. The crawler can handle all of these cases. However, this imposes an overhead in the quantity of work involved in the crawl. The crawl strategy setting enables you to tune the approach taken for specific applications.
The default crawl strategy represents a trade-off between speed and coverage that is appropriate for typical applications. However, when you crawl an application with more stable URLs and no stateful functionality, you may want to select the Faster or Fastest setting. When you crawl an application with more volatile URLs or more complex stateful functionality, you may want to select the More complete or Most complete setting.
The Fastest crawl strategy differs from the other crawl strategies in some important ways.
Burp Scanner uses cookies from the cookie jar as initial values. This has a significant impact on authenticated crawling.
Crawling modern applications is sometimes an open-ended exercise due to stateful functionality, volatile content, and unbounded navigation. It's sensible to configure a limit to the extent of the crawl, based on your knowledge of the application being scanned. Burp Scanner uses various techniques to maximize discovery of unique content early in the crawl, to help minimize the impact of limiting the crawl length.
You can limit the crawl based on elapsed crawl time, the number of unique locations discovered, and the number of requests made.
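The notion of bounding an open-ended crawl can be sketched as a simple budget object that a crawl loop checks before each request. The class and parameter names here are hypothetical illustrations, not Burp's API:

```python
import time

class CrawlBudget:
    """Illustrative crawl limits (hypothetical names, not Burp's API):
    the crawl stops when any configured limit is reached."""

    def __init__(self, max_seconds=None, max_requests=None, max_locations=None):
        self.start = time.monotonic()
        self.max_seconds = max_seconds
        self.max_requests = max_requests
        self.max_locations = max_locations
        self.requests = 0
        self.locations = set()

    def record(self, url: str) -> None:
        """Count one request and the (possibly already-seen) location it hit."""
        self.requests += 1
        self.locations.add(url)

    def exhausted(self) -> bool:
        """True once elapsed time, request count, or unique-location count
        reaches its configured limit."""
        if self.max_seconds is not None and time.monotonic() - self.start >= self.max_seconds:
            return True
        if self.max_requests is not None and self.requests >= self.max_requests:
            return True
        if self.max_locations is not None and len(self.locations) >= self.max_locations:
            return True
        return False
```

Because unset limits default to `None`, any combination of the three limits can be applied, mirroring how you can leave individual limit settings unconfigured.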
These settings control how the crawler interacts with login functionality during the crawl.
These settings are not compatible with recorded login sequences. When using recorded logins for a scan, the Login functions settings are ignored.
You can select whether the crawler should attempt to self-register a user, and whether it should deliberately trigger login failures (for example, by submitting invalid credentials).
These settings control how Burp Scanner handles application errors that arise during the crawl phase of the scan, such as connection failures or transmission timeouts.
You can configure the following options.
You can leave any setting blank to deselect it.
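The general pattern of retrying failed requests, and abandoning a crawl after repeated consecutive failures, can be sketched as follows. This is an illustrative sketch with hypothetical parameter names, not Burp Scanner's actual logic:

```python
def crawl_with_error_handling(urls, fetch, max_consecutive_failures=3, retries=2):
    """Fetch each URL, retrying failures a fixed number of times, and
    abandon the crawl after too many consecutive failed requests."""
    results = {}
    consecutive_failures = 0
    for url in urls:
        ok = False
        for _ in range(retries + 1):
            try:
                results[url] = fetch(url)
                ok = True
                break
            except ConnectionError:
                continue  # transient error: retry this request
        if ok:
            consecutive_failures = 0
        else:
            consecutive_failures += 1
            if consecutive_failures >= max_consecutive_failures:
                break  # give up: the target appears to be unreachable
    return results
```

A single failing URL only costs a few retries, but a run of consecutive failures (suggesting the target itself is down) ends the crawl early rather than wasting the remaining request budget.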
These settings enable you to specify timeout values for the crawl. These values override any you may have configured in the global settings.
These settings enable you to customize some additional details of the crawl:
- Request the robots.txt file and extract links from it.
- Request the sitemap.xml file and extract links from it. You can configure the maximum number of items to extract.
- By default, Burp Scanner identifies parameters in URLs using the common delimiter characters / \ ? = &. However, you can use this setting to control this function manually.
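To illustrate what extracting links from these files involves, here is a minimal sketch using Python's standard library. It is not Burp Scanner's implementation, and the sample file contents are hypothetical:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample files a crawler might retrieve from a target.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Allow: /public/
Sitemap: https://example.com/sitemap.xml
"""

SITEMAP_XML = """\
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/products</loc></url>
</urlset>
"""

def links_from_robots(text: str) -> list[str]:
    """Extract the path values of Allow/Disallow rules, which often reveal
    content a crawler would not otherwise discover."""
    links = []
    for line in text.splitlines():
        field, _, value = line.partition(":")
        if field.strip().lower() in ("allow", "disallow") and value.strip():
            links.append(value.strip())
    return links

def links_from_sitemap(xml_text: str, max_items: int) -> list[str]:
    """Extract up to max_items <loc> URLs from a sitemap."""
    ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
    root = ET.fromstring(xml_text)
    locs = [loc.text for loc in root.findall(".//sm:loc", ns)]
    return locs[:max_items]
```

The `max_items` cap on sitemap extraction mirrors the configurable limit mentioned above: large sites can publish sitemaps with many thousands of entries, so an unbounded import could dominate the crawl.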
These settings enable you to control the behavior of Burp's browser.
If you watch the crawl in a headed browser, you may see the crawler open multiple windows and stop using existing ones. This is expected behavior and is not indicative of any issues with the scan. Any redundant windows close automatically after a certain period of time.