What is Crawl Budget?
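Crawl budget is the number of pages a search engine bot will crawl on your site within a given timeframe. It is determined by two factors: the crawl rate limit (how fast the bot can fetch pages without overloading your server) and crawl demand (how much the search engine wants to crawl your pages, based on signals such as their popularity and how often they change).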
For most small-to-medium sites (under 10,000 pages), crawl budget is rarely a concern, because Google can usually crawl the whole site. For larger sites (ecommerce, news, large blogs), crawl budget can become critical.
Common sources of crawl budget waste: faceted navigation that generates millions of parameter URLs, session IDs in URLs, internal search result pages, paginated archive pages, and old or expired content that should be removed or consolidated.
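As a rough illustration of how these categories might be spotted in a crawl export or server log, the sketch below labels a URL by the kind of waste it most likely represents. The parameter names (color, size, sort, sid, sessionid, phpsessid, page) and the /search path prefix are assumptions made for the example, not a standard; a real site would substitute its own URL scheme.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical parameter names and path prefix; adjust to the site's own URLs.
FACET_PARAMS = {"color", "size", "sort"}
SESSION_PARAMS = {"sid", "sessionid", "phpsessid"}
SEARCH_PATH_PREFIX = "/search"


def classify_url(url: str) -> str:
    """Return a coarse label for why a URL might waste crawl budget."""
    parts = urlsplit(url)
    # Collect query parameter names, lowercased, including empty-valued ones.
    params = {name.lower() for name in parse_qs(parts.query, keep_blank_values=True)}

    if params & SESSION_PARAMS:
        return "session_id"
    if parts.path.startswith(SEARCH_PATH_PREFIX):
        return "internal_search"
    if params & FACET_PARAMS:
        return "faceted_navigation"
    if "page" in params:
        return "pagination"
    return "ok"
```

For example, classify_url("https://example.com/shoes?color=red&size=9") is labeled as faceted navigation, while a clean product URL with no query string is labeled "ok". Counting labels over a full URL list gives a quick picture of where crawl requests are likely going.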
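Conservation strategies: robots.txt to block low-value paths, removing or consolidating thin/duplicate pages, flat site architecture, internal linking to important pages, XML sitemap with only canonical URLs. Note that noindex keeps a page out of the index but does not stop it from being crawled, since the bot has to fetch the page to see the directive, so on its own it saves little crawl budget.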
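To make the robots.txt strategy concrete, here is a minimal sketch that checks which URLs a set of Disallow rules would block, using Python's standard urllib.robotparser module. The rules and paths (/search, /cart) are invented for the example. The standard-library parser does simple prefix matching, so wildcard patterns that some search engines support are deliberately left out.

```python
from urllib.robotparser import RobotFileParser

# Example rules only; the blocked paths are assumptions for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /cart
"""


def is_crawlable(url: str, robots_txt: str = ROBOTS_TXT, user_agent: str = "Googlebot") -> bool:
    """Return True if the given rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file content as a list of lines, so no network call is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With these rules, is_crawlable("https://example.com/search?q=boots") returns False and is_crawlable("https://example.com/products/red-sneaker") returns True. Running a check like this over a sample of important URLs before deploying a robots.txt change helps avoid blocking pages that should be crawled.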
Example
An ecommerce site with 50,000 products but 2 million URLs from faceted navigation (color/size/sort combinations) can waste roughly 97% of its crawl budget on low-value pages, assuming the bot spreads its requests evenly across all URLs.
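The arithmetic behind that figure can be sketched as follows. It assumes the 2 million figure is the total number of crawlable URLs (product pages included) and that crawl requests are spread evenly across them, which real crawlers do not strictly do; the function name is hypothetical.

```python
def wasted_crawl_share(valuable_urls: int, total_urls: int) -> float:
    """Fraction of crawlable URLs that are low-value, assuming uniform crawling."""
    if total_urls <= 0 or not 0 <= valuable_urls <= total_urls:
        raise ValueError("expected 0 <= valuable_urls <= total_urls and total_urls > 0")
    return (total_urls - valuable_urls) / total_urls


# 50,000 product pages out of 2,000,000 crawlable URLs.
share = wasted_crawl_share(50_000, 2_000_000)
print(f"{share:.1%}")  # prints 97.5%
```

Under these assumptions, only about 1 in 40 crawl requests would land on a product page.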