May 13, 2026

Multi-domain crawler hardening

The URL crawler now respects robots.txt, enforces plan-tier page caps, and detects JS-rendered pages so the content you index actually matches what visitors see.

The URL crawler got a significant hardening pass. Three things changed:

**robots.txt compliance.** The crawler now fetches and parses robots.txt before indexing any page. If a path is disallowed for our user-agent, we skip it without raising an error and log the skip so you can see what was excluded.
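As a rough illustration of the check described above, here is a minimal sketch using Python's standard `urllib.robotparser`. It is not the crawler's actual code: the `ExampleCrawler` user-agent token, the `split_by_robots` helper, and the logger name are all hypothetical, and the robots.txt body is passed in as a string so the example runs without network access.

```python
import logging
from urllib.robotparser import RobotFileParser

log = logging.getLogger("crawler.robots")  # hypothetical logger name

# Hypothetical user-agent token; the crawler's real token is not stated here.
USER_AGENT = "ExampleCrawler"


def split_by_robots(
    robots_txt: str, urls: list[str], user_agent: str = USER_AGENT
) -> tuple[list[str], list[str]]:
    """Return (allowed, skipped) URLs according to a robots.txt body."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    allowed, skipped = [], []
    for url in urls:
        if parser.can_fetch(user_agent, url):
            allowed.append(url)
        else:
            # Skip without raising, but record the exclusion for later review.
            log.info("robots.txt disallows %s for %s; skipping", url, user_agent)
            skipped.append(url)
    return allowed, skipped


robots_body = "User-agent: *\nDisallow: /admin/\n"
allowed, skipped = split_by_robots(
    robots_body,
    ["https://example.com/docs", "https://example.com/admin/users"],
)
```

In this sketch `allowed` ends up holding only the `/docs` URL and `skipped` holds the `/admin/users` URL, mirroring the skip-and-log behaviour described above.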

**Plan-tier page caps.** Free plans are capped at 3 pages per crawl run, Starter at 25, Pro at 100, and Business at 500. Previously there was no server-side cap, so a misconfigured bot could hit a large domain and ingest thousands of irrelevant pages.
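The caps above could be enforced with something as simple as the following sketch. The numbers come from this entry; the `PAGE_CAPS` table, the lowercase plan keys, and the `apply_page_cap` helper are assumptions made for illustration, not the actual server-side implementation.

```python
# Per-run page caps as listed in this entry; the key names are hypothetical.
PAGE_CAPS = {"free": 3, "starter": 25, "pro": 100, "business": 500}


def apply_page_cap(plan: str, discovered_urls: list[str]) -> list[str]:
    """Truncate a crawl queue to the page cap for the given plan."""
    try:
        cap = PAGE_CAPS[plan.lower()]
    except KeyError:
        raise ValueError(f"unknown plan: {plan!r}") from None
    # Keep discovery order; anything past the cap is never fetched.
    return discovered_urls[:cap]


queue = [f"https://example.com/page/{i}" for i in range(1000)]
free_run = apply_page_cap("Free", queue)
business_run = apply_page_cap("Business", queue)
```

Truncating the queue before fetching, rather than counting pages after ingestion, is one way to make sure a large domain never costs more requests than the plan allows.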

**JS-rendered page detection.** If a page returns near-empty HTML because it's client-side rendered, we detect that and fall back to a headless fetch path so the content you index matches what a real visitor sees.
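This entry does not say how the detection works, so the following is only one plausible heuristic: strip scripts and styles, measure how much visible text is left, and flag the page for a headless fetch when it falls under a threshold. The `MIN_VISIBLE_CHARS` value and the `needs_headless_fetch` name are assumptions chosen for the sketch.

```python
from html.parser import HTMLParser

# Assumed threshold; a real crawler would tune this against known pages.
MIN_VISIBLE_CHARS = 200
_IGNORED_TAGS = {"script", "style", "noscript"}


class _VisibleText(HTMLParser):
    """Collects text that sits outside script, style, and noscript tags."""

    def __init__(self) -> None:
        super().__init__()
        self._ignored_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in _IGNORED_TAGS:
            self._ignored_depth += 1

    def handle_endtag(self, tag):
        if tag in _IGNORED_TAGS and self._ignored_depth > 0:
            self._ignored_depth -= 1

    def handle_data(self, data):
        if self._ignored_depth == 0:
            self.chunks.append(data)


def needs_headless_fetch(html: str, min_chars: int = MIN_VISIBLE_CHARS) -> bool:
    """Return True when the raw HTML carries too little visible text."""
    extractor = _VisibleText()
    extractor.feed(html)
    extractor.close()
    visible = " ".join("".join(extractor.chunks).split())
    return len(visible) < min_chars


spa_shell = (
    "<html><head><title>App</title></head><body>"
    '<div id="root"></div><script>window.boot({"big": "bundle"});</script>'
    "</body></html>"
)
static_page = "<html><body><p>" + "word " * 100 + "</p></body></html>"
```

Here `needs_headless_fetch(spa_shell)` is true because only the title text survives, while `needs_headless_fetch(static_page)` is false. A text-length check like this is cheap to run on every response, which is why it is a common first filter before paying for a headless render.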

Commit: 09181ee
