AI crawler audit checklist: prove your traffic funnel can be crawled
This checklist is designed for small sites that want to attract long-tail search traffic and AI-era discovery without manipulative tactics.
1. Crawl access
- /robots.txt returns 200.
- Public pages are not blocked by User-agent: *.
- OAI-SearchBot policy is explicit if ChatGPT search visibility matters.
- Training-use crawlers (for example GPTBot) have their own rules, separate from search crawlers.
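The crawl-access checks above can be approximated offline with Python's standard urllib.robotparser. This is a minimal sketch, not the site's actual script: SAMPLE_ROBOTS, the /admin/ path, and the crawl_access helper are hypothetical and only illustrate one way to keep a search crawler and a training crawler under separate rules. It does not fetch /robots.txt, so the 200 status still has to be checked against the live site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: public pages open to everyone, an explicit
# group for the search crawler, and a separate group for the training crawler.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /admin/

User-agent: OAI-SearchBot
Allow: /

User-agent: GPTBot
Disallow: /
"""


def crawl_access(robots_text, paths, agents):
    """Return {agent: {path: allowed}} according to the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return {
        agent: {path: parser.can_fetch(agent, path) for path in paths}
        for agent in agents
    }


report = crawl_access(
    SAMPLE_ROBOTS,
    paths=["/", "/admin/"],
    agents=["*", "OAI-SearchBot", "GPTBot"],
)

for agent, results in report.items():
    print(agent, results)
```

With this sample, the generic group keeps "/" open and blocks only /admin/, the search crawler is allowed explicitly, and the training crawler is refused by its own group rather than by the wildcard.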
2. Discovery files
- /sitemap.xml lists all launch pages with canonical HTTPS URLs.
- /llms.txt loads as plain text and includes curated core pages.
- Each sitemap URL returns 200 or a deliberate canonical redirect.
- Footer links expose sitemap, robots, and llms.txt where appropriate.
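As a rough illustration of the sitemap checks, the sketch below parses sitemap XML with the standard library and flags any URL that is not HTTPS. The example.com URLs and the helper names sitemap_urls and non_https are assumptions made for the example. Status codes and redirects are not checked here, because that requires requests against the live site.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content; a real audit would read /sitemap.xml instead.
SAMPLE_SITEMAP = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/checklist</loc></url>
  <url><loc>http://example.com/old</loc></url>
</urlset>
"""


def sitemap_urls(xml_text):
    """Return every <loc> value listed in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def non_https(urls):
    """Return the URLs that do not use the https scheme."""
    return [url for url in urls if not url.startswith("https://")]


urls = sitemap_urls(SAMPLE_SITEMAP)
print("listed:", urls)
print("not https:", non_https(urls))
```

The same list of URLs can then be fed to whatever HTTP client the project already uses to confirm that each one returns 200 or redirects on purpose.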
3. Page quality
- Each page answers one specific query or job.
- Every HTML page has a unique title, a unique meta description, a canonical URL, and a visible H1.
- Claims about crawlers link to official documentation where possible.
- Generated content is reviewed for usefulness and accuracy.
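The per-page tag checks can be sketched with the standard html.parser module. This is an assumption-laden example rather than the site's own verifier: the HeadAudit class, the helper names, and the sample pages are hypothetical. It only confirms that an H1 element exists with text in it. Whether the H1 is actually visible depends on CSS, which this parser does not evaluate.

```python
from collections import Counter
from html.parser import HTMLParser


class HeadAudit(HTMLParser):
    """Collect title, meta description, canonical href, and H1 text."""

    def __init__(self):
        super().__init__()
        self.found = {"title": "", "description": "", "canonical": "", "h1": ""}
        self._capture = None  # tag whose text is currently being collected

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("title", "h1"):
            self._capture = tag
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.found["description"] = attrs.get("content") or ""
        elif tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.found["canonical"] = attrs.get("href") or ""

    def handle_endtag(self, tag):
        if tag == self._capture:
            self._capture = None

    def handle_data(self, data):
        if self._capture:
            self.found[self._capture] += data


def audit_page(html):
    """Return the four audited fields with surrounding whitespace removed."""
    parser = HeadAudit()
    parser.feed(html)
    parser.close()
    return {name: value.strip() for name, value in parser.found.items()}


def missing_fields(html):
    """Return the names of audited fields that are absent or empty."""
    return [name for name, value in audit_page(html).items() if not value]


def duplicate_titles(pages):
    """Given {path: html}, return titles used by more than one page."""
    counts = Counter(audit_page(html)["title"] for html in pages.values())
    return sorted(title for title, n in counts.items() if title and n > 1)


# Hypothetical pages used only to exercise the helpers.
GOOD_PAGE = """<html><head><title>Robots audit</title>
<meta name="description" content="Check crawl access.">
<link rel="canonical" href="https://example.com/robots-audit">
</head><body><h1>Robots audit</h1></body></html>"""

THIN_PAGE = "<html><head><title>Robots audit</title></head><body><h1>Draft</h1></body></html>"

print(missing_fields(GOOD_PAGE))
print(missing_fields(THIN_PAGE))
print(duplicate_titles({"/a": GOOD_PAGE, "/b": THIN_PAGE}))
```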
4. Funnel measurement
- Generator usage is tracked as an activation event.
- Copy button clicks are tracked as a stronger activation event.
- Audit or contact clicks are tracked as conversion intent.
- Search Console is connected after launch so impressions can be measured.
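One way to keep the three event tiers distinct is to map each tracked event name to a funnel stage and count by stage. The event names below (generator_used, copy_clicked, audit_clicked, contact_clicked) and the stage labels are hypothetical, chosen only to mirror the checklist. A real setup would use whatever names the analytics tool already records.

```python
from collections import Counter

# Hypothetical event names mapped to the funnel stages in the checklist.
STAGE_BY_EVENT = {
    "generator_used": "activation",
    "copy_clicked": "strong_activation",
    "audit_clicked": "conversion_intent",
    "contact_clicked": "conversion_intent",
}

STAGES = ("activation", "strong_activation", "conversion_intent")


def funnel_summary(events):
    """Count events per funnel stage, ignoring event names that are not mapped."""
    counts = Counter()
    for name in events:
        stage = STAGE_BY_EVENT.get(name)
        if stage is not None:
            counts[stage] += 1
    return {stage: counts[stage] for stage in STAGES}


# Hypothetical event log for illustration only.
sample_events = [
    "generator_used",
    "generator_used",
    "copy_clicked",
    "page_view",
    "contact_clicked",
]

summary = funnel_summary(sample_events)
print(summary)
```

Keeping the mapping in one table makes it easy to confirm that every tracking marker on the site belongs to exactly one stage.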
Local proof: this site includes a verification script that checks core files, sitemap URLs, meta tags, internal links, robots rules, and tracking markers before deployment.