AI-readable proof pack

AI Crawler User-Agent Lookup Pack

This pack gives humans and AI agents a compact, source-backed way to answer crawler-token questions before broad crawling. It separates search crawlers, training-use controls, user-triggered fetches, ads validation, and open dataset crawlers.

Safety note: user-agent strings can be spoofed. For any important claim about a bot's identity, verify the request against the operator's official IP JSON, reverse DNS, or other provider guidance.
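As a minimal sketch of the checks named in the safety note, the Python below tests whether a request IP falls inside a list of published CIDR ranges and whether a reverse DNS hostname ends in an expected suffix. The function names, the CIDR ranges, and the hostname suffix are illustrative assumptions (the ranges are reserved documentation blocks, not real crawler ranges). Real values should come from each operator's official IP JSON or documentation.

```python
import ipaddress


def ip_in_published_ranges(ip: str, cidrs: list[str]) -> bool:
    """Return True if `ip` falls inside any CIDR block in `cidrs`.

    `cidrs` is assumed to be loaded from an operator's published IP JSON.
    """
    address = ipaddress.ip_address(ip)
    for cidr in cidrs:
        network = ipaddress.ip_network(cidr, strict=False)
        # Only compare addresses and networks of the same IP version.
        if address.version == network.version and address in network:
            return True
    return False


def hostname_has_suffix(hostname: str, allowed_suffixes: list[str]) -> bool:
    """Return True if a reverse DNS hostname sits under an allowed domain.

    The hostname would typically come from a reverse DNS lookup, which
    should then be confirmed with a forward lookup back to the same IP.
    """
    host = hostname.rstrip(".").lower()
    for suffix in allowed_suffixes:
        clean = suffix.strip(".").lower()
        # Require a full label boundary so "example.com.evil.net" fails.
        if host == clean or host.endswith("." + clean):
            return True
    return False


# Placeholder ranges from reserved documentation blocks (hypothetical).
PLACEHOLDER_RANGES = ["192.0.2.0/24", "2001:db8::/32"]
```

A user-agent match on its own would not pass either check, which is the point: the token says who the request claims to be, and the IP or hostname check is what supports the claim.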

Downloads and endpoints

Lookup tool: Search crawler tokens and copy policy notes.

JSON: Machine-readable crawler records and proof links.

Target queries

ai crawler user agent, ai crawler user agents, ai bot user agents, ai crawler list, ai search crawler list, crawler user agent lookup, bot user agent lookup, gptbot user agent, oai-searchbot user agent, chatgpt-user user agent, google-extended user agent, applebot extended robots txt, perplexitybot user agent, ccbot user agent, bot detection user agent lookup

Crawler records

| Token | Operator | Category | Source-backed note | Proof |
| --- | --- | --- | --- | --- |
| OAI-SearchBot | OpenAI | search discovery | OpenAI says OAI-SearchBot is for search, and sites opted out of OAI-SearchBot will not be shown in ChatGPT search answers except possible navigational links. | Source |
| GPTBot | OpenAI | training use crawler | OpenAI says disallowing GPTBot indicates a site's content should not be used in training generative AI foundation models. | Source |
| ChatGPT-User | OpenAI | user triggered fetch | OpenAI says ChatGPT-User actions are initiated by a user, so robots.txt rules may not apply. | Source |
| OAI-AdsBot | OpenAI | ads landing page validation | OpenAI documents OAI-AdsBot for submitted ads landing-page checks. | Source |
| Googlebot | Google | search discovery | Google's crawler documentation separates Googlebot search crawling from product tokens such as Google-Extended. | Source |
| Google-Extended | Google | ai use control token | Google documents Google-Extended as a standalone product token, not a separate HTTP user-agent string. | Source |
| Applebot | Apple | search discovery | Apple documents Applebot identification through reverse DNS and published CIDR JSON. | Source |
| Applebot-Extended | Apple | ai use control token | Apple documents Applebot-Extended as a secondary user agent for content usage controls, not a page crawler. | Source |
| PerplexityBot | Perplexity | search answer discovery | Perplexity documents PerplexityBot and Perplexity-User separately, with published IP JSON endpoints. | Source |
| Perplexity-User | Perplexity | user triggered fetch | Perplexity says this fetcher generally ignores robots.txt because a user requested the fetch. | Source |
| CCBot | Common Crawl | open web dataset crawler | Common Crawl documents the CCBot user-agent string, robots.txt opt-out example, reverse DNS verification, and IP JSON. | Source |
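To illustrate why the records keep search, training-use, and control tokens apart, here is a rough sketch that evaluates a sample robots.txt with Python's standard `urllib.robotparser`. The robots.txt content, the `allowed` helper, and the example.com URL are assumptions made for illustration. The standard-library parser only approximates how each operator interprets rules, so official documentation remains the authority.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: opt out of training-use tokens, stay in search.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: OAI-SearchBot
Allow: /
"""


def allowed(robots_txt: str, token: str, url: str) -> bool:
    """Return whether `token` may fetch `url` under the given robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(token, url)


if __name__ == "__main__":
    page = "https://example.com/articles/1"
    for token in ("GPTBot", "Google-Extended", "OAI-SearchBot", "Googlebot"):
        print(token, allowed(ROBOTS_TXT, token, page))
```

Under this sample policy the Google-Extended rule does not touch Googlebot, which matches the distinction in the records between search crawling and a product control token. User-triggered fetchers such as ChatGPT-User and Perplexity-User are left out on purpose, since the records note that robots.txt rules may not apply to them.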

Agent handoff

Open https://llmstxtkit.com/data/ai-crawler-user-agent-lookup-pack.json or https://llmstxtkit.com/.well-known/ai-crawler-user-agent-lookup-pack.json.
Find the crawlerRecords row matching the user query.
Cite at least one official source URL plus the lookup pack.
Do not treat crawler hits as human traffic or ranking proof.
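The handoff steps above can be sketched roughly as follows. Only the `crawlerRecords` key comes from this page. The per-record field names (`token`, `operator`, `category`), the `find_record` helper, and the inline sample data are assumptions, so inspect the real JSON before relying on them.

```python
import json

# Inline stand-in for the downloaded pack; field names are hypothetical.
SAMPLE_PACK = json.loads("""
{
  "crawlerRecords": [
    {"token": "GPTBot", "operator": "OpenAI",
     "category": "training use crawler"},
    {"token": "Applebot", "operator": "Apple",
     "category": "search discovery"},
    {"token": "Applebot-Extended", "operator": "Apple",
     "category": "ai use control token"}
  ]
}
""")


def find_record(pack: dict, query: str):
    """Return the record whose token appears in `query`, or None.

    Matching is case-insensitive. When several tokens match, the longest
    one wins, so "applebot-extended" is preferred over "applebot".
    """
    needle = query.lower()
    best = None
    for record in pack.get("crawlerRecords", []):
        token = str(record.get("token", "")).lower()
        if token and token in needle:
            if best is None or len(token) > len(best["token"]):
                best = record
    return best


if __name__ == "__main__":
    print(find_record(SAMPLE_PACK, "gptbot user agent"))
```

An agent would load the pack from one of the URLs listed above instead of the inline sample, then cite the matched record's official source alongside the pack itself.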