Blocking OAI-SearchBot by accident
A broad User-agent: * block or copied anti-AI template can accidentally block the crawler tied to ChatGPT search visibility.
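To make the failure mode concrete, here is a minimal sketch that uses Python's standard-library `urllib.robotparser` as a stand-in for a compliant crawler. The helper name `is_allowed`, the two sample files, and the `https://example.com/` URL are illustrative assumptions, and OpenAI's own parser may treat edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt copied from a generic "block everything" template.
BROAD_BLOCK = """\
User-agent: *
Disallow: /
"""

# Same catch-all block, plus an explicit group for the search crawler.
EXPLICIT_ALLOW = """\
User-agent: OAI-SearchBot
Allow: /

User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str = "https://example.com/") -> bool:
    """Return True if this parser would let `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for label, text in (("broad block", BROAD_BLOCK), ("explicit allow", EXPLICIT_ALLOW)):
        for agent in ("OAI-SearchBot", "GPTBot"):
            print(f"{label}: {agent} allowed = {is_allowed(text, agent)}")
```

Under this parser, the catch-all group applies to OAI-SearchBot only when no group names it explicitly, which is why a dedicated `User-agent: OAI-SearchBot` group is the safer way to express intent.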
OpenAI documents multiple crawlers, and the distinction between them has practical consequences: OAI-SearchBot is the crawler that controls ChatGPT search visibility, while GPTBot reflects a separate training-use policy decision.
If ChatGPT search visibility matters to you, allow OAI-SearchBot. Then decide separately whether to allow or block GPTBot based on your content licensing, training-use, and risk policy.
| Crawler | Documented purpose | What blocking can mean | Default for public discovery |
|---|---|---|---|
| OAI-SearchBot | OpenAI search crawler used to surface websites in ChatGPT search features. | Your pages may be excluded from ChatGPT search answers, though navigational links may still appear. | Allow, unless you intentionally want to opt out of ChatGPT search visibility. |
| GPTBot | OpenAI crawler for content that may be used to improve generative AI foundation models. | Signals that site content should not be used for training generative AI foundation models. | Decide separately. It is not the crawler used to manage ChatGPT search opt-outs. |
| ChatGPT-User | User-triggered visits from ChatGPT or Custom GPT actions. | OpenAI says this user agent is not used to determine whether content appears in Search. | Do not use this token as the main search visibility control. |
OpenAI also notes that these crawler settings are independent. A webmaster can allow OAI-SearchBot for search while disallowing GPTBot for training-use preference. OpenAI also says robots.txt updates can take about 24 hours for its systems to adjust for search results.
| Site policy | OAI-SearchBot | GPTBot | Use when |
|---|---|---|---|
| Maximize discovery | Allow | Allow | You want broad OpenAI crawling and have no training-use restriction. |
| Search yes, training no | Allow | Disallow | You want ChatGPT search eligibility but do not want content used for foundation-model training. |
| Opt out of OpenAI crawling | Disallow | Disallow | You intentionally do not want ChatGPT search visibility or GPTBot training-use crawling. |
| Private or paid content | Do not rely on robots.txt | Do not rely on robots.txt | Use authentication, access control, and noindex where appropriate. robots.txt is not security. |

    User-agent: OAI-SearchBot
    Allow: /

    User-agent: GPTBot
    Allow: /

    Sitemap: https://example.com/sitemap.xml

This pattern is the most open policy. It can be right for public marketing pages, product documentation, and open-source docs where the goal is maximum discovery and reuse.

    User-agent: OAI-SearchBot
    Allow: /

    User-agent: GPTBot
    Disallow: /

    Sitemap: https://example.com/sitemap.xml

This second pattern keeps the search crawler open while opting out of GPTBot. It is often the best starting point for publishers, SaaS docs, and brand sites that want ChatGPT search visibility but need a more cautious training-use policy.

    User-agent: OAI-SearchBot
    Disallow: /

    User-agent: GPTBot
    Disallow: /

    Sitemap: https://example.com/sitemap.xml

Use this only if you intentionally want to opt out of both OpenAI search crawling and GPTBot training-use crawling. If traffic from ChatGPT search matters, this is probably not the default you want.
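The three patterns above differ only in two allow/disallow choices, so they can be generated from a pair of flags. The following is a rough sketch under that assumption; the function name `render_robots_txt` and its parameters are hypothetical, and it covers only whole-site rules (`/`), not per-path policies.

```python
def render_robots_txt(allow_search: bool, allow_training: bool, sitemap_url: str) -> str:
    """Build a robots.txt body with one group per OpenAI crawler.

    allow_search   -> rule for OAI-SearchBot (ChatGPT search visibility)
    allow_training -> rule for GPTBot (training-use preference)
    """

    def group(agent: str, allowed: bool) -> str:
        rule = "Allow: /" if allowed else "Disallow: /"
        return f"User-agent: {agent}\n{rule}\n"

    # Joining with "\n" leaves a blank line between groups.
    return "\n".join(
        [
            group("OAI-SearchBot", allow_search),
            group("GPTBot", allow_training),
            f"Sitemap: {sitemap_url}\n",
        ]
    )


if __name__ == "__main__":
    # "Search yes, training no" policy from the table above.
    print(render_robots_txt(True, False, "https://example.com/sitemap.xml"))
```

Keeping the two flags separate mirrors the point that the search and training settings are independent decisions.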
robots.txt tells cooperative crawlers what to access. It does not protect private files from users, links, logs, or non-compliant crawlers.
Using ChatGPT-User as the search control
OpenAI documents ChatGPT-User for user-triggered actions, not for automatic search inclusion decisions.
Record the change date, submit your sitemap after launch, and monitor logs for OAI-SearchBot and GPTBot separately.
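Monitoring the two crawlers separately can start with a simple tally of access-log lines per crawler token. The sketch below assumes plain-text logs that include the user-agent string; the sample lines are made up for illustration and are not real traffic. User-agent strings can be spoofed, so treat the counts as a first pass rather than proof of who visited.

```python
from collections import Counter

# Tokens named in this article; matched case-insensitively anywhere in the line.
CRAWLER_TOKENS = ("OAI-SearchBot", "GPTBot", "ChatGPT-User")


def count_crawler_hits(log_lines):
    """Count how many log lines mention each crawler token."""
    counts = Counter({token: 0 for token in CRAWLER_TOKENS})
    for line in log_lines:
        lowered = line.lower()
        for token in CRAWLER_TOKENS:
            if token.lower() in lowered:
                counts[token] += 1
    return counts


# Fabricated sample lines in a combined-log-like shape (illustrative only).
SAMPLE_LOG = [
    '203.0.113.5 - - "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0; compatible; OAI-SearchBot/1.0"',
    '203.0.113.6 - - "GET /docs HTTP/1.1" 200 2048 "-" "Mozilla/5.0; compatible; GPTBot/1.1"',
    '198.51.100.7 - - "GET /pricing HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]

if __name__ == "__main__":
    for token, hits in count_crawler_hits(SAMPLE_LOG).items():
        print(f"{token}: {hits}")
```

Running the tally before and after a robots.txt change, alongside the recorded change date, gives a rough way to see whether each crawler's behavior shifted.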
Audit checklist: confirm that the robots.txt rule is served at the root of the final domain, that robots.txt includes a Sitemap: https://your-domain.com/sitemap.xml line, and that your rules account for OAI-SearchBot, GPTBot, and normal search crawlers.
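The checklist items (root location, a Sitemap line, and rules for each crawler) can be spot-checked with a short script. This is a sketch only: `robots_url` and `audit_robots` are hypothetical helper names, `Googlebot` stands in for "normal search crawlers" as an assumption, and Python's `urllib.robotparser` (with `site_maps()`, available from Python 3.8) may not match every crawler's parsing.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser


def robots_url(site_url: str) -> str:
    """Return the root-level robots.txt URL for the host in `site_url`."""
    parts = urlsplit(site_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def audit_robots(robots_txt: str, agents) -> dict:
    """Report declared sitemaps and whether each agent may fetch the site root."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {
        # site_maps() returns None when no Sitemap line is present.
        "sitemaps": parser.site_maps() or [],
        "allowed": {agent: parser.can_fetch(agent, "/") for agent in agents},
    }


# Example file following the "search yes, training no" pattern.
EXAMPLE_ROBOTS = """\
User-agent: OAI-SearchBot
Allow: /

User-agent: GPTBot
Disallow: /

Sitemap: https://your-domain.com/sitemap.xml
"""

if __name__ == "__main__":
    print(robots_url("https://your-domain.com/docs/page?x=1"))
    print(audit_robots(EXAMPLE_ROBOTS, ("OAI-SearchBot", "GPTBot", "Googlebot")))
```

In practice you would fetch the text from the URL that `robots_url` returns for your final, post-redirect domain and pass it to `audit_robots`.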