Target keyword: Google-Extended vs Googlebot · verified 2026-06-24 HKT

Google-Extended vs Googlebot: avoid blocking the crawler you actually need

Googlebot and Google-Extended sound similar, but they do different jobs. Googlebot is the crawler token that affects Google Search and related search surfaces. Google-Extended is a separate robots.txt control for certain Gemini and Vertex AI use cases, not a Google Search ranking signal.

Fast recommendation: if you want Google Search, AI Overviews, AI Mode, Discover, Images, News, or other Google Search surfaces to keep seeing your public pages, do not block Googlebot. If your policy allows Search but limits certain Gemini/Vertex AI uses, use a separate Google-Extended rule.

Quick comparison

Token | Documented role | What blocking can affect | Default for public SEO
Googlebot | Google's main crawler token for Google Search and related products. | Google Search, Discover, Google Images, Google Video, Google News, and other Search features. | Allow, unless a URL should not be crawled or indexed.
Google-Extended | Standalone product token for managing whether content crawled by Google may be used for Gemini model training and grounding in Gemini Apps / Vertex AI. | Specified Gemini and Vertex AI uses. Google says it does not impact Search inclusion and is not a Search ranking signal. | Decide separately from Googlebot based on AI-use policy.
Google-InspectionTool | Search testing tools such as Rich Results Test and URL Inspection. | Search testing tools, not Google Search itself. | Usually allow for diagnostics.
GoogleOther | Generic crawler used by various Google product teams for public content fetches. | No specific product effect documented in the same way as Googlebot. | Review separately if you manage a strict crawler policy.

If you want Google Search traffic

Keep Googlebot able to crawl your public pages. Google's guidance for AI features says normal SEO fundamentals still apply to AI Overviews and AI Mode, and that a page must be indexed and eligible to appear with a snippet before it can show up as a supporting link.

User-agent: Googlebot
Allow: /

Sitemap: https://example.com/sitemap.xml

This is the safest default for public marketing sites, SaaS docs, local business pages, ecommerce category pages, and any content that depends on Google Search discovery.

If you want to opt out of Google-Extended

You can express a separate rule for Google-Extended while keeping Googlebot open. Confirm current Google documentation before publishing, especially if your business relies on search traffic.

User-agent: Googlebot
Allow: /

User-agent: Google-Extended
Disallow: /

Sitemap: https://example.com/sitemap.xml

This is the common "Search yes, Gemini/Vertex AI use no" pattern. It keeps Googlebot open for Search while using the standalone Google-Extended token for the Gemini/Vertex AI policy choice.
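
Before publishing, the split rule can be sanity-checked offline. The sketch below feeds the same rules to Python's standard-library urllib.robotparser and asks what each token may fetch. The page URL is a placeholder, and Python's parser is only an approximation of Google's: it does plain prefix matching and takes the first matching rule, so treat the result as a quick smoke test rather than proof of how Google reads the file.

```python
from urllib.robotparser import RobotFileParser

# The split policy from this section, verbatim.
SPLIT_POLICY = """\
User-agent: Googlebot
Allow: /

User-agent: Google-Extended
Disallow: /

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(SPLIT_POLICY.splitlines())

# Placeholder URL; substitute a page that matters to your funnel.
page = "https://example.com/blog/post"

googlebot_ok = parser.can_fetch("Googlebot", page)
extended_ok = parser.can_fetch("Google-Extended", page)

print("Googlebot allowed:", googlebot_ok)
print("Google-Extended allowed:", extended_ok)
```

If the first value is ever False for a page you want in Search, stop and review the file before it goes live.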

If you want a strict Google crawl opt-out

User-agent: Googlebot
Disallow: /

User-agent: Google-Extended
Disallow: /

Sitemap: https://example.com/sitemap.xml

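Use this only when you intentionally do not want normal Google Search crawling for the affected URLs. For most public traffic funnels, this is not the right default. These rules name only Googlebot and Google-Extended, so other tokens such as GoogleOther need their own rules if your policy covers them.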

Decision matrix

Goal | Googlebot | Google-Extended | Best next step
Maximize Google Search and AI-search visibility | Allow | Allow | Focus on crawlable HTML, internal links, sitemap, structured data that matches visible content, and useful pages.
Keep Google Search, limit certain Gemini/Vertex AI uses | Allow | Disallow | Publish the split rule, record the policy date, and monitor Googlebot crawling in server logs. Google-Extended has no user-agent string of its own, so confirm that rule by reviewing robots.txt itself.
Hide private or restricted content | Do not rely on robots.txt | Do not rely on robots.txt | Use authentication, password protection, access control, or noindex where appropriate.
Reduce crawl load on low-value URLs | Selective rules only | Separate AI-use decision | Block faceted, duplicate, or low-value paths carefully without blocking important public pages.
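
The matrix can be turned into a small generator so the published file always matches the recorded policy. This is a minimal sketch: the function name build_robots_txt, its parameters, and the /search path are all hypothetical, and it covers only the two tokens discussed here.

```python
def build_robots_txt(search_open, extended_open, sitemap_url, blocked_paths=()):
    """Render a robots.txt body for the Googlebot / Google-Extended choice."""
    lines = ["User-agent: Googlebot"]
    if search_open:
        # Selective Disallow lines cover low-value paths only;
        # everything else stays crawlable.
        lines += [f"Disallow: {path}" for path in blocked_paths]
        lines.append("Allow: /")
    else:
        lines.append("Disallow: /")

    # Google-Extended is decided separately from Googlebot.
    lines += ["", "User-agent: Google-Extended",
              "Allow: /" if extended_open else "Disallow: /"]
    lines += ["", f"Sitemap: {sitemap_url}"]
    return "\n".join(lines) + "\n"


# "Keep Google Search, limit certain Gemini/Vertex AI uses" row.
print(build_robots_txt(True, False, "https://example.com/sitemap.xml"))

# "Reduce crawl load" row, with a hypothetical low-value path.
print(build_robots_txt(True, True, "https://example.com/sitemap.xml",
                       blocked_paths=("/search",)))
```

Keeping the generator in version control alongside the policy date gives you a record of when each rule changed.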

Common mistakes

Blocking User-agent: * too broadly

A blanket Disallow: / under User-agent: * also applies to Googlebot unless the file has a dedicated Googlebot group, so it can cut off Search crawling along with the bots you meant to restrict.
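
The fallback behavior is easy to reproduce. The sketch below uses Python's urllib.robotparser, which approximates but does not replicate Google's parser. ExampleBot is a made-up crawler name and the URL is a placeholder.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, agent, url="https://example.com/pricing"):
    """Return True if `agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Only a wildcard group: every crawler, Googlebot included, falls back to it.
BLANKET = """\
User-agent: *
Disallow: /
"""

# A dedicated Googlebot group takes precedence over the wildcard group.
WITH_GOOGLEBOT_GROUP = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""

print(allowed(BLANKET, "Googlebot"))                # False
print(allowed(WITH_GOOGLEBOT_GROUP, "Googlebot"))   # True
print(allowed(WITH_GOOGLEBOT_GROUP, "ExampleBot"))  # False
```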

Expecting Google-Extended to control Search

Google says Google-Extended does not impact Search inclusion and is not a Search ranking signal.

Using robots.txt as privacy

Google says robots.txt is not a mechanism for keeping a page out of Google. Use authentication or noindex where appropriate, and remember that a noindex directive is only seen if the page stays crawlable.

Trusting only user-agent strings

User-agent strings can be spoofed. Verify serious Googlebot claims with reverse DNS or Google's published IP ranges.
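
A reverse DNS check can be sketched in a few lines of Python. It resolves the IP to a hostname, checks the domain, then resolves that hostname forward and confirms it maps back to the same IP. The function name and the injectable resolver parameters are hypothetical, the domain suffixes reflect my reading of Google's verification guidance and should be confirmed against the current documentation, and the forward lookup used here handles IPv4 only.

```python
import socket

# Domains assumed from Google's crawler verification guidance; confirm before relying on them.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com", ".googleusercontent.com")


def is_verified_google_ip(ip,
                          reverse=socket.gethostbyaddr,
                          forward=socket.gethostbyname_ex):
    """Reverse-resolve `ip`, check the domain, then forward-confirm it."""
    try:
        host = reverse(ip)[0]              # PTR lookup: IP -> hostname
        if not host.endswith(GOOGLE_SUFFIXES):
            return False
        return ip in forward(host)[2]      # A lookup: hostname -> IPv4 list
    except OSError:
        # No PTR record or failed lookup: treat as unverified.
        return False
```

Call it as is_verified_google_ip(ip) with an address taken from your access log. The resolver parameters exist so the logic can be tested without network access. Cache results if you check many requests, since each call performs two DNS lookups.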

Launch checklist

  1. Choose whether your policy is "Search open" or "Search open, Google-Extended restricted".
  2. Keep important public pages available to Googlebot.
  3. Add Sitemap: https://your-domain.com/sitemap.xml to robots.txt.
  4. Run the AI crawler robots.txt checker before publishing.
  5. After launch, monitor server logs for Googlebot, Google-InspectionTool, and suspicious spoofed user agents.
  6. Use Search Console to submit the sitemap and inspect important URLs once the final domain is live.
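
Step 5 of the checklist can start as a simple log summary. The sketch below assumes an access log in combined log format, with the client IP first and the user agent as the last quoted field; adjust the pattern if your server logs differently. The function name summarize and the sample line are illustrative. Google-Extended is left out on purpose, because it is a robots.txt control token with no user-agent string of its own and will not appear in logs.

```python
import re
from collections import Counter

# Assumes combined log format: IP first, user agent as the last quoted field.
LINE_RE = re.compile(r'^(\S+) .*"([^"]*)"\s*$')
TOKENS = ("Googlebot", "Google-InspectionTool", "GoogleOther")


def summarize(lines):
    """Count hits per Google token and collect the IPs claiming each one."""
    hits, ips = Counter(), {}
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        ip, agent = match.groups()
        for token in TOKENS:
            if token in agent:
                hits[token] += 1
                ips.setdefault(token, set()).add(ip)
                break
    return hits, ips


sample = [
    '66.249.66.1 - - [24/Jun/2026:10:00:00 +0800] "GET / HTTP/1.1" 200 512 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]
hits, ips = summarize(sample)
print(dict(hits), ips)
```

The collected IPs are only claims. Any address that matters should still be verified, since the user-agent string alone can be spoofed.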

Applied to this funnel: LLMs.txt Kit keeps Googlebot open in its recommended traffic-funnel policy, then treats Google-Extended as a separate AI-use control. That lets the funnel preserve Google Search discovery while documenting model-use preferences clearly.