What does agentics check?

Three dimensions in one report. (1) AI-agent readiness: robots.txt rules for GPTBot/ClaudeBot/PerplexityBot/Google-Extended, llms.txt, JSON-LD coverage, OpenGraph, SSR vs JS-rendered content, .well-known/{mcp,ai-plugin,security.txt}, OpenAPI discovery, sitemap-depth analysis per template. (2) Email deliverability: SPF, DKIM, DMARC, MX, STARTTLS, DNSBL (Spamhaus ZEN), MTA-STS, TLS-RPT, BIMI, DNSSEC. (3) LLM grounding: whether Claude, ChatGPT, and Gemini know your brand.

The basic technical audit (DNS, mail config, agent-readiness, SEO bones) is free. Pro signals — LLM grounding probes, MCP live-testing, force-refresh, permanent share URLs — are €2 for a single check or €9 for a 100-credit pack with no expiration.

How is this different from Cloudflare’s Agent Readiness Score?

Cloudflare’s tool (isitagentready.com) scores agent-readiness signals only. agentics additionally checks email deliverability, runs actual LLM grounding probes against Claude/ChatGPT/Gemini, and round-trip tests your MCP server using real OpenAI and Anthropic clients.

Do you store the domains I check?

We keep a short anonymous log of checked domains — just the domain, a timestamp, and whether the check was free or paid (no IP, no personal data) — to understand what people audit and to prevent abuse. Full reports are also cached server-side so a shareable link keeps working. We never sell this data or use it for marketing.

Common Crawl

CCBot: the crawler from Common Crawl

Open public web archive used by many LLM training datasets.

CCBot is the crawler behind Common Crawl, the open web archive that many public LLM training datasets are built on (filtered Common Crawl data was used in training GPT-3, LLaMA, and Falcon, among others). Disallowing CCBot keeps your pages out of future Common Crawl snapshots, which opts you out of much of the open-research LLM training pipeline at once.

Vendor

Common Crawl

robots.txt snippets

Allow

User-agent: CCBot
Allow: /

Disallow

User-agent: CCBot
Disallow: /
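
Before deploying either snippet, it can help to confirm how a standards-following parser reads it. The sketch below is a minimal illustration using Python's standard-library `urllib.robotparser`; the helper name `ccbot_allowed` and the `https://example.com/` URL are placeholders introduced here, not part of agentics or Common Crawl, and CCBot's own parser may differ in edge cases.

```python
from urllib.robotparser import RobotFileParser

# The two snippets from this page, as they would appear in /robots.txt.
ALLOW_RULES = """\
User-agent: CCBot
Allow: /
"""

DISALLOW_RULES = """\
User-agent: CCBot
Disallow: /
"""


def ccbot_allowed(robots_txt: str, url: str = "https://example.com/") -> bool:
    """Return True if the given robots.txt text lets CCBot fetch `url`.

    `url` is a placeholder; substitute a page from your own site.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("CCBot", url)


if __name__ == "__main__":
    print(ccbot_allowed(ALLOW_RULES))     # True
    print(ccbot_allowed(DISALLOW_RULES))  # False
```

A robots.txt that never mentions CCBot and has no `User-agent: *` group is treated as allowing it, so the Allow snippet mainly documents intent.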

FAQ

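What is CCBot?

CCBot is the web crawler operated by Common Crawl, an open public web archive used by many LLM training datasets.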

What is the user-agent string for CCBot?

CCBot identifies itself with the user-agent token "CCBot". You can match it in robots.txt with "User-agent: CCBot" and match the same token in nginx or log-analyzer rules.
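
As a rough illustration of matching that token in a log analyzer, the sketch below counts requests whose User-Agent contains "CCBot". It assumes access logs in the common "combined" format, where the User-Agent is the last quoted field. The sample log lines and the function names `is_ccbot` and `count_ccbot_hits` are made up for this example. A User-Agent header can be spoofed, so this counts claimed CCBot traffic rather than verified traffic.

```python
import re

# Match "CCBot" as a standalone token, case-insensitively, so that
# "CCBot/2.0" matches but an unrelated name such as "NotCCBot" does not.
CCBOT_TOKEN = re.compile(r"\bCCBot\b", re.IGNORECASE)


def is_ccbot(user_agent: str) -> bool:
    """Return True if the User-Agent header claims to be CCBot."""
    return bool(CCBOT_TOKEN.search(user_agent))


def count_ccbot_hits(log_lines) -> int:
    """Count log lines whose last quoted field (the User-Agent) names CCBot.

    Assumes the "combined" access-log format; adjust the parsing if your
    server writes a different layout.
    """
    hits = 0
    for line in log_lines:
        # Splitting from the right on '"' isolates the final quoted field.
        parts = line.rsplit('"', 2)
        if len(parts) == 3 and is_ccbot(parts[1]):
            hits += 1
    return hits


# Hypothetical sample lines, for illustration only.
SAMPLE_LOG = [
    '203.0.113.7 - - [01/Jan/2025:00:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "CCBot/2.0 (https://commoncrawl.org/faq/)"',
    '203.0.113.8 - - [01/Jan/2025:00:00:02 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
]

if __name__ == "__main__":
    print(count_ccbot_hits(SAMPLE_LOG))  # 1
```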

How do I allow CCBot in robots.txt?

Add the following block to your /robots.txt to explicitly grant CCBot access:

User-agent: CCBot
Allow: /

How do I block CCBot in robots.txt?

Add the following block to your /robots.txt. Well-behaved bots honor it, but not every crawler does:

User-agent: CCBot
Disallow: /

How can I check whether my site is ready for CCBot?

Run a free check at https://agentics.page — it audits whether your robots.txt allows the right bots, whether you publish llms.txt and JSON-LD structured data, whether your content is server-rendered, and whether CCBot can actually consume your site.

Is your domain ready for CCBot?

agentics checks whether your robots.txt allows the right bots, your llms.txt is in shape, your JSON-LD and SSR content are visible, and whether CCBot can actually use your domain.

Related agents

GPTBot (OpenAI)

Crawls the web for training ChatGPT models.

OAI-SearchBot (OpenAI)

Indexes pages for ChatGPT Search results.

ClaudeBot (Anthropic)

Crawler used to train Claude models.

anthropic-ai (Anthropic)

Legacy user-agent still used in places.