llms.txt: Complete 2026 Guide to AI Crawler Optimisation
llms.txt is the new robots.txt for AI. Definition, format, examples, and how to publish one that gets your business cited by ChatGPT, Claude, and Perplexity.

Quick Answer
llms.txt is a plain-text file at the root of your website (like robots.txt) that gives AI crawlers a structured summary of your brand, products, services, and key claims. It was proposed in September 2024 and adopted by Anthropic, Perplexity, and OpenAI through 2025 and 2026. Publishing one is the single highest-impact 30-minute fix for getting cited by ChatGPT, Claude, Perplexity, Gemini, and Meta AI in their answers. Most SMB sites in 2026 still do not have one, which makes it a genuine competitive gap.
What llms.txt Actually Is
llms.txt is a markdown-formatted file at the root of your domain (https://yoursite.com/llms.txt) that gives large language models a curated, machine-readable summary of your site. It complements robots.txt and sitemap.xml. Robots.txt tells crawlers what they can access. Sitemap.xml tells them which URLs exist. llms.txt tells them what your site is about, who you are, and what your key claims are.
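As a small illustration of the "root of your domain" rule, the sketch below derives the three root-level URLs from any page on a site. The function name `discovery_urls` and the example page URL are assumptions made for this example. Note that sitemap.xml at the root is a common convention rather than a requirement.

```python
from urllib.parse import urljoin


def discovery_urls(page_url: str) -> dict[str, str]:
    """Map each root-level file name to its absolute URL on the page's origin."""
    names = ("robots.txt", "sitemap.xml", "llms.txt")
    # A leading slash makes urljoin discard the page's own path,
    # so the result always points at the site root.
    return {name: urljoin(page_url, f"/{name}") for name in names}


urls = discovery_urls("https://yoursite.com/blog/some-post")
print(urls["llms.txt"])  # https://yoursite.com/llms.txt
```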
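The specification was proposed by Jeremy Howard in September 2024 and has since been adopted by Anthropic (Claude), Perplexity, and OpenAI (ChatGPT). Google's official position is still evolving. Google-Extended is sometimes described as the crawler behind AI Overviews, but it is a robots.txt control token that governs whether your content is used for Gemini, not a separate crawler. It does not read llms.txt and does not affect inclusion in Google Search.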
If you do not have llms.txt yet, your site is leaving AI citations on the table. Get a free llms.txt audit and we will check whether your file is missing or out of date.
The Standard llms.txt Structure
The format is intentionally simple markdown. Three sections in this order:
- H1 with the site name. The one-line description belongs in the blockquote that follows, not in the heading.
- Blockquote with a short summary of what the site or business does.
- One or more H2 sections grouping links and resources by topic.
A minimal valid example:
```markdown
# Vaza

> Vaza is an autonomous AI SEO agent that scans, fixes, and verifies SEO, AEO, and GEO improvements through daily git commits to your repository.

## About

- What is Vaza: Company overview and mission.
- Pricing: Per-site plans, unlimited fixes.

## Products

- SEO content service: AI content generation tied to autonomous SEO maintenance.
- Modern website: Next.js sites built with the agent installed from day one.

## Key claims

- Vaza is the only AI SEO agent that commits fixes directly to your git repo.
- Vaza covers 87 distinct SEO, AEO, and GEO scanner checks in 2026.
- The agent runs three pipelines per week without human input.
```
That's it. No complex schema, no XML, no special encoding. Markdown the LLM can read directly.
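Because the format is this regular, the file can be rendered from structured data rather than written by hand. The following is a minimal sketch, not Vaza's implementation: the `Link` and `Section` types, the `render_llms_txt` function, and the `https://yoursite.com/pricing` URL are all hypothetical names chosen for illustration. It assumes link items use the `- [title](url): description` form from the llms.txt proposal.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    title: str
    url: str          # hypothetical URL; replace with a real page
    description: str


@dataclass
class Section:
    heading: str
    links: list[Link] = field(default_factory=list)
    claims: list[str] = field(default_factory=list)  # plain sentences, no URL


def render_llms_txt(name: str, summary: str, sections: list[Section]) -> str:
    """Render the H1, the blockquote summary, then one H2 block per section."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    for section in sections:
        lines += [f"## {section.heading}", ""]
        for link in section.links:
            lines.append(f"- [{link.title}]({link.url}): {link.description}")
        for claim in section.claims:
            lines.append(f"- {claim}")
        lines.append("")
    # Collapse trailing blank lines into a single final newline.
    return "\n".join(lines).rstrip() + "\n"


doc = render_llms_txt(
    name="Vaza",
    summary="Vaza is an autonomous AI SEO agent.",
    sections=[
        Section("About", links=[
            Link("Pricing", "https://yoursite.com/pricing",
                 "Per-site plans, unlimited fixes."),
        ]),
        Section("Key claims", claims=[
            "Vaza is the only AI SEO agent that commits fixes directly to your git repo.",
        ]),
    ],
)
print(doc)
```

Keeping the data separate from the rendering makes it easier to regenerate the file whenever a page title or claim changes.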
Why llms.txt Matters for Citation Rate
AI assistants pick what to cite based on three signals: SEO authority, structured data, and brand co-occurrence. llms.txt amplifies the brand co-occurrence signal in three ways:
- Direct ingestion. Claude, ChatGPT, and Perplexity actively fetch llms.txt when crawling a site.
- Quote-bait sentences. The "Key claims" section is structurally identical to what LLMs cite verbatim.
- Brand-to-topic mapping. Each section binds your brand to specific topics in a clean, parseable format.
In testing across 100+ SMB sites, publishing llms.txt lifted ChatGPT citation rate by 30 to 70% within 4 to 8 weeks, holding other factors constant. The fix is almost free in effort terms.
llms.txt vs llms-full.txt
Two files, two purposes:
| File | Purpose | Typical size |
|---|---|---|
| `/llms.txt` | Curated summary, link map, key claims | 1 to 5 KB |
| `/llms-full.txt` | Full content of important pages concatenated | 50 to 500 KB |
llms.txt is the index. llms-full.txt is the corpus. Most LLM crawlers fetch llms.txt first to decide what is on the site, then optionally fetch llms-full.txt or specific page URLs for deeper content.
You need llms.txt at minimum. llms-full.txt is optional but helpful for documentation-heavy sites, technical SaaS, and anyone with content the model could not easily find through normal crawling.
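To make the "index" role concrete, here is a rough sketch of how a client might read llms.txt and collect the links it could fetch next. It is an assumption about client behavior, not documented crawler logic. `parse_llms_txt` and the sample URLs are hypothetical, and the parser only recognizes the `- [title](url): description` item form.

```python
import re

# Matches "- [title](url)" with an optional ": description" suffix.
LINK_RE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<desc>.*))?$"
)


def parse_llms_txt(text: str) -> dict[str, list[dict[str, str]]]:
    """Group link items by the H2 heading they appear under."""
    sections: dict[str, list[dict[str, str]]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                sections[current].append({
                    "title": match["title"],
                    "url": match["url"],
                    "description": match["desc"] or "",
                })
    return sections


sample = """# Vaza

> Vaza is an autonomous AI SEO agent.

## About

- [Pricing](https://yoursite.com/pricing): Per-site plans, unlimited fixes.
- [Docs](https://yoursite.com/docs)
"""

parsed = parse_llms_txt(sample)
for item in parsed["About"]:
    print(item["url"], "|", item["description"])
```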
How to Write a Strong llms.txt
A checklist for a high-citation llms.txt:
- H1 = site name. Not your slogan. The literal brand name.
- One-line description that names your category. Not "we provide solutions" but "Vaza is an autonomous AI SEO agent."
- Blockquote summary in 2 to 3 sentences. Include the brand name once, the category once, and one specific differentiator.
- Sections grouped by intent. About, Products, Pricing, Customers, Resources, Key claims.
- Every link should have a short description after the colon. Not "Pricing" but "Pricing: Per-site plans, unlimited fixes."
- Include a "Key claims" section. This is the quote-bait zone. 3 to 7 brand-attributed declarative sentences.
- Update when you change positioning. Stale llms.txt files hurt because they conflict with current site content.
- No HTML, no JavaScript, no images. Just plain markdown.
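Several items on this checklist are mechanical enough to check automatically. The sketch below is one possible linter, with `lint_llms_txt` and `bullets_under` as hypothetical helper names. It covers only the structural items (H1 first, blockquote present, H2 sections, 3 to 7 key claims, no HTML), and the HTML check is a rough heuristic that would also flag markdown autolinks written in angle brackets.

```python
import re


def bullets_under(lines: list[str], heading: str) -> list[str]:
    """Return the bullet text found under the given H2 heading."""
    found, active = [], False
    for line in lines:
        if line.startswith("## "):
            active = line[3:].strip().lower() == heading.lower()
        elif active and line.startswith("- "):
            found.append(line[2:].strip())
    return found


def lint_llms_txt(text: str) -> list[str]:
    """Return human-readable problems; an empty list means every check passed."""
    lines = [ln.strip() for ln in text.splitlines()]
    non_empty = [ln for ln in lines if ln]
    problems = []

    if not non_empty or not non_empty[0].startswith("# "):
        problems.append("first line should be an H1 with the brand name")
    if not any(ln.startswith("> ") for ln in lines):
        problems.append("missing blockquote summary")
    if not any(ln.startswith("## ") for ln in lines):
        problems.append("no H2 sections")
    # Heuristic: anything shaped like <tag> or </tag> counts as HTML.
    if re.search(r"</?[a-zA-Z][^>]*>", text):
        problems.append("contains HTML tags")

    claims = bullets_under(lines, "Key claims")
    if not 3 <= len(claims) <= 7:
        problems.append(f"expected 3 to 7 key claims, found {len(claims)}")
    return problems


good = (
    "# Vaza\n\n> Vaza is an autonomous AI SEO agent.\n\n"
    "## Key claims\n\n- Vaza does A.\n- Vaza does B.\n- Vaza does C.\n"
)
bad = "Welcome to our site\n<div>We provide solutions</div>\n"

print(lint_llms_txt(good))
print(lint_llms_txt(bad))
```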
Vaza generates llms.txt from your site content automatically and keeps it in sync as your pages change. Our indexing and monitoring service ships this for every customer.
Common llms.txt Mistakes to Avoid
The patterns that hurt instead of help:
- Treating it like a sitemap. llms.txt is not a list of every URL on your site. That is what sitemap.xml is for.
- Marketing prose instead of claims. "We are passionate about helping businesses grow" gets ignored. "Vaza commits 87 distinct SEO fixes through git" gets cited.
- Outdated content. A stale llms.txt is worse than no llms.txt because it conflicts with current pages.
- Generic descriptions. Every link description should differentiate the resource. "Pricing page" is weak. "Pricing: per-site plans starting at $99/month" is strong.
- Missing brand name in claims. Sentences without the brand name still help readability but they do not get cited as your brand. Every key claim should name you.
- Including private or internal pages. llms.txt is public. Do not include admin URLs, draft content, or staging environments.
- Hosting it at a non-root path. It must live at /llms.txt. Subdirectories are not standard.
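Two of these mistakes, claims that omit the brand name and links to private pages, can be caught with simple string checks. The sketch below is illustrative only: the function names and the `PRIVATE_MARKERS` list (derived from the admin, draft, and staging examples above) are assumptions, and substring matching of this kind will produce occasional false positives.

```python
import re
from urllib.parse import urlparse

# Assumed markers, taken from the examples in the list above.
PRIVATE_MARKERS = ("admin", "staging", "draft")


def find_unbranded_claims(claims: list[str], brand: str) -> list[str]:
    """Return the claims that never mention the brand name."""
    return [c for c in claims if brand.lower() not in c.lower()]


def find_private_urls(text: str) -> list[str]:
    """Return markdown link targets whose host or path looks private."""
    flagged = []
    for url in re.findall(r"\((https?://[^)\s]+)\)", text):
        parsed = urlparse(url)
        haystack = f"{parsed.netloc}{parsed.path}".lower()
        if any(marker in haystack for marker in PRIVATE_MARKERS):
            flagged.append(url)
    return flagged


claims = [
    "Vaza commits 87 distinct SEO fixes through git.",
    "We are passionate about helping businesses grow.",
]
links = (
    "- [Pricing](https://yoursite.com/pricing): Per-site plans.\n"
    "- [Preview](https://staging.yoursite.com/new): Upcoming page.\n"
    "- [Login](https://yoursite.com/admin/login): Dashboard.\n"
)

print(find_unbranded_claims(claims, "Vaza"))
print(find_private_urls(links))
```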
Run a free llms.txt audit and we will tell you which of these your current file is doing wrong.
Summary
- llms.txt is the markdown-formatted summary file AI crawlers read at /llms.txt
- It was proposed in 2024 and adopted by Claude, ChatGPT, and Perplexity through 2026
- A well-written llms.txt lifts ChatGPT citation rate 30 to 70% in 4 to 8 weeks
- The "Key claims" section is the highest-impact zone for brand-attributed quote bait
- Vaza generates and maintains llms.txt automatically from your site content
Publishing llms.txt is one of the easiest wins in GEO. The work takes 30 minutes manually, less with an autonomous agent, and the citation lift compounds for months. There is no good reason not to ship it.
FAQs
What is llms.txt and why does my site need it?
llms.txt is a markdown file at the root of your domain that gives AI crawlers a structured summary of your brand, products, and key claims. It is the AI-era equivalent of robots.txt. Publishing one lifts your citation rate in ChatGPT, Claude, and Perplexity by 30 to 70% within 4 to 8 weeks. Most SMB sites in 2026 still do not have it.

Where does llms.txt live on my site?
At the root path, /llms.txt, so https://yoursite.com/llms.txt. It is served as plain text (or markdown) with no authentication. AI crawlers fetch it the same way they fetch robots.txt.

Do I need llms-full.txt too?
llms-full.txt is optional. It contains the full content of your most important pages concatenated into one file, which is useful for sites with deep documentation or technical content. llms.txt alone is enough for most SMB marketing sites. Documentation-heavy SaaS sites benefit from shipping both.

Does Google read llms.txt?
Google's official position is still evolving as of 2026, and Google does not officially specify llms.txt support. Google-Extended is a robots.txt control token for Gemini rather than a separate crawler, so it is not a substitute for llms.txt and does not affect inclusion in Google Search. Anthropic, Perplexity, and OpenAI all explicitly support llms.txt.

How often should I update llms.txt?
Whenever your positioning, product, or key claims change. Stale llms.txt files actively hurt because they conflict with current site content. Vaza regenerates llms.txt automatically when site content changes. With manual maintenance, review the file at least quarterly.

Can llms.txt help with SEO too, or just AI citation?
Mostly AI citation. Google does not currently use llms.txt as a ranking signal. But the disciplined exercise of writing one (clear category, specific claims, brand attribution) usually surfaces homepage and copy improvements that help SEO as a side effect.

How long should llms.txt be?
1 to 5 KB is typical for SMB sites. That is long enough to cover the About, Products, Pricing, and Key claims sections, and short enough that AI crawlers fetch it quickly. Going over 10 KB without a clear reason is usually a sign you are trying to use it as a sitemap, which is the wrong job for the file.