Artiql

llms.txt Explained: Do AI Crawlers Actually Use It?

Quick answer: llms.txt is a proposed convention: a plain Markdown file at your domain root that gives AI models a curated map of your most important pages. Introduced in 2024, it aims to help large language models navigate your site. But as of 2026, the major AI crawlers largely ignore it. Structured data, clear entities and genuinely useful content remain the real levers for earning AI citations.

What is llms.txt, and where did it come from?

llms.txt is a plain-text Markdown file you place at the root of your domain, at yourdomain.com/llms.txt. It was proposed in late 2024 by Jeremy Howard, co-founder of Answer.AI, as an open convention to help large language models read and use a website more easily. Think of it as a curated index: instead of leaving an AI to crawl every menu, script and layout, you hand it a tidy summary of what matters most and where to find it.

The motivation is genuinely technical. AI models work within limited context windows, and pulling clean meaning out of cluttered HTML — navigation, ads, cookie banners, JavaScript — is hard and wasteful. By offering a concise, machine-friendly map in Markdown, llms.txt is meant to reduce that friction and point models toward your canonical, high-value content rather than an outdated blog post buried three clicks deep.

It's often compared to robots.txt or an XML sitemap, and the analogy is roughly fair in spirit. But the intent is different. robots.txt tells crawlers what they may access; llms.txt tries to tell models what's worth reading and how your site is organized. It's a suggestion about meaning, not a rule about permission.
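To make the permission side of that contrast concrete, here is a minimal sketch using Python's standard urllib.robotparser module. The rules in ROBOTS_TXT and the example.com URLs are illustrative assumptions, not taken from any real site, and may_fetch is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one AI crawler from /private/, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /private/

User-agent: *
Allow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(may_fetch(ROBOTS_TXT, "GPTBot", "https://example.com/private/report"))
print(may_fetch(ROBOTS_TXT, "GPTBot", "https://example.com/blog/"))
```

robots.txt answers a yes-or-no access question like this one. llms.txt has no equivalent check, because it expresses no permissions at all.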

What does an llms.txt file actually look like?

The format is deliberately simple, which is part of its appeal. The official specification asks for an H1 with your project or site name — the only strictly required element — followed by a blockquote summary that captures the essentials in a sentence or two. After that you can add a few paragraphs or lists for context, then organize links into H2 sections, each link written as a standard Markdown link with an optional note describing the page.

Order matters more than you might expect. Models read top-down and tend to weight earlier sections more heavily, so lead with your most authoritative, canonical pages. There's also a special convention: a section titled "Optional" signals content that can be safely skipped when a shorter context is needed. Where possible, the spec suggests linking to clean Markdown versions of pages, since they parse more reliably than full HTML.
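The structure described above can be sketched as a small generator. This is an illustration under stated assumptions rather than a reference implementation: the Link and Section classes, the build_llms_txt function and every example.com URL are hypothetical, and moving the "Optional" section to the end is a choice made for this sketch, not something the article attributes to the specification.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    title: str
    url: str
    note: str = ""  # optional description shown after the link

    def to_markdown(self) -> str:
        line = f"- [{self.title}]({self.url})"
        return f"{line}: {self.note}" if self.note else line


@dataclass
class Section:
    heading: str
    links: list[Link] = field(default_factory=list)


def build_llms_txt(name: str, summary: str, sections: list[Section], context: str = "") -> str:
    """Render H1, blockquote summary, optional context, then H2 link sections."""
    # Assumption of this sketch: keep any "Optional" section last (stable sort).
    ordered = sorted(sections, key=lambda s: s.heading == "Optional")
    parts = [f"# {name}", f"> {summary}"]
    if context:
        parts.append(context)
    for section in ordered:
        lines = [f"## {section.heading}", ""] + [link.to_markdown() for link in section.links]
        parts.append("\n".join(lines))
    return "\n\n".join(parts) + "\n"


example = build_llms_txt(
    name="Example Docs",
    summary="Hypothetical API documentation for an example product.",
    sections=[
        Section("Optional", [Link("Changelog", "https://example.com/changelog.md")]),
        Section("Docs", [Link("Quickstart", "https://example.com/quickstart.md", "Install and first request")]),
    ],
)
print(example)
```

Printing example yields a file that opens with the H1 and blockquote, lists the Docs section first, and ends with the Optional section.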

Keep it lean — most useful files stay well under 500 words. A companion file, llms-full.txt, can hold the complete documentation library, while llms.txt itself stays a slim navigational index. After publishing, confirm it loads as plain Markdown at your root with a 200 status, and check that every link resolves cleanly.
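The post-publish checks can be scripted. The sketch below is one possible approach using only the Python standard library. The function names (extract_links, http_status, audit) are hypothetical, the 500-word ceiling comes from the guideline above, and the link pattern only recognizes absolute http(s) links written in standard Markdown form.

```python
import re
import urllib.error
import urllib.request
from typing import Callable

# Matches [title](https://...) and captures the title and the URL.
LINK_PATTERN = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")


def extract_links(llms_txt: str) -> list[str]:
    """Return every absolute URL that appears as a Markdown link."""
    return [url for _title, url in LINK_PATTERN.findall(llms_txt)]


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the final HTTP status for url.

    urlopen follows redirects, so this sketch does not detect redirect chains.
    Some servers reject HEAD requests; switching to GET is a reasonable fallback.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


def audit(llms_txt: str, fetch: Callable[[str], int] = http_status, max_words: int = 500) -> list[str]:
    """Collect human-readable problems; an empty list means every check passed."""
    problems = []
    if not llms_txt.lstrip().startswith("# "):
        problems.append("file does not start with an H1 title")
    words = len(llms_txt.split())
    if words > max_words:
        problems.append(f"file has {words} words, above the {max_words}-word guideline")
    for url in extract_links(llms_txt):
        status = fetch(url)
        if status != 200:
            problems.append(f"{url} returned {status}")
    return problems
```

Calling audit(text) with the default fetcher makes real HTTP requests. Passing your own fetch function, for example a dictionary lookup, lets you dry-run the checks offline.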

Element	Required?	Purpose
H1 title	Yes	Names your site, project, or brand
Blockquote summary	Recommended	One- or two-sentence overview of what the site is
Context paragraphs/lists	Optional	Explain how the site is organized (no extra headings)
H2 sections with links	Recommended	Group key pages as Markdown links with short notes
"Optional" section	Optional	Marks links models may skip for a shorter context

The core structure of an llms.txt file, following the official specification.

Do ChatGPT, Claude and Perplexity actually use llms.txt?

This is where expectations meet reality. Despite the buzz in marketing circles, independent server-log studies through 2025 found that the major AI crawlers essentially don't request the file. In one analysis spanning mid-August to late October 2025, the llms.txt page received zero visits from GPTBot, PerplexityBot, ClaudeBot or Google's AI crawler. A separate audit of CDN logs across 1,000 domains saw no LLM-specific bots fetching it at all, while Google's ordinary desktop crawler accounted for the overwhelming majority of hits.

A 90-day study reinforced the pattern: even with a correctly implemented file, dedicated AI bots used the llms.txt entry point in roughly 0.1% of their visits. Adoption on the publishing side is tiny too: only about 950 domains had published one by mid-2025. No major provider has confirmed that it reads these files when generating answers, though a handful, including Anthropic, have published their own, which hints at openness rather than active use.
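You can run the same kind of check against your own access logs. The following is a rough sketch, assuming logs in the common Combined Log Format. The sample lines, IP addresses and simplified User-Agent strings are invented for illustration, and llms_txt_share is a hypothetical helper name. Real crawler User-Agent strings vary, so treat the token list as a starting point.

```python
import re
from collections import Counter

# Substrings that identify the crawlers named above in a User-Agent header.
AI_BOT_TOKENS = ("GPTBot", "ClaudeBot", "PerplexityBot")

# Combined Log Format: ip - - [time] "METHOD path HTTP/x" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def llms_txt_share(log_lines):
    """Return (hits on /llms.txt per AI bot, total requests from AI bots)."""
    hits = Counter()
    total = 0
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that are not in the expected format
        bot = next((token for token in AI_BOT_TOKENS if token in match["agent"]), None)
        if bot is None:
            continue  # not one of the AI crawlers we are tracking
        total += 1
        if match["path"].split("?")[0] == "/llms.txt":
            hits[bot] += 1
    return hits, total


# Invented sample lines; the last one is an ordinary browser and is ignored.
SAMPLE = [
    '203.0.113.5 - - [01/Sep/2025:10:00:00 +0000] "GET /blog/post HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.6 - - [01/Sep/2025:10:01:00 +0000] "GET /llms.txt HTTP/1.1" 200 812 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '203.0.113.7 - - [01/Sep/2025:10:02:00 +0000] "GET /llms.txt HTTP/1.1" 200 812 "-" "Mozilla/5.0 (Windows NT 10.0) Chrome/120.0"',
]

hits, total = llms_txt_share(SAMPLE)
print(dict(hits), total)
```

Dividing the summed hits by the total gives the share of AI bot visits that touched llms.txt, which is the figure the studies above report.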

The honest summary: implementing llms.txt is harmless and low-effort, but there is no solid evidence the big answer engines treat it as a meaningful signal today.

~0.1% of AI bot visits used the llms.txt entry point in a 90-day study

0 crawler hits from GPTBot, ClaudeBot or PerplexityBot in a multi-month log test

~951 domains had published an llms.txt file as of mid-2025

What 2025 server-log research found about AI crawler behavior toward llms.txt.

Is llms.txt a Google ranking or GEO lever?

No — and this is the most common misunderstanding. Many marketers reframed llms.txt as a fresh GEO trick, a way to muscle into AI Overviews or get cited more often by ChatGPT and Perplexity. That was never the proposal's purpose. It was designed to help models read your site, not to manipulate rankings or citations. Treating it as a visibility hack mostly leads to disappointment.

Google has been unusually blunt here. In mid-2025, Gary Illyes stated that ranking in AI Overviews just needs normal SEO and that Google does not support llms.txt and has no plans to. John Mueller went further, comparing it to the long-ignored keywords meta tag — a signal search engines abandoned precisely because the site owner controls it, which makes it trivial to game and therefore untrustworthy.

That comparison is the crux. Any signal a publisher fully controls and can stuff in their favor tends to get discounted. Real authority in AI answers is earned through content quality, third-party mentions and verifiable expertise — not through a self-declared file.

What structured data actually influences AI citations today?

If llms.txt isn't moving the needle, what is? The strongest evidence points to structured data, specifically JSON-LD schema markup that describes your content in machine-readable terms. Unlike a self-declared index, schema annotates content the user can actually see, which makes it harder to fake and more trustworthy to the engines that consume it. At least one platform has confirmed it uses schema: a Microsoft engineering lead stated publicly in early 2025 that schema markup helps its models understand content. Independent tests have also shown that ChatGPT, Claude, Perplexity and Gemini all parse schema when they fetch a page.

The schema types with the clearest citation evidence are FAQPage, Article, Product/Offer, Organization and Person. Just as important is connecting them into a coherent entity graph using @id and sameAs, so engines understand that the author, the company and the product are related, identifiable things. This entity clarity — not keyword density — is what answer engines increasingly reward.
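As a minimal sketch of such an entity graph, the snippet below builds JSON-LD in which the article, its author and the publishing organization reference one another through @id, with sameAs pointing at external profiles. Every name, URL and date here is a hypothetical placeholder. The property names (worksFor, author, publisher, sameAs) are standard schema.org vocabulary.

```python
import json

SITE = "https://example.com"  # hypothetical domain

organization = {
    "@type": "Organization",
    "@id": f"{SITE}/#organization",
    "name": "Example Co",
    "url": SITE,
    "sameAs": ["https://www.linkedin.com/company/example-co"],
}
author = {
    "@type": "Person",
    "@id": f"{SITE}/#jane-doe",
    "name": "Jane Doe",
    "worksFor": {"@id": organization["@id"]},  # links the person to the company
    "sameAs": ["https://www.linkedin.com/in/jane-doe-example"],
}
article = {
    "@type": "Article",
    "@id": f"{SITE}/blog/llms-txt/#article",
    "headline": "llms.txt Explained",
    "datePublished": "2026-01-15",  # placeholder dates
    "dateModified": "2026-01-15",
    "author": {"@id": author["@id"]},
    "publisher": {"@id": organization["@id"]},
}

# One @graph holds all three nodes so their @id references resolve in place.
graph = {"@context": "https://schema.org", "@graph": [organization, author, article]}
json_ld = json.dumps(graph, indent=2)
script_tag = f'<script type="application/ld+json">\n{json_ld}\n</script>'
print(script_tag)
```

Referencing nodes by @id instead of repeating them keeps each entity defined once, so the author and publisher stay identical wherever they are cited.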

Schema is context, not a magic switch. It helps AI parse and attribute your content accurately, but it can't manufacture credibility. Pair it with concise answer capsules of 40–60 words under question-style headings, visible authorship and dates, and anchor IDs so engines can cite specific sections.

Schema type	What it marks up	Why AI engines value it
FAQPage	Question-and-answer blocks	Maps directly to how answer engines lift and quote responses
Article	Headline, author, published/modified dates	Signals freshness, authorship and E-E-A-T
Product / Offer	Items, pricing, availability	Powers product answers and comparison results
Organization / Person	Brand and author identity	Builds the entity graph engines use to attribute content

Schema types with the strongest evidence for earning AI citations, using JSON-LD.
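For the FAQPage row, a short sketch shows how question-and-answer pairs that are already visible on the page might be turned into JSON-LD, along with a check against the 40 to 60 word answer-capsule guideline mentioned earlier. The function names faq_page and within_capsule_length are hypothetical, and the sample answer is shortened for illustration.

```python
import json


def faq_page(pairs):
    """Build FAQPage JSON-LD from (question, answer) pairs shown on the page."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def within_capsule_length(answer: str, low: int = 40, high: int = 60) -> bool:
    """True if the answer's whitespace-separated word count is within [low, high]."""
    return low <= len(answer.split()) <= high


pairs = [
    (
        "Is llms.txt the same as robots.txt?",
        "No. robots.txt is about permission, while llms.txt is about meaning.",
    ),
]
print(json.dumps(faq_page(pairs), indent=2))
for question, answer in pairs:
    if not within_capsule_length(answer):
        print(f"Outside the 40-60 word guideline: {question}")
```

Word count is a crude proxy for a good capsule, so treat the check as a prompt for editing rather than a pass or fail gate.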

Should you publish an llms.txt file anyway?

For most sites, yes — with realistic expectations. The file is cheap to create, won't harm your SEO, and gives you a tidy place to declare your most important content. There's also a real, if narrow, use case today: if your audience includes developers using tools like Cursor, Windsurf or Claude Code with your API documentation, a well-structured llms.txt acts as a genuine entry point. That's close to Howard's original intent, and it works.

There's a modest future-proofing argument too. Ordinary search crawlers index llms.txt like any other public URL, and those same indexes feed AI-driven search, so a clean, indexable file occasionally ranks for organic terms on its own. And if providers do adopt the convention more seriously, you'll already be ready. The risk isn't the file itself. It's mistaking it for the work that actually drives visibility.

So treat it as a low-priority hygiene item, not a strategy. Publish it, keep it accurate, and spend your real effort on content, schema and entities.

Pros

+ Low effort and harmless to your existing SEO
+ Genuinely useful for developer tools and API documentation
+ Cleanly declares your most important, canonical pages
+ Future-proofs your site if adoption grows

Cons

− Major AI crawlers largely ignore it today
− Not a ranking or GEO lever: Google won't use it
− Adds a file to maintain and keep accurate
− Can create a false sense of real progress

Weighing whether to add an llms.txt file in 2026.

How should you actually optimize for AI answer engines?

The durable playbook is less exotic than a single magic file, and that's good news. Start with genuinely useful content that answers real questions directly — clear, well-structured pages with question-style headings and concise summaries that an engine can lift and quote. Layer JSON-LD schema on top to make that content machine-readable, and connect your entities so engines know who wrote it and what brand stands behind it. Then earn trust the old-fashioned way, through quality and credible third-party mentions.

Crucially, remember that Google rankings and AI citations are separate games. You can rank first and still go uncited, because answer engines weigh structure, entity clarity and trustworthiness differently. Building for both means writing for humans, marking up for machines, and keeping everything fresh and accurate across the languages your audience actually uses.

This is exactly what Artiql automates: multilingual SEO- and GEO-ready articles with built-in structure and schema, an AI video per piece for YouTube and social, a review queue, and publishing to your own domain. If you'd like to see it work on your content, book a demo and we'll walk you through it.

Frequently asked questions

Is llms.txt the same as robots.txt?

No. robots.txt controls which crawlers may access which parts of your site — it's about permission. llms.txt is about meaning: it offers AI models a curated Markdown map of your most important content and how it's organized. robots.txt is a long-established, widely respected standard, while llms.txt remains an emerging proposal that major AI crawlers largely ignore today. They serve different purposes and don't replace each other.

Where do I put the llms.txt file?

Place it at the root of your domain so it's reachable at yourdomain.com/llms.txt, exactly like robots.txt. The filename must be precisely llms.txt — not llm.txt — for compatibility. Write it in plain Markdown using a plain-text editor, then verify it loads with a 200 status and displays as clean Markdown. Check that every link resolves without 404s or redirect chains before you consider it published.

Does Google use llms.txt for AI Overviews?

No. Google representatives have stated plainly that the company does not support llms.txt and has no plans to, and that ranking in AI Overviews just needs normal SEO. One Google representative compared it to the long-ignored keywords meta tag, because signals fully controlled by the site owner are easy to manipulate and therefore untrustworthy. For Google's AI experiences, focus on quality content, structured data and entity clarity instead.

What is llms-full.txt?

llms-full.txt is a companion file to llms.txt. Where llms.txt is a slim navigational index — a summary plus links to key pages — llms-full.txt is the complete library, bundling full documentation into a single Markdown file for models that want everything in one place. Both use Markdown for its natural, parse-friendly hierarchy. Like llms.txt, it's mainly useful for developer documentation today, since broad AI crawler adoption remains limited.

Will adding llms.txt hurt my SEO?

No. Publishing an llms.txt file is harmless to your existing search rankings — it's a low-effort, low-risk addition. The real danger is opportunity cost: mistaking it for a visibility strategy and neglecting the work that actually drives AI citations. Treat it as a minor hygiene item, keep it accurate, and invest your meaningful effort in high-quality content, JSON-LD schema markup and a clear entity structure, which have demonstrable impact.
