Artiql

How to Add llms.txt So AI Crawlers Quote Your Site

Quick answer: An llms.txt file is a Markdown document placed at your domain root that gives AI assistants a curated map of your most important pages. To add one, create yourdomain.com/llms.txt, open with a title and one-line summary, then list high-value URLs grouped by topic with short descriptions. It helps supporting crawlers parse and cite you faster, though adoption is still uneven across AI engines.

What is an llms.txt file, and why should you care?

Think of llms.txt as a hand-curated welcome mat for AI assistants. It is a plain Markdown file you place at the root of your site, and its job is simple: tell language models which pages matter most. Where a sitemap.xml says "here is every URL we have," llms.txt says "here are the pages we want AI engines to treat as authoritative, grouped by topic and each with a short description." That distinction is the whole point.

The format was proposed in late 2024 to solve a real bottleneck. AI assistants need a fast way to understand a site without crawling every corner of it, and llms.txt hands them a concise, owner-curated entry point. Instead of guessing which blog post or doc page best answers a question, a supporting crawler can read your short list and go straight to the good stuff.

For anyone building organic visibility, that matters because AI answer engines increasingly sit between your content and your audience. If ChatGPT, Claude, or Perplexity can quickly grasp what your site is about and which pages to quote, you raise your odds of being represented accurately rather than paraphrased from a thin or outdated page.

What does an llms.txt file actually look like?

The file is written in standard Markdown and lives at one fixed location: yourdomain.com/llms.txt, in the same folder as your robots.txt and sitemap.xml. That predictable path is what lets supporting crawlers discover it automatically, no registration required.

The structure follows a loose but consistent order. You start with an H1 project or brand name, follow it with a one-line blockquote summary that captures what you do, then add optional detail and one or more lists of links. Each link is a page URL paired with a short, plain-language description of what the reader will find there. The descriptions do real work, because they tell the model why a page is worth citing.

For larger sites, you can group links under H2 sections like "Documentation," "Guides," or "Products," and keep the whole file tight. There is also an extended variant, llms-full.txt, which bundles the full Markdown body of your linked pages into a single fetch for assistants that want to ingest everything at once.

| Element | Markdown | Purpose |
| --- | --- | --- |
| Title | `# Brand or project name` | Identifies whose site this is |
| Summary | `> One-line description` | Gives instant context to the model |
| Section heading | `## Guides` | Groups related links by topic |
| Link entry | `- [Page title](url): short note` | Points to a high-value page and why it matters |

A minimal llms.txt structure, element by element.
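
To make the structure concrete, here is a minimal Python sketch that renders those four elements into one file body. The brand name, URLs, and the `Link` and `render_llms_txt` names are all hypothetical, chosen only for illustration; treat it as one possible way to assemble the file rather than a reference implementation.

```python
from dataclasses import dataclass


@dataclass
class Link:
    title: str
    url: str
    note: str


def render_llms_txt(name: str, summary: str, sections: dict[str, list[Link]]) -> str:
    """Render the title, summary, and grouped link lists as Markdown."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for link in links:
            lines.append(f"- [{link.title}]({link.url}): {link.note}")
        lines.append("")
    # Collapse trailing blank lines into a single final newline.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical brand and URLs, for illustration only.
example = render_llms_txt(
    "Example Co",
    "Scheduling software for small clinics.",
    {
        "Guides": [
            Link("Getting started", "https://example.com/guides/start", "Setup in ten minutes"),
        ],
        "Products": [
            Link("Pricing", "https://example.com/pricing", "Plans and what each includes"),
        ],
    },
)
print(example)
```

Keeping the links in a small data structure, rather than hand-editing the text, makes it easier to regenerate the file later without breaking the format.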

How do you create an llms.txt file step by step?

Start by creating a plain text file named llms.txt. Open it with your brand name as an H1, then write a single blockquote sentence that sums up what you offer and for whom. Keep that summary specific and concrete, since it is often the first thing a model reads about you.

Next, inventory your best content. Pull together your strongest product pages, cornerstone guides, documentation, and resource hubs, then write a one-line description for each. Group them under clear H2 headings so the file scans well. Resist the urge to dump everything in: a focused list of pages you would be proud to be quoted from beats an exhaustive one.

Finally, upload the file to your domain root so it resolves at yourdomain.com/llms.txt, right alongside robots.txt. Test the URL in a browser to confirm it loads as plain text. If your site changes often, treat llms.txt as a living document and revisit it whenever you publish or retire major pages.
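
Before you upload, a quick structural check can catch the most common slips. The sketch below assumes the simple layout described in this article (H1 first, blockquote summary second, link entries as list items); `check_llms_txt` and its messages are hypothetical, and the check is deliberately loose rather than a full validator for the format.

```python
import re

# A link entry: "- [title](url)" with an optional ": note" after it.
LINK_RE = re.compile(r"^- \[[^\]]+\]\((https?://[^)\s]+)\)(: .+)?$")


def check_llms_txt(text: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    lines = [ln.rstrip() for ln in text.splitlines()]
    content = [ln for ln in lines if ln]  # ignore blank lines

    if not content or not content[0].startswith("# "):
        problems.append("first non-blank line should be an H1 title ('# Name')")
    if len(content) < 2 or not content[1].startswith("> "):
        problems.append("second non-blank line should be a blockquote summary ('> ...')")

    links = 0
    for number, ln in enumerate(lines, start=1):
        if ln.startswith("- "):
            if LINK_RE.match(ln):
                links += 1
            else:
                problems.append(f"line {number}: list item is not '- [title](url): note'")
    if links == 0:
        problems.append("no link entries found")
    return problems


# Hypothetical file contents, for illustration only.
good = "# Example Co\n\n> Scheduling software.\n\n## Guides\n\n- [Start](https://example.com/start): Setup guide\n"
bad = "Example\n- broken"
print(check_llms_txt(good))
print(check_llms_txt(bad))
```

Running a check like this in your publishing pipeline is cheaper than discovering a malformed file after a crawler has already fetched it.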

What should you include in llms.txt, and what should you leave out?

The guiding rule is curation, not coverage. Include indexable, high-value pages: key product or service pages, cornerstone blog posts, documentation, pricing or overview pages, and anything you would genuinely want an AI assistant to cite as your authoritative source. Each entry should earn its place with a crisp description.

Just as important is what you keep out. Thin pages, duplicates, drafts, tag archives, and low-signal utility pages only dilute the file and muddy the model's understanding of what you stand for. A tighter file sends a cleaner signal. For most sites, grouping links by topic and keeping the file comfortably under a couple hundred lines keeps it both readable and parseable.

If you have a deep knowledge base, that is exactly where llms-full.txt earns its keep. Use the short llms.txt as the curated front door, and reserve the long-form file for assistants that want to ingest your entire documentation set in one pass.
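
As a rough illustration of the long-form variant, the sketch below concatenates the Markdown bodies of a few pages into a single document. The exact layout of llms-full.txt is an assumption here (a title, a summary, then one section per page with a source line), and `build_llms_full` and the sample pages are hypothetical.

```python
def build_llms_full(name: str, summary: str, pages: list[tuple[str, str, str]]) -> str:
    """Concatenate page bodies under one title; pages are (title, url, markdown_body)."""
    parts = [f"# {name}", "", f"> {summary}", ""]
    for title, url, body in pages:
        # Assumed layout: one H2 per page, followed by its source URL and body.
        parts += [f"## {title}", "", f"Source: {url}", "", body.strip(), ""]
    return "\n".join(parts).rstrip() + "\n"


# Hypothetical pages, for illustration only.
full = build_llms_full(
    "Example Co",
    "Scheduling software for small clinics.",
    [
        ("Getting started", "https://example.com/guides/start", "Install the app, then invite your team.\n"),
        ("Pricing", "https://example.com/pricing", "Two plans: Starter and Clinic.\n"),
    ],
)
print(full)
```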

Include

+ Cornerstone guides and pillar articles
+ Core product, service, or pricing pages
+ Documentation and API references
+ Concise, accurate one-line descriptions

Leave out

− Thin, duplicate, or near-empty pages
− Drafts, staging, and archived content
− Tag and pagination URLs
− Vague descriptions that could fit any page

What belongs in llms.txt — and what to skip.
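
If you are pulling candidates from a larger content inventory, the two lists above can be approximated with a simple filter. This is a heuristic sketch only: the path fragments, the 300-word threshold for "thin" pages, and the `is_candidate` function are assumptions to tune for your own site, and a human should still review what passes.

```python
from urllib.parse import parse_qs, urlparse

# Assumed path fragments that usually mark low-signal pages; adjust per site.
SKIP_PATH_PARTS = ("/tag/", "/tags/", "/page/", "/draft", "/staging", "/archive")
MIN_WORDS = 300  # assumed threshold for "thin" content


def is_candidate(url: str, word_count: int, description: str) -> bool:
    """Return True if a page passes the basic curation heuristics."""
    parsed = urlparse(url)
    path = parsed.path.lower()
    if any(part in path for part in SKIP_PATH_PARTS):
        return False
    if "page" in parse_qs(parsed.query):  # pagination such as ?page=2
        return False
    if word_count < MIN_WORDS:
        return False
    # An entry without a description cannot explain why it is worth citing.
    return bool(description.strip())


# Hypothetical pages, for illustration only.
print(is_candidate("https://example.com/guides/llms-txt", 1200, "How to add llms.txt"))
print(is_candidate("https://example.com/tag/seo/", 900, "Posts tagged SEO"))
```

A filter like this cannot judge whether a description is vague, so it narrows the list rather than replacing editorial judgment.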

Do ChatGPT, Claude, and Perplexity really use llms.txt?

Here honesty serves you better than hype. Adoption is genuinely uneven, and the evidence is mixed. Perplexity has been the clearest supporter, reportedly retrieving llms.txt to help prioritize which pages to surface. Claude is said to respect it in some retrieval workflows, while OpenAI's use is unconfirmed and only inferred from observed behavior. Google has explicitly said it has no plans to use llms.txt for Search, Gemini, or AI Overviews.

Measurement studies add a sobering note. One 90-day server-log analysis found that llms.txt files received roughly 0.1% of AI bot requests, and a large multi-domain study concluded that the file does not measurably improve your odds of being cited today. The format remains a community convention rather than a ratified standard, so no engine is obligated to honor it.

So why bother? Because the cost is near zero and the upside is real optionality. Notably, the major AI labs publish their own llms.txt files for their docs, and the exercise forces you to write a clean inventory of what you most want quoted — value that survives regardless of which crawler adopts the standard next.

600+ sites adopted by mid-2025, including Perplexity, Anthropic, Stripe, and Cloudflare

0.1% share of AI bot hits to llms.txt, from a 90-day server-log measurement study

Google products using it: Google has stated it has no plans to consume llms.txt

Three numbers that frame the honest state of llms.txt adoption.

How do you know if your llms.txt is actually working?

Skip the guesswork and go to your server logs. Filter for requests to /llms.txt and watch for named AI agents such as PerplexityBot, ClaudeBot, and GPTBot. If those crawlers are fetching the file, you have hard evidence it is being discovered, even before any citation shows up.
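
One way to run that filter is a short script over your access log. The sketch below assumes the common "combined" log format, where the user agent is the last quoted field; the sample lines, IP addresses, and simplified user-agent strings are hypothetical, and `count_llms_txt_hits` is an illustrative name.

```python
import re
from collections import Counter

AI_AGENTS = ("PerplexityBot", "ClaudeBot", "GPTBot")
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[^"]*"')
QUOTED_RE = re.compile(r'"([^"]*)"')


def count_llms_txt_hits(log_lines):
    """Count requests to /llms.txt per named AI agent."""
    hits = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue
        path = match.group(1).split("?", 1)[0]  # drop any query string
        if path != "/llms.txt":
            continue
        # Assumption: in the combined format the user agent is the last quoted field.
        user_agent = QUOTED_RE.findall(line)[-1].lower()
        for agent in AI_AGENTS:
            if agent.lower() in user_agent:
                hits[agent] += 1
                break
    return hits


# Hypothetical log lines, for illustration only.
sample = [
    '203.0.113.5 - - [10/May/2025:12:00:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; PerplexityBot/1.0)"',
    '203.0.113.6 - - [10/May/2025:12:01:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.7 - - [10/May/2025:12:02:00 +0000] "GET /pricing HTTP/1.1" 200 2048 "-" "ClaudeBot/1.0"',
    '203.0.113.8 - - [10/May/2025:12:03:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows NT 10.0)"',
]
hits = count_llms_txt_hits(sample)
print(dict(hits))
```

User-agent strings can be spoofed, so counts like these are evidence of discovery rather than proof of who fetched the file.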

Next, watch your AI citations directly. Ask the major assistants questions your content should answer, and note which pages they reference and how accurately they describe you. Track this over time, because the meaningful signal is a shift in which pages get quoted and whether your positioning is represented faithfully — not a single lucky mention.

Set expectations accordingly. Because real-world impact is still modest and varies by engine, treat llms.txt as one low-risk input within a broader AI-search strategy, not a magic switch. Pair it with genuinely useful, well-structured content, since that is what every answer engine actually rewards.

How do you keep llms.txt in sync as your content grows?

A stale llms.txt quietly undoes its own purpose. If it points to retired pages or misses your newest cornerstone guide, you are handing AI assistants an outdated map. The fix is to fold maintenance into your publishing routine: whenever you ship or sunset a major page, update the file in the same breath, just as you would a sitemap.

This is where automation pays off, especially for fast-moving SaaS sites, ecommerce catalogs, and agencies juggling many clients. Generating llms.txt from your live content inventory — and regenerating it on a schedule — removes the manual drift that creeps in when a human has to remember. The same discipline that keeps your sitemap honest keeps your AI front door honest.
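
A scheduled drift check is one simple form of that automation. The sketch below compares the URLs listed in llms.txt against two sets you would supply from your own CMS: pages that are currently live and pages you have flagged as cornerstone content. The `find_drift` function and all sample URLs are hypothetical.

```python
import re

# Captures the URL from entries shaped like "- [title](url)".
LINK_URL_RE = re.compile(r"^- \[[^\]]+\]\(([^)\s]+)\)", re.MULTILINE)


def find_drift(llms_txt: str, live_urls: set[str], must_list: set[str]):
    """Return (stale, missing): retired pages still listed, and key pages not listed."""
    listed = set(LINK_URL_RE.findall(llms_txt))
    stale = sorted(listed - live_urls)    # listed but no longer published
    missing = sorted(must_list - listed)  # flagged as cornerstone but not listed
    return stale, missing


# Hypothetical file and inventory, for illustration only.
current = (
    "# Example Co\n\n> Scheduling software.\n\n## Guides\n\n"
    "- [Start](https://example.com/start): Setup guide\n"
    "- [Old launch post](https://example.com/launch-2022): Retired announcement\n"
)
live = {"https://example.com/start", "https://example.com/pricing"}
cornerstone = {"https://example.com/start", "https://example.com/pricing"}

stale, missing = find_drift(current, live, cornerstone)
print("stale:", stale)
print("missing:", missing)
```

Failing a build or opening a ticket when either list is non-empty keeps the update tied to the publishing step instead of to someone's memory.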

If your team would rather not babysit any of this, that is the gap Artiql is built to close. It runs your organic content as an autopilot — creating multilingual SEO and GEO articles, keeping machine-readable signals current, and routing everything through a review queue and your own headless CMS. Want to see it on your site? Book a demo and we will walk you through it.

Frequently asked questions

Is llms.txt the same as robots.txt?

No. They serve opposite goals. Robots.txt tells crawlers which paths they may or may not access — it is about permission and exclusion. The llms.txt file is invitational: it curates and highlights the pages you most want AI assistants to read and cite. They live in the same root folder and follow a similar discovery convention, but one restricts access while the other guides attention toward your best content.

Will adding llms.txt help me rank on Google?

Not directly. Google has publicly stated it has no plans to use llms.txt as an input for Search, Gemini, or AI Overviews, so it will not change your traditional rankings. Its potential value sits with AI answer engines that support the convention, such as Perplexity. Keep treating strong content, technical health, and a clean sitemap as your core ranking levers, and view llms.txt as a complementary AI-search experiment.

How long should an llms.txt file be?

Keep it tight. For most sites, a focused list grouped by topic and kept under a couple of hundred lines is ideal, because curation is the entire value proposition. A short file of genuinely high-value pages sends a far cleaner signal than an exhaustive dump. If you need to expose a large knowledge base in full, use the separate llms-full.txt variant for assistants that want the long-form content in one fetch.

Do I need both llms.txt and llms-full.txt?

Usually not at first. Start with llms.txt as your curated front door — a concise list of important pages with short descriptions. Add llms-full.txt only if you have substantial documentation or a knowledge base you want AI assistants to ingest in a single pass, since it bundles the full Markdown body of your pages. Many sites are well served by the short file alone, especially while adoption of the format is still maturing.

Can I generate llms.txt automatically?

Yes. You can write it by hand, but several generators and platforms can build the file from your live content and refresh it on a schedule. Automation is especially helpful for sites that publish often, because it prevents the file from drifting out of sync with your real pages. The key is to keep the output curated — automated or not, the file should still highlight only your high-value, citation-worthy content.
