llms.txt: what it is and how to add one
An llms.txt guide for busy teams. What the llms.txt file is, whether it helps AI crawlers, and how to write and host one for your site in a few minutes.

An llms.txt file is a plain Markdown file you host at /llms.txt that hands AI crawlers a clean, curated map of your site's most important pages. It is not styled for people. It is a shortlist a language model can read in one pass: your site name, a one-paragraph summary, and a few sections of links with short descriptions. The idea, proposed by Jeremy Howard in 2024, is simple. Models work better with context than with raw HTML, so give them the context directly.
This guide covers what goes in an llms.txt file, whether it actually moves the needle, how it differs from robots.txt and a sitemap, and how to write and host one in a few minutes.
llms.txt is a Markdown file at the root of your domain that points AI models at the pages that matter, with a sentence of context on each. A browser will render it as text. A model reading it gets a quick, structured sense of what your site is and where the good parts live, without crawling and parsing every page first.
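The format is loose but conventional. A well-formed file has four parts:
- An H1 with your site or project name.
- A blockquote with a short summary of what the site is and who it serves.
- Optional plain paragraphs with any extra context a model should have.
- One or more H2 sections, each a list of links with a short description after a colon.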
Because it is just Markdown, it reads the same way structured content reads to a model: predictable headings, plain prose, no markup noise. That predictability is the point. The same instinct drives good answer engine optimization: answer the question plainly, structure it cleanly, and machine readers pick it up. An llms.txt file applies that thinking at the level of your whole site instead of a single page.
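That predictability also makes the file easy to check mechanically. The sketch below is one way to do it, assuming the conventional shape: an H1 title, a blockquote summary before the first section, and H2 sections whose bullets are Markdown links. The function name `check_llms_txt` and the regular expression are illustrative choices, not part of any official tooling.

```python
import re

# Assumed bullet shape: "- [Title](url)" with an optional ": description".
LINK_ITEM = re.compile(r"^- \[[^\]]+\]\([^)\s]+\)(: .+)?$")


def check_llms_txt(text: str) -> list[str]:
    """Return a list of problems; an empty list means the basic shape is fine."""
    content = [ln.rstrip() for ln in text.splitlines() if ln.strip()]
    problems = []

    # The first non-blank line should be an H1 with the site name.
    if not content or not content[0].startswith("# "):
        problems.append("missing H1 title on the first line")

    # A blockquote summary should appear before the first H2 section.
    first_h2 = next(
        (i for i, ln in enumerate(content) if ln.startswith("## ")), len(content)
    )
    if not any(ln.startswith("> ") for ln in content[:first_h2]):
        problems.append("missing blockquote summary before the first section")

    # Free-form paragraphs are optional, so there is nothing to check for them.

    # There should be at least one H2 section, and its bullets should be links.
    if first_h2 == len(content):
        problems.append("no H2 sections of links")
    for ln in content[first_h2:]:
        if ln.startswith("- ") and not LINK_ITEM.match(ln):
            problems.append(f"bullet is not a described link: {ln}")

    return problems
```

An empty result only says the file has the expected skeleton. It says nothing about whether the links resolve or the descriptions are any good.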
It pairs with the technical hygiene that already helps you rank. If you have worked through SEO for SaaS, an llms.txt file is a small, fast addition to that same checklist, not a replacement for it.
Honest answer: it is an emerging convention, not a ranking guarantee. No major AI provider has publicly confirmed that an llms.txt file changes how they rank, retrieve, or cite your pages. Adoption among the big crawlers is still uneven, and some teams treat it with healthy skepticism for exactly that reason.
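So why add one? A few reasons hold up even without a confirmed ranking benefit: it makes your key pages easy to parse, it signals that you take machine readers seriously, and it costs a few minutes to add and almost nothing to maintain.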
Treat it the way you would treat clean structured data or a tidy sitemap: good practice with little downside, not a magic lever. The win is not the file alone. It is the habit of writing for both people and the models that increasingly sit between people and your site. The same logic is why we fact-check every claim and link in a post before it ships: if a model is going to cite you, the page it cites had better be accurate and easy to read.
If you want to see a live one, Lyra serves her own at trylyra.ai/llms.txt. It is a small, real example of the format described here.
llms.txt, robots.txt, and sitemap.xml all live at the root of your domain and all talk to crawlers, but they do different jobs. Mixing them up is the most common confusion, so here is the split.
| File | What it does | Who it is for |
|---|---|---|
| robots.txt | Sets rules: which paths crawlers may or may not access. | Crawlers deciding what to fetch. |
| sitemap.xml | Lists every URL so crawlers can discover them all. | Search engines indexing the full site. |
| llms.txt | Curates your best pages with plain-language descriptions. | Language models that want context fast. |
A sitemap is exhaustive. It wants to list everything. An llms.txt file is the opposite: it is opinionated and short, a highlight reel of the pages you would actually want an AI to read and quote. robots.txt is about permission, not content. None of the three replaces the others. You can and usually should have all three.
One nuance worth keeping straight. Some teams also publish full-text versions of pages at paths like page.md or an llms-full.txt that inlines whole documents. That is an extension, not a requirement. The core llms.txt file is just the map.
Start with the pages you would hand a new hire on day one: your homepage, your core product or pricing pages, your docs, and a handful of your strongest blog posts. Skip thin pages, duplicates, and anything expired. Favor quality over coverage: a model gets more from 15 well-described links than from 150 bare URLs.
Then write the file. Here is a minimal example you can adapt:
# Acme Analytics

> Acme Analytics is a privacy-first product analytics tool for small SaaS
> teams. We help founders see which features drive retention without
> shipping user data to third parties.

Self-serve, no credit card to start. Docs are public and versioned.

## Product
- [Overview](https://acme.com/): What Acme does and who it is for.
- [Pricing](https://acme.com/pricing/): Plans, limits, and the free tier.
- [Integrations](https://acme.com/integrations/): Supported sources and SDKs.

## Docs
- [Quickstart](https://acme.com/docs/quickstart/): Install and send your first event.
- [API reference](https://acme.com/docs/api/): Endpoints, auth, and rate limits.

## Blog
- [Retention metrics that matter](https://acme.com/blog/retention/): The four numbers we track.
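- [Self-hosting Acme](https://acme.com/blog/self-host/): Run it on your own infra.

A few rules that keep it useful: keep it to the 10 to 30 pages you would want an AI to read first, give every link a short description after a colon, and update the file when you add or retire major pages.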
The file has to resolve at your domain root, at https://yourdomain.com/llms.txt, and return plain text or Markdown. How you get it there depends on your stack.
- Static site: put llms.txt in your public/ or root output directory. Most static hosts serve it as-is.
- Framework or app server: add a route or a static file that serves it with a text/plain or text/markdown content type.

Once it is live, open the URL in a browser and confirm you see your raw Markdown, not a 404 and not an HTML page wrapping it. That is the whole job. There is no registry to submit to and no verification step.
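The browser check can also be scripted, which is handy after a deploy. This is a minimal sketch using only the Python standard library. The helper names `looks_like_llms_txt` and `check_url` are made up for illustration, and the HTML test (a leading doctype or `<html>` tag) is a rough heuristic rather than a full parser.

```python
from urllib.error import HTTPError
from urllib.request import urlopen


def looks_like_llms_txt(status: int, content_type: str, body: str) -> bool:
    """True when a response looks like raw Markdown rather than an error or HTML page."""
    if status != 200:
        return False  # for example a 404
    media_type = content_type.split(";")[0].strip().lower()
    if media_type not in {"text/plain", "text/markdown"}:
        return False  # for example text/html from a catch-all route
    # Heuristic: an HTML wrapper usually starts with a doctype or <html> tag.
    return not body.lstrip().lower().startswith(("<!doctype", "<html"))


def check_url(url: str) -> bool:
    """Fetch the URL and apply the checks above. Needs network access."""
    try:
        with urlopen(url, timeout=10) as response:
            body = response.read().decode("utf-8", errors="replace")
            content_type = response.headers.get("Content-Type", "")
            return looks_like_llms_txt(response.status, content_type, body)
    except HTTPError:
        return False  # urlopen raises on 4xx and 5xx responses
```

Calling `check_url("https://yourdomain.com/llms.txt")` with your own domain should return `True` once the file is served correctly.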
An llms.txt file is one small piece. It makes your site easy to map. It does not write the pages that make the map worth reading. Those still come from clear, accurate, well-structured content that answers real questions, which is what gets cited whether or not a crawler reads your llms.txt first. The file is a pointer; the pages are the payload.
This is where the work scales or stalls. Maintaining the file is trivial. Producing a steady stream of pages worth pointing to is the hard part, and it is exactly the gap Lyra fills. Lyra is an autonomous writer who finds topics worth covering, drafts them in your blog's existing voice, fact-checks the claims and links, scores the draft, and opens a GitHub pull request for you to review. Nothing publishes on its own. You bring your own Anthropic key, and Lyra is in early access while we build in the open, so talk to the founder to see whether she's a fit. If you want the deeper version of how an autonomous AI blog writer works end to end, the pillar page walks through it.
Add the llms.txt file this afternoon. It is five minutes of work with no real downside. Then spend your energy on the content it points to, because that is what actually earns the citation.
Lyra ships the kind of clean, structured posts that AI answer engines like to cite, and she serves her own llms.txt so the crawlers find them fast.
Step by step
List your most important pages
Pick the 10 to 30 pages you would want an AI to read first: your homepage, core product pages, pricing, docs, and your best blog posts. Skip thin, duplicate, or expired URLs.
Write a one-paragraph site summary
In a blockquote under the H1, say what your site is and who it serves in two or three plain sentences. This is the context a model reads before it looks at any link.
Group links into labelled sections
Use H2 headings like Docs, Product, and Blog. Under each, list links as Markdown bullets with a short description after a colon so a model knows what each page covers.
Save it as /llms.txt at your domain root
Serve the file as plain text or Markdown at https://yourdomain.com/llms.txt. Confirm it loads in a browser, then keep it updated when you add or retire major pages.
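If your page list already lives in code or a CMS export, the steps above can be sketched as a small generator. This is an illustration under assumptions: the `Page` class, the `build_llms_txt` function, and the way sections are passed in are hypothetical names, not an established library.

```python
from dataclasses import dataclass


@dataclass
class Page:
    title: str
    url: str
    description: str


def build_llms_txt(site_name: str, summary: str, sections: dict[str, list[Page]]) -> str:
    """Render an llms.txt document: H1, blockquote summary, then H2 link sections."""
    lines = [f"# {site_name}", ""]
    # Prefix every summary line so the whole paragraph stays inside the blockquote.
    lines += [f"> {part}" for part in summary.splitlines()]
    for heading, pages in sections.items():
        lines += ["", f"## {heading}"]
        lines += [f"- [{p.title}]({p.url}): {p.description}" for p in pages]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example data borrowed from the Acme sample earlier in this guide.
    docs = [
        Page("Quickstart", "https://acme.com/docs/quickstart/",
             "Install and send your first event."),
    ]
    print(build_llms_txt("Acme Analytics", "Privacy-first product analytics.", {"Docs": docs}))
```

Write the returned string to llms.txt in your build step, and the file stays in sync with whatever list you maintain.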
FAQ
What is llms.txt?
llms.txt is a plain Markdown file served at the root of your domain, at /llms.txt. It gives AI crawlers and language models a short, curated map of your most important pages: an H1 with your site name, a blockquote summary, and labelled sections of links with one-line descriptions. It is meant to be read by machines, not styled for humans.
Does llms.txt actually help?
Honest answer: it is an emerging convention, not a ranking guarantee. No major AI provider has confirmed it changes how they rank or cite you, and adoption is still uneven. What it does is make your key pages easy to parse and signal that you take machine readers seriously. It costs a few minutes to add and almost nothing to maintain, so the downside is tiny.
How is llms.txt different from robots.txt and sitemap.xml?
robots.txt tells crawlers what they may and may not access. sitemap.xml lists every URL so crawlers can find them all. llms.txt does neither: it is an opinionated shortlist of your best pages with plain-language descriptions, written for language models that want context fast. It is not a full index or a permission file.
Where do I put the llms.txt file?
Save it as llms.txt and serve it at your domain root, so it resolves at https://yourdomain.com/llms.txt. It must be reachable at that exact path and return plain text or Markdown. If your site is static, drop it in the public or root directory; if it runs on a framework, add a route or a static file that serves it.
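For the framework case, the route can be very small. The sketch below uses Python's built-in http.server purely to show the shape: answer /llms.txt with the file's bytes and a Markdown content type. The names `LLMS_TXT_PATH`, `llms_txt_response`, and `run` are hypothetical, and a real framework will have its own way to register a route.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path

LLMS_TXT_PATH = Path("llms.txt")  # assumed location of the file on disk


def llms_txt_response(request_path: str, file_path: Path = LLMS_TXT_PATH) -> tuple[int, str, bytes]:
    """Return (status, content type, body) for a request path."""
    if request_path == "/llms.txt" and file_path.is_file():
        return 200, "text/markdown; charset=utf-8", file_path.read_bytes()
    return 404, "text/plain; charset=utf-8", b"Not found\n"


class Handler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        status, content_type, body = llms_txt_response(self.path)
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(port: int = 8000) -> None:
    """Serve on localhost until interrupted. For local testing only."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Keeping the decision in a plain function makes it testable without starting a server. Call `run()` locally and open http://127.0.0.1:8000/llms.txt to see the result.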