AI Visibility | 6 March 2026 | 8 min read

AI Discovery Files: The Identity Layer Your Website Is Missing

The web has seen this pattern twice before: robots.txt in 1994, sitemaps in 2005. AI Discovery Files are at that same early stage. 99.7% of UK websites have none deployed. Here's what they are, why they matter, and how to get started.

Mark McNeece, Founder & Managing Director, 365i
I've seen this pattern before. Twice.

In 1994, Martijn Koster proposed robots.txt on a mailing list. Not a formal specification. Not backed by any standards body. Just a simple text file that told web crawlers which parts of a site to avoid. Webmasters started using it because it solved a real problem, and within a few years every serious website had one. The formal RFC didn't arrive until 2022. Twenty-eight years after Koster's proposal, and long after the file was already everywhere.

In 2005, Google introduced the XML sitemap protocol. Same story. A practical file that helped search engines discover pages. Yahoo and Microsoft joined within eighteen months. WordPress added automatic generation in version 5.5, and suddenly every WordPress site on the planet had one.

Both times: someone proposed a simple file, it solved a real problem, tools made it easy, and by the time the formal spec caught up, the format was already universal.

AI Discovery Files are at that same early stage right now. And most businesses haven't noticed.

The Problem: AI Systems Don't Read Websites Like Google Does

Search engines index pages and rank them in a list. AI systems try to understand what a business actually is and does.

Google looks at your pages, follows your links, and ranks your content against everyone else's. Twenty-five years of SEO practice has shaped how websites present themselves to search engines. Title tags, meta descriptions, heading structure, internal links. Your site was built for that process.

AI assistants work differently. When someone asks ChatGPT "who offers managed WordPress hosting in the UK?", it doesn't return ten blue links. It builds an understanding of relevant businesses as entities, then assembles a response. Who you are. What you do. Where you operate. Which of your pages actually matter.

Most websites give AI systems almost nothing to work with for that job. Your homepage has a hero banner, marketing copy, a pricing grid. Fine for humans. But an AI trying to understand your service boundaries, geographic focus, or brand identity? It's piecing together fragments from nav menus and footer text.

Danny Sullivan's advice from December 2025: write for humans, not ranking systems, "whether those systems are traditional search or LLM-powered experiences" (Search Engine Land).

He's right. But even brilliant human-first content doesn't solve the structural problem. AI needs a machine-readable summary of who you are. That's what AI Discovery Files provide.

What AI Discovery Files Actually Are

Plain-text files placed at the root of your website. Each one gives AI systems a different piece of the puzzle. Together, they form an identity layer: a structured, machine-readable picture of your organisation.

Four files do most of the work:

  • llms.txt tells AI which pages to read first and provides a Markdown overview of your site
  • identity.json gives structured business data: name, description, location, contacts
  • brand.txt tells AI how to spell and refer to your brand correctly
  • faq-ai.txt provides verified answers to common questions about your business

There are up to 10 files in total, covering everything from AI crawling permissions to developer documentation. But those four cover the ground that matters most.

No special server configuration needed. No plugins required (though one exists, and we'll get to that). You can create them in a text editor and upload them to your web root.
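
To make that concrete, here's a minimal Python sketch that builds two of the four files. The llms.txt layout (a title, a short blockquote summary, then a Markdown list of links) follows the published llms.txt proposal. The key names in identity.json are my assumption, taken from the fields listed above rather than from a formal schema, and the business details in the usage example are made up.

```python
import json
from pathlib import Path


def build_llms_txt(name, summary, pages):
    """Return llms.txt text: H1 title, blockquote summary, then a link list."""
    lines = [f"# {name}", "", f"> {summary}", "", "## Key pages", ""]
    for title, url, note in pages:
        lines.append(f"- [{title}]({url}): {note}")
    return "\n".join(lines) + "\n"


def build_identity(name, description, location, contacts):
    """Return identity.json text. Key names are assumed, not taken from a spec."""
    data = {
        "name": name,
        "description": description,
        "location": location,
        "contacts": contacts,
    }
    return json.dumps(data, indent=2, ensure_ascii=False) + "\n"


def write_discovery_files(web_root, files):
    """Write each filename -> text pair into the web root directory."""
    root = Path(web_root)
    root.mkdir(parents=True, exist_ok=True)
    for filename, text in files.items():
        (root / filename).write_text(text, encoding="utf-8")
    return sorted(files)


if __name__ == "__main__":
    # Hypothetical business, for illustration only.
    files = {
        "llms.txt": build_llms_txt(
            "Example Accountants Ltd",
            "Accountancy and payroll services for small businesses.",
            [("Services", "https://example.com/services", "What we offer")],
        ),
        "identity.json": build_identity(
            "Example Accountants Ltd",
            "Accountancy and payroll services for small businesses.",
            "Northamptonshire, UK",
            {"email": "hello@example.com"},
        ),
    }
    print(write_discovery_files("public_html", files))
```

Run it once, upload the contents of the output folder to your web root, and edit the text by hand from then on. brand.txt and faq-ai.txt are plain text too, so the same write step works for them.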

Why This Matters Right Now

Web standards that changed how the internet works: robots.txt (1994), sitemaps (2005), and now AI Discovery Files (2024 onward).

The specifications already exist. Stripe, Cloudflare, Anthropic, and Perplexity are publishing llms.txt on their own domains. SE Ranking's study of 300,000 domains found 10.13% already have llms.txt deployed. That's roughly where sitemap.xml was in its second year.

Not every major AI vendor has officially confirmed that its systems read these files yet. But robots.txt and sitemaps also spread before every crawler officially supported them.

Here's what we do know. One of our hosting customers, a Northamptonshire accountancy firm, added identity.json and llms.txt in January. Three weeks later, we tested their business name in ChatGPT. Before: a vague two-sentence description pulled from an old directory listing. After: an accurate summary of their actual services and locations.

We can't prove causation. AI models update for many reasons. But the timing was hard to ignore, and we've since seen similar shifts with four other customers who deployed the full file set.

Jeremy Howard, who proposed the llms.txt format through Answer.AI in September 2024, put it simply: "Site authors know best, and can provide a list of content that an LLM should use."

That's the principle behind every AI Discovery File. You know your business better than any crawling algorithm.

What AI Gets Wrong Without Them

Websites with a structured AI identity layer give AI systems verified facts. Without it, AI systems guess, and they often guess wrong.

Ask ChatGPT about a business that hasn't deployed these files. It'll pull from whatever it can find: a three-year-old directory listing, a customer review on a forum, a competitor's comparison page. It might get the location wrong. It might confuse the services. It might miss the specialisms entirely.

Now ask about a business that has them. identity.json says where they are and what they do. brand.txt says how to spell their name. faq-ai.txt answers the questions people actually ask. The AI doesn't have to guess.

99.7% of UK websites have zero AI discovery files deployed. That gap won't last forever. The AI Visibility Checker shows where your site stands right now.

Getting Started on WordPress

The WordPress plugin generates AI discovery files automatically from your existing site data, making adoption as simple as filling in a form.

WordPress powers 43% of the web. When WordPress makes something easy, it spreads. That's what happened with sitemaps. It's starting to happen with AI Discovery Files. And with WordPress 7.0's AI Experiments plugin letting AI help you create content, the relationship between your site and AI is becoming two-way.

A free plugin called AI Discovery Files is now in the WordPress.org repository. It auto-detects your site name and tagline, then generates up to 10 files from a settings form. Fill in your business details, hit save, done. Our plugin guide covers the full setup.

Over 300 downloads in the first few days. For a category that barely existed six months ago, that's real interest.

Not on WordPress? The complete guide to all 10 files walks through creating them manually. They're plain text. Any web server can host them.

What I'd Do Today

The $1 billion valuation of Profound, a company that tracks what AI says about brands, tells you where this market is heading. Enterprise firms are already spending serious money on AI visibility. Small businesses can get ahead for free.

  1. Check where you stand. Run your domain through the AI Visibility Checker. It scans for all 10 file types and tells you what's missing.
  2. Deploy the essentials. On WordPress, install the plugin. Otherwise, create llms.txt, identity.json, brand.txt, and faq-ai.txt manually.
  3. Validate. Use the validation process we documented to confirm AI systems can actually read your files.
  4. List your site. The AI Visible Directory gives sites with two or more files a full-page listing with real dofollow links.
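
If you'd rather do a quick check yourself first, the first three steps can be roughed out in a few lines of Python. This is a sketch, not the AI Visibility Checker or our documented validation process: it only asks whether the four essential files respond, whether they're non-empty, and whether identity.json parses as JSON. The function names and the `discovery-file-check/0.1` user agent string are made up for the example.

```python
import json
from urllib.request import Request, urlopen

ESSENTIAL_FILES = ["llms.txt", "identity.json", "brand.txt", "faq-ai.txt"]


def fetch_text(url, timeout=10):
    """Return the response body as text, or None if the request fails."""
    request = Request(url, headers={"User-Agent": "discovery-file-check/0.1"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        # OSError covers HTTP errors (such as 404), DNS failures and timeouts.
        return None


def is_valid_json(text):
    """Return True if the text parses as JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


def check_site(base_url, fetch=fetch_text):
    """Map each essential filename to 'ok', 'missing', 'empty' or 'invalid JSON'."""
    report = {}
    for filename in ESSENTIAL_FILES:
        body = fetch(f"{base_url.rstrip('/')}/{filename}")
        if body is None:
            report[filename] = "missing"
        elif not body.strip():
            report[filename] = "empty"
        elif filename.endswith(".json") and not is_valid_json(body):
            report[filename] = "invalid JSON"
        else:
            report[filename] = "ok"
    return report


if __name__ == "__main__":
    for name, status in check_site("https://example.com").items():
        print(f"{name}: {status}")
```

One caveat: some servers return a styled "not found" page with a 200 status, which this sketch would report as "ok" for the text files. Open anything it reports as present in a browser before trusting the result.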

The cost of being wrong? A few text files sitting quietly on your server. The cost of being right, and being early? AI assistants recommending your business instead of your competitors.

Where This Is Heading

AI Discovery Files won't replace SEO. They won't guarantee ChatGPT mentions your business. They're not a ranking factor.

What they do is give your website a machine-readable identity. The web has never had that in a format built for AI. And the pattern from robots.txt and sitemaps is clear: simple files that solve real problems become universal standards. Not because a committee mandates them. Because the tools exist and enough people start using them.

If you want help, our AI Discovery Files service handles the full setup. Or just start with the free WordPress plugin and see where it takes you.

Frequently Asked Questions

What are AI Discovery Files?

AI Discovery Files are a set of machine-readable files placed at the root of your website. They provide structured information about your business, brand, services, and content to AI systems like ChatGPT, Claude, and Gemini. There are up to 10 file types, including llms.txt, identity.json, brand.txt, and faq-ai.txt. Each serves a specific purpose, from declaring your business identity to providing verified FAQ answers.

Do AI systems actually read AI Discovery Files?

Major AI companies haven't officially confirmed full support for all file types yet. Anthropic, Stripe, Cloudflare, and Perplexity already publish llms.txt on their own sites, which suggests they see value in the format. SE Ranking found 10.13% of 300,000 domains already have llms.txt deployed. The pattern mirrors robots.txt and sitemaps, which were widely adopted before formal standards existed.

Are AI Discovery Files the same as robots.txt?

No. robots.txt tells crawlers which pages to avoid. AI Discovery Files tell AI systems which pages matter most and provide structured identity data about your organisation. They're complementary: robots.txt handles access control, while AI Discovery Files handle identity and content priority. One of the AI Discovery File types, robots-ai.txt, specifically addresses AI crawler permissions.
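
The access-control half is easy to see with Python's standard-library robots.txt parser. The rules below are a hypothetical example: they block OpenAI's GPTBot crawler from one folder and leave everything else open. Notice what robots.txt can express here. It says which URLs a crawler may fetch. It says nothing about who you are or which pages matter.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, for illustration only.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /private/

User-agent: *
Disallow:
"""


def allowed(robots_txt, user_agent, url):
    """Return True if the given rules let this user agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("GPTBot", "https://example.com/private/report.html"),
        ("GPTBot", "https://example.com/llms.txt"),
        ("OtherBot", "https://example.com/private/report.html"),
    ]:
        print(agent, url, allowed(ROBOTS_TXT, agent, url))
```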

Why do AI Discovery Files matter for businesses?

When AI assistants answer questions about your industry, they'll describe businesses based on whatever information they can find. Without AI Discovery Files, that means guessing from fragmented web data. With them, you're giving AI a structured, verified source of truth about your business: your name, services, location, brand guidelines, and verified FAQ answers.

Can WordPress generate AI Discovery Files automatically?

Yes. The free AI Discovery Files plugin on WordPress.org auto-detects your site name and tagline, then generates up to 10 files from a settings form. It requires WordPress 6.2+ and PHP 8.0+. The more business details you enter, the richer the generated files.

Is publishing AI Discovery Files worth it now?

The downside risk is near zero: a few small text files on your server. The upside is establishing your AI identity before competitors do. robots.txt and sitemaps both followed this pattern: early adopters had an advantage, and by the time everyone caught up the window had closed. With 99.7% of UK websites having zero AI discovery files deployed, the early-mover opportunity is still wide open.

How do AI Discovery Files help with AI visibility?

AI visibility means your business appears accurately when AI assistants answer questions about your industry. AI Discovery Files contribute by providing structured identity data (identity.json), content priorities (llms.txt), brand guidelines (brand.txt), and verified answers (faq-ai.txt). This gives AI systems a reliable source rather than forcing them to piece together information from third-party sites and outdated listings.
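
As a rough illustration of "content priorities", here's how any program reading an llms.txt file could pull out the pages you've flagged. It's a sketch that assumes the link-list layout from the llms.txt proposal, with each entry written as `- [title](url): note`. It isn't a description of how any particular AI system processes the file.

```python
import re

# Matches "- [Title](https://example.com/page): optional note"
LINK_LINE = re.compile(
    r"^\s*[-*]\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<note>.*))?$"
)


def priority_pages(llms_txt):
    """Return (section, title, url, note) tuples in file order."""
    section = None
    pages = []
    for line in llms_txt.splitlines():
        if line.startswith("## "):
            section = line[3:].strip()
            continue
        match = LINK_LINE.match(line)
        if match:
            note = (match["note"] or "").strip()
            pages.append((section, match["title"], match["url"], note))
    return pages


if __name__ == "__main__":
    # Hypothetical llms.txt content, for illustration only.
    sample = (
        "# Example Ltd\n\n> Managed hosting.\n\n## Key pages\n\n"
        "- [Hosting](https://example.com/hosting): Plans and pricing\n"
        "- [About](https://example.com/about)\n"
    )
    for page in priority_pages(sample):
        print(page)
```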

Check Your AI Visibility Score

Run your domain through the AI Visibility Checker. It's free, scans for all 10 file types, and tells you exactly what's missing.
