Schema Markup for AI Search: What Actually Helps


Key Takeaways

Google's documentation states there is no special schema.org markup you need to add to appear in AI Overviews or AI Mode. Eligibility is being indexed and snippet-eligible, which is plain SEO.
The one causal study on this question (Ahrefs, 1,885 pages that added JSON-LD vs 4,000 controls) found no citation uplift on AI Mode or ChatGPT, and a small decline in AI Overviews.
The 2.5x and 3x correlation figures vendors quote are a selection effect: schema-rich sites are also better maintained, more authoritative, and earn more links, and those signals carry the citation.
Schema's real jobs are rich-result eligibility, entity disambiguation, and matching your visible content. Ship the handful of types your content actually is, then stop.
Citation comes from the visible content and off-site brand mentions, not the markup payload. That is where the freed-up time should go.

The short answer: schema does not get you cited

If you are adding schema markup because you read that it will earn you citations in ChatGPT, Perplexity, or Google's AI Overviews, the evidence does not support the bet. Google's own documentation is direct about it: to appear in AI Overviews or AI Mode, "you don't need to create new machine readable files, AI text files, or markup," and "there's also no special schema.org structured data that you need to add," per Google Search Central's documentation on AI in Search. Eligibility is being indexed and snippet-eligible. That is plain SEO.

That does not make structured data useless. It makes it something other than what most of the "schema for AI" advice claims. Schema is an entity-legibility and rich-result asset. It helps machines read who you are and what a page is, and it qualifies you for the result formats AI answers can pull from. Those are real jobs worth doing. Winning the citation is not one of them. The useful move is to ship the handful of schema types your content genuinely is, confirm they match the visible page, and then redirect the hours vendors want you to spend on "AI schema" toward the content and mentions that actually move the metric.

What the only causal study found

Most "schema helps AI" claims rest on correlation. Someone analyzes a few million cited pages, finds that cited pages carry more JSON-LD than uncited ones, and reports a multiple. Ahrefs found exactly that in a 6-million-URL scan: AI-cited pages were almost three times more likely to have JSON-LD than pages that were not cited. Read alone, that number sells a lot of schema audits.

Then Ahrefs ran the test a correlation cannot stand in for: a causal one. They tracked 1,885 pages that added JSON-LD schema between August 2025 and March 2026, matched them against 4,000 control pages, and measured the citation change across Google AI Overviews, AI Mode, and ChatGPT. Adding schema produced no meaningful uplift on any platform. AI Mode and ChatGPT moved by amounts small enough to be random noise across thousands of URLs. AI Overviews actually showed a small decline of 4.6% on treated pages relative to controls, statistically significant but too small to pin confidently on the markup itself. The summary line from the study: not much really changed.

Google's engineers have said the same thing in plainer language. At a Search Live event in mid-2025, Gary Illyes told attendees that to get content into AI Overviews you "simply use normal SEO practices. You don't need GEO, LLMO or anything else," as reported by Search Engine Roundtable. Two independent signals, one from the platform and one from a controlled test, point the same way: the markup is not the lever.

Why the correlation fools smart people

The 3x figure is real. The conclusion drawn from it is the problem. Schema markup tends to live on better-maintained, more technically sophisticated sites. Those same sites also publish stronger content and earn more authority and links over time. AI retrieval systems favor that kind of page for every reason except the markup. So cited pages over-index on schema and on a dozen other signals at the same time. Strip the JSON-LD out and the rest of the signals very likely still carry the page to the citation. Schema is a marker of a site that does everything else right, which is not the same as the cause of the result.
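The selection effect is easy to reproduce in a toy simulation. The sketch below does not use the Ahrefs data; every probability in it is an invented assumption chosen only to show the mechanism. A hidden site-quality score drives both schema adoption and citation, while schema itself has no effect on citation at all.

```python
import random

def simulate(n_pages=200_000, seed=7):
    """Toy model: hidden site quality drives both schema adoption and citation.

    All probabilities are hypothetical. Schema has NO causal effect here.
    """
    rng = random.Random(seed)
    # has_schema -> [page count, cited page count]
    counts = {True: [0, 0], False: [0, 0]}
    for _ in range(n_pages):
        quality = rng.random()  # hidden confounder in [0, 1)
        # Better-maintained sites are more likely to carry schema.
        has_schema = rng.random() < 0.1 + 0.8 * quality
        # Citation depends on quality only; has_schema is never consulted.
        cited = rng.random() < 0.02 + 0.2 * quality ** 2
        counts[has_schema][0] += 1
        counts[has_schema][1] += cited
    rate_with = counts[True][1] / counts[True][0]
    rate_without = counts[False][1] / counts[False][0]
    return rate_with, rate_without

rate_with, rate_without = simulate()
ratio = rate_with / rate_without
print(f"cited with schema: {rate_with:.3f}, "
      f"without: {rate_without:.3f}, ratio: {ratio:.2f}x")
```

Under these assumed parameters, pages with schema are cited at close to twice the rate of pages without it, even though the citation line never looks at the markup. A correlational scan of this simulated web would report a schema "multiple" that disappears the moment you control for quality.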

There is also a mechanical reason the markup rarely does the work people imagine. When an AI engine retrieves a page to answer a question, most current systems read the rendered, visible content, not the JSON-LD block sitting in the head. SALT.agency's test on AI Mode framed it cleanly: Google's search infrastructure can parse structured data, but the language-model layer reduces a page, schema and HTML alike, into a stream of tokens, and it draws from that tokenized text rather than from neat structured nodes. Models like ChatGPT and Claude do not read your FAQPage block as a tidy question-answer pair. They read the words on the page. If the answer is not in the visible text, the schema does not rescue it.
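To make the distinction concrete, here is a minimal sketch of what a text-first reader ends up with. It is an assumption about the general shape of such a pipeline, not a description of any specific engine's parser: it keeps text nodes and drops the contents of script and style tags, which is where JSON-LD lives. The sample HTML is hypothetical.

```python
from html.parser import HTMLParser

class VisibleText(HTMLParser):
    """Collect text nodes, skipping <script> and <style> contents."""
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that sits outside skipped tags.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())

def visible_text(html):
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)

# Hypothetical page: the refund answer exists only inside the FAQPage markup.
SAMPLE = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "FAQPage",
 "mainEntity": [{"@type": "Question", "name": "Do you offer refunds?",
   "acceptedAnswer": {"@type": "Answer", "text": "Yes, within 30 days."}}]}
</script></head>
<body><h1>Pricing</h1><p>Plans start at $49 per month.</p></body></html>"""

text = visible_text(SAMPLE)
print(text)
```

The printed text is "Pricing Plans start at $49 per month." The refund answer never reaches the reader, because it was only ever in the markup.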

One more thing worth saying plainly: there is no AIPage type, no LLMOptimized property, no "AI Overview metadata" extension in schema.org. If a vendor is selling you "AI schema markup," ask which @type it uses and check it against Google's structured data gallery. It will be a standard type with a new name on the invoice.
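If you want to run that check on a vendor's deliverable yourself, a rough sketch follows. The function names are hypothetical, and KNOWN_TYPES is a hand-picked set of the types this article names plus a few common nested ones, not Google's full gallery, so extend it before relying on the result.

```python
# Assumption: a hand-maintained allowlist, not an official or complete list.
KNOWN_TYPES = {
    "Organization", "Article", "BlogPosting", "BreadcrumbList", "Product",
    "LocalBusiness", "FAQPage", "HowTo", "VideoObject",
    "Person", "Offer", "Question", "Answer", "ListItem",
}

def collect_types(node):
    """Recursively gather every @type value in a parsed JSON-LD structure."""
    found = set()
    if isinstance(node, dict):
        declared = node.get("@type")
        if isinstance(declared, str):
            found.add(declared)
        elif isinstance(declared, list):  # @type may be a list of names
            found.update(t for t in declared if isinstance(t, str))
        for value in node.values():
            found |= collect_types(value)
    elif isinstance(node, list):
        for item in node:
            found |= collect_types(item)
    return found

def unknown_types(jsonld):
    """Return, sorted, any @type values that are not in the allowlist."""
    return sorted(collect_types(jsonld) - KNOWN_TYPES)

# Hypothetical vendor payload using a type that does not exist in schema.org.
vendor_payload = {
    "@context": "https://schema.org",
    "@type": "AIPage",
    "publisher": {"@type": "Organization", "name": "Example Co"},
}
print(unknown_types(vendor_payload))
```

Anything this prints is a type to look up on schema.org before paying for it.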

What schema is actually for

Structured data earns its place for three reasons, none of which is "tricks the model into citing you."

The first is rich-result eligibility. Product cards, video previews, recipe cards, and organization knowledge panels are surfaces AI Overviews can pull into an answer, and those surfaces depend on the underlying page carrying the right schema. A VideoObject does not get your text cited, but it can get your video shown inside an Overview. That is a real surface you only qualify for with markup.

The second is entity disambiguation. Organization schema with sameAs links pointing to your Wikidata entry and your verified profiles elsewhere helps Google identify your brand consistently and connect it across the web. Article schema with a named author, a datePublished, and a dateModified makes it unambiguous who wrote a page and when. For a B2B company that shares a name with three other firms, or whose founder shares a name with a more famous person, that clarity is the difference between being recognized as a distinct entity and being merged into someone else's.

The third is confirmation. In controlled tests, pages with schema and matching visible content allowed cleaner, more accurate extraction than identical pages without it. Schema acts as a highlighter on content that is already there. This is also where the platform confirmations land. Search Engine Land reported that Google's Search team said in April 2025 that structured data gives an advantage in search results, and that Microsoft's Fabrice Canel confirmed in March 2025 that schema helps Bing's language models understand content for Copilot, which also feeds ChatGPT's search results. For those two surfaces, schema is confirmed infrastructure. It improves how cleanly your content is read. It does not decide whether you are read.

The stack worth shipping, and the rule that governs it

You can implement everything that matters here in an afternoon. The types that earn their keep for most B2B sites are Organization with sameAs links, Article or BlogPosting with author and both date fields, BreadcrumbList for hierarchy, and Product or LocalBusiness where the page genuinely describes one. Add FAQPage or HowTo only when the page truly is a set of questions or a procedure, and know that Google has narrowed FAQ rich results so most pages no longer earn the visual treatment even with valid markup.
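As a rough sketch of the core of that stack, the Python below builds Organization and Article JSON-LD and wraps each in a script tag. The company name, URLs, Wikidata ID, author, and dates are placeholders, and the helper names are hypothetical; the property names (sameAs, author, datePublished, dateModified) are standard schema.org vocabulary.

```python
import json

def organization(name, url, same_as):
    """Organization node with sameAs links for entity disambiguation."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "sameAs": list(same_as),
    }

def article(headline, author_name, published, modified, publisher):
    """Article node with a named author and both date fields."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": published,
        "dateModified": modified,
        "publisher": {
            "@type": "Organization",
            "name": publisher["name"],
            "url": publisher["url"],
        },
    }

def to_script_tag(data):
    """Serialize to a JSON-LD script tag, escaping '</' so the tag cannot be closed early."""
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'

# Placeholder values only; replace with your real entity data.
org = organization(
    "Example Co",
    "https://www.example.com",
    ["https://www.wikidata.org/wiki/Q0000000",
     "https://www.linkedin.com/company/example-co"],
)
post = article("Example headline", "Jane Doe", "2025-01-15", "2025-03-02", org)

print(to_script_tag(org))
print(to_script_tag(post))
```

Every value passed in should also appear on the visible page: the headline as the page title, the author in the byline, the dates where a reader can see them.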

The rule that governs all of it is non-negotiable: schema must match the visible content. If your Product markup lists a price the page does not show, Google can issue a manual action for hidden or mismatched structured data, and an AI engine reading the page may surface the wrong number under your name. Markup that contradicts the page is worse than no markup. Validate what you ship with Google's Rich Results Test, fix the missing required fields it flags, and move on. This is hygiene, not a project. Treating it as a recurring "AI optimization" line item is how agencies bill for the same JSON-LD twice.
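The matching rule can also be spot-checked in code. This is a minimal sketch, assuming you already have the page's visible text as a string; the function name and sample values are hypothetical, and it checks only a Product's name and offer price. It complements Google's Rich Results Test and does not replace it.

```python
def product_mismatches(jsonld, visible_text):
    """Return Product fields whose marked-up value is absent from the visible text."""
    problems = []
    name = jsonld.get("name")
    if name and name.lower() not in visible_text.lower():
        problems.append(f"name {name!r} not found on page")
    offers = jsonld.get("offers") or {}
    if isinstance(offers, list):  # schema.org allows a single offer or a list
        offers = offers[0] if offers else {}
    price = offers.get("price")
    # Naive substring match: "49.00" in markup will not match "$49" on the page.
    if price is not None and str(price) not in visible_text:
        problems.append(f"price {price!r} not found on page")
    return problems

markup = {
    "@type": "Product",
    "name": "Starter Plan",
    "offers": {"@type": "Offer", "price": "49.00", "priceCurrency": "USD"},
}
page_ok = "Starter Plan: $49.00 per month, billed annually."
page_bad = "Starter Plan: $59.00 per month, billed annually."

print(product_mismatches(markup, page_ok))
print(product_mismatches(markup, page_bad))
```

The first call prints an empty list; the second reports that the price 49.00 is missing from the page. The substring comparison is deliberately simple, so expect to normalize currency formatting before trusting it on real pages.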

For a fuller picture of how citation actually works underneath the markup, see the companion pieces on how AI engines decide what to cite and on AI Overview SEO. They cover the part schema cannot.

Where the citation actually comes from

If the markup is not the lever, what is? The visible content and the signals around it. AI engines cite the page that most clearly and credibly answers the question, and they lean heavily on what other sites say about you. In Ahrefs' work on AI visibility, branded web mentions correlated with AI Overview visibility at roughly 0.664, more than three times the 0.218 for backlinks. That gap is the whole strategy in one number. The page has to answer the question in plain, extractable language, and the wider web has to corroborate that you are a credible source of that answer.

That is the Demand Engine in practice, and it is why we treat schema as a fifteen-minute hygiene gate rather than a strategy. Get the entity legible so machines know who you are. Get the page answering one real buyer question in language a model can lift. Then spend the real effort on being the source other people name. When I ran marketing for a 4x Inc. 5000 company from startup to exit, the pattern held long before AI search existed: the technical hygiene was table stakes, and the compounding came from being the credible answer everyone else pointed to. A growth system measures that on one number, the cited answers that turn into booked sales conversations, not on schema coverage.

So ship the four or five schema types your content actually is, confirm they match the page, and validate them once. Then close the structured-data tab. The SEO and AEO checklist covers the legibility basics, and if you want the visible content, entity signals, and off-site mentions run as one coordinated system rather than a pile of tactics, that is what the full growth system is built to do. The markup is the easy part. The answer is the work.

Written by

Joseph Perkins

Founder of Perkins Growth Systems

Joseph Perkins is the founder of Perkins Growth Systems. He builds connected growth systems for B2B by combining real-world growth strategy with demand capture, signal-based outreach, follow-up, reporting, and CRM workflows.
