GEO · 12 March 2026 · Hessel Middendorp

GEO: what the original research says about visibility in AI answers

The paper that defined the field of GEO tested nine optimisation strategies across thousands of queries. The results are concrete, sometimes surprising, and relevant to anyone who creates content.

GEO · AI · Research

In November 2023, a paper appeared that formally described for the first time what it takes to be visible in AI-generated answers. The authors called it Generative Engine Optimization, or GEO. The paper “GEO: Generative Engine Optimization” has since become one of the most cited works in this new field.

We break it down here. Not the abstract, but the actual research: what they test, how they measure, and what the results mean for content creators.

Why this research matters

Traditional search engines give you a list of links. You click, or you don’t. Your visibility is determined by your position in that list.

Generative search engines work differently. An AI model generates an answer and cites sources within it: sometimes prominently, sometimes barely, sometimes not at all. You can be the first organic result in Google and still be completely ignored by an AI that prefers to cite someone else.

That raises a question the authors of this paper set out to answer: what can you do to your content to be cited more often and more prominently?

GEO-bench: how the experiment works

The researchers build a dataset of 10,000 queries, drawn from nine existing sources. These range from anonymised Bing and Google queries (MS MARCO, ORCAS-1, Natural Questions) to complex reasoning queries from Oxford University and trending queries from Perplexity.ai.

The queries are categorised by:

  • Type: factual, opinion, comparison
  • Domain: 25 categories (Law & Government, Health, Science, Business, Arts, Games, etc.)
  • Intent: research, purchase, entertainment
  • Difficulty: simple to complex

For each query, they select relevant web pages. These are provided as sources to a generative model, which then produces an answer. They then measure how prominently each source is cited in that answer.

How they measure ‘visibility’

This is one of the most interesting parts of the paper, because here lies a fundamental difference from SEO.

In traditional SEO, visibility is tied to rank: you are in position 1, or you are lower. In GEO, visibility is gradual: your content can make up 30% or 5% of the generated answer.

The researchers introduce two metrics:

Position-Adjusted Word Count counts how many words from your source appear in the answer, weighted by position. A citation early in the answer counts more than one at the end, based on research into how users scan longer texts.

Subjective Impression uses a language model to evaluate seven subjective factors, such as how relevant the citation is, how much the answer depends on it, how unique the material is, and how likely a user is to click through.

The baseline (unoptimised content in its original state) scores an average of 19.5 on the Position-Adjusted Word Count.
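To make the Position-Adjusted Word Count described above more tangible, here is a minimal Python sketch. It rests on several assumptions that are not taken from this article: the `Sentence` type and the `position_adjusted_word_count` function are hypothetical names, the position weight is an exponential decay, a sentence's credit is split evenly among the sources it cites, and the scores are normalised to sum to 100. The exact weighting in the paper may differ.

```python
import math
from dataclasses import dataclass


@dataclass
class Sentence:
    text: str
    source_ids: list[int]  # hypothetical: IDs of the sources this sentence cites


def position_adjusted_word_count(answer: list[Sentence]) -> dict[int, float]:
    """Return each cited source's share of the answer in percent, weighted by position."""
    n = len(answer)
    raw: dict[int, float] = {}
    for pos, sentence in enumerate(answer):
        if not sentence.source_ids:
            continue  # uncited sentences give no credit to any source
        # Assumed weighting: exponential decay, so earlier sentences count more.
        weight = math.exp(-pos / n)
        words = len(sentence.text.split())
        # Assumption: split the credit evenly among all sources cited in the sentence.
        credit = words * weight / len(sentence.source_ids)
        for sid in sentence.source_ids:
            raw[sid] = raw.get(sid, 0.0) + credit
    total = sum(raw.values())
    if total == 0:
        return {}
    # Assumption: normalise so that the scores of all cited sources sum to 100.
    return {sid: 100 * score / total for sid, score in raw.items()}


# Illustrative answer with made-up sentences and two sources.
answer = [
    Sentence("Quotes and statistics raise visibility the most.", [1]),
    Sentence("Fluent writing also helps.", [2]),
    Sentence("Keyword stuffing lowers it.", [1, 2]),
]
scores = position_adjusted_word_count(answer)
```

In this made-up answer, source 1 ends up with the larger share because it is cited first and in the longest sentence, which is the effect the metric is meant to capture.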

The nine strategies, tested

The researchers test nine ways to optimise a web page, and measure what happens to visibility:

Strategy | What it involves | Improvement
Quotation Addition | Add relevant quotes from authoritative sources | +41%
Statistics Addition | Replace qualitative claims with quantitative data | +33%
Fluency Optimization | Improve flow and readability | +29%
Cite Sources | Add references to reliable sources | +28%
Technical Terms | Add technical terminology | +18%
Easy-to-Understand | Simplify language for a broader audience | +14%
Authoritative | Write in a more persuasive, authoritative tone | +12%
Unique Words | Use more varied vocabulary | +6%
Keyword Stuffing | Add more query-relevant keywords | -9%
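As a rough illustration of what these percentages mean in absolute terms, the sketch below applies them to the baseline score of 19.5. This assumes that each percentage is a relative change on the Position-Adjusted Word Count, which the table itself does not state, so the projected scores are illustrative arithmetic rather than figures reported in the paper. The names `BASELINE`, `IMPROVEMENT_PCT` and `projected_score` are hypothetical.

```python
BASELINE = 19.5  # average Position-Adjusted Word Count of unoptimised content

# Relative improvements in percent, copied from the table above.
IMPROVEMENT_PCT = {
    "Quotation Addition": 41,
    "Statistics Addition": 33,
    "Fluency Optimization": 29,
    "Cite Sources": 28,
    "Technical Terms": 18,
    "Easy-to-Understand": 14,
    "Authoritative": 12,
    "Unique Words": 6,
    "Keyword Stuffing": -9,
}


def projected_score(baseline: float, improvement_pct: float) -> float:
    """Apply a relative change in percent to a baseline score."""
    return baseline * (1 + improvement_pct / 100)


# Assumption: every percentage is relative to the same 19.5 baseline.
projected = {
    name: round(projected_score(BASELINE, pct), 1)
    for name, pct in IMPROVEMENT_PCT.items()
}

for name, score in projected.items():
    print(f"{name}: {score}")
```

Under that assumption, only Keyword Stuffing ends up below the unoptimised baseline.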

The last point deserves extra attention. Keyword stuffing, a classic SEO technique, makes you less visible in AI answers. Not neutral: actively worse. Generative models understand text contextually and are not misled by keyword density.

What works best and why

The top three (quotations, statistics, and fluent writing) have one thing in common: they increase the credibility and information density of your content. An AI model generating an answer selects sources that strengthen it. Content with concrete figures and verifiable claims is more useful for that purpose than vague, qualitative statements.

Fluency Optimization ranks surprisingly high. The researchers attribute this to the fact that language models can process and paraphrase text more easily when the reasoning is logical and the sentence structure is clear. Poorly written text is cited less, not because the model fails to understand it, but because it is harder to integrate.

Keyword Stuffing stands out because it does the exact opposite of what the other strategies do: it lowers the quality and information density of the text. And low quality equals low visibility.

Domain-specific patterns

One general strategy does not work equally well everywhere. The researchers find clear patterns per domain:

  • Quotation Addition works best for People & Society, History, and explanatory queries
  • Statistics Addition is most effective for Law & Government, debate and opinion queries
  • Cite Sources has the greatest effect on factual queries and legal content
  • Authoritative performs above average for debate queries and historical content
  • Fluency Optimization works particularly well in Business and Science

The conclusion: knowing which domain your content falls into allows for more targeted optimisation than a generic approach.
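The domain patterns listed above can be captured in a small lookup, sketched here in Python. The `BEST_CONTEXTS` mapping and the `strategies_for` helper are hypothetical names, and the context labels are simply transcribed from the list ("debate and opinion queries" is split into two labels). It covers only the patterns named in this article, not the full results of the paper.

```python
# Contexts in which each strategy stood out, transcribed from the list above.
BEST_CONTEXTS = {
    "Quotation Addition": {"People & Society", "History", "explanatory queries"},
    "Statistics Addition": {"Law & Government", "debate queries", "opinion queries"},
    "Cite Sources": {"factual queries", "legal content"},
    "Authoritative": {"debate queries", "historical content"},
    "Fluency Optimization": {"Business", "Science"},
}


def strategies_for(context: str) -> list[str]:
    """Return the strategies listed as especially effective for a context."""
    return sorted(
        name for name, contexts in BEST_CONTEXTS.items() if context in contexts
    )


print(strategies_for("debate queries"))
print(strategies_for("Business"))
```

An empty result means the list above names no standout strategy for that context. It does not mean that optimisation has no effect there.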

The democratising effect

This is the most striking result in the paper, and the least expected.

The researchers look at what happens when all websites in the experiment apply the same GEO strategy. Who wins the most? Not the website already at the top of Google, but the lower-ranked websites, and position 5 above all.

Google position | Change in AI visibility
Position 1 | -30.3%
Position 2 | +2.5%
Position 3 | +20.4%
Position 4 | +15.5%
Position 5 | +115.1%

The position-1 website loses almost a third of its visibility in AI answers. The position-5 website more than doubles its visibility.

That sounds counter-intuitive, but the explanation is simple. Traditional search engines reward authority: whoever has the most backlinks and has been at the top for a long time stays there. Generative search engines read your content and assess whether it gives the best answer to the question. A smaller player with stronger, better-written content can beat a large competitor that has held position 1 for years.

For businesses that have never achieved top positions in Google, this is relevant. AI search reshuffles the playing field based on content quality, not on who has been around longest.

How GEO compares to traditional SEO

Aspect | Traditional SEO | GEO
Visibility model | All or nothing per position | Gradual, from 0% to X% of the answer
Ranking factors | Backlinks, authority, technical | Content quality, information density, source citation
Metric | Average ranking | Word count, position in answer, relevance
Impact of keyword stuffing | Historically positive | Actively negative

GEO does not replace SEO. A generative search engine gets its sources from somewhere, and those sources are largely determined by what is findable. Traditional SEO remains relevant for that first selection step. GEO determines what happens with it next.

What this means for your content strategy

The practical translation of this research is more concrete than you might expect:

Replace qualitative claims with quantitative data. “Many companies use X” becomes “67% of Fortune 500 companies use X (Gartner, 2024).” That is precisely what Statistics Addition does, and it delivers an average of 33% more visibility.

Add direct quotes from authoritative sources. Not as decoration, but as a substantive part of the text. Quotation Addition is the best-performing strategy in the research.

Write for people, not search engines. Fluency Optimization (clearer, more fluent writing) outperforms Technical Terms and Unique Words. The irony is that optimising for human readability turns out to also mean optimising for AI citation.

Reference your sources. If you make a claim, link to the source. Cite Sources performs an average of 28% better than unoptimised content, and it also strengthens credibility for human readers.

Stop keyword stuffing. If you are still doing it, know that it no longer works for traditional SEO and that it actively harms your visibility in AI search results.


This paper is the theoretical foundation of the field. Much of what has been researched since builds on it. Want to know how your content scores on the GEO criteria that proved most effective in this research? Get in touch with us. We are happy to take a concrete look at your situation.