GEO: what the original research says about visibility in AI answers

In November 2023, a paper appeared that, for the first time, formally described what it takes to be visible in AI-generated answers. The authors called the approach Generative Engine Optimization, or GEO. The paper, “GEO: Generative Engine Optimization”, has since become one of the most cited works in this new field.

We break it down here. Not the abstract, but the actual research: what the authors test, how they measure it, and what the results mean for content creators.

Why this research matters

Traditional search engines give you a list of links. You click, or you don’t. Your visibility is determined by your position in that list.

Generative search engines work differently. An AI model generates an answer and cites sources within it: sometimes prominently, sometimes barely, sometimes not at all. You can be the first organic result in Google and still be completely ignored by an AI that prefers to cite someone else.

That raises a question the authors of this paper set out to answer: what can you do to your content to be cited more often and more prominently?

GEO-bench: how the experiment works

The researchers build a dataset of 10,000 queries, GEO-bench, drawn from nine existing sources. These range from anonymised Bing and Google queries (MS MARCO, ORCAS-1, Natural Questions) to complex reasoning queries from Oxford University and trending queries from Perplexity.ai.

The queries are categorised by:

Type: factual, opinion, comparison
Domain: 25 categories (Law & Government, Health, Science, Business, Arts, Games, etc.)
Intent: research, purchase, entertainment
Difficulty: simple to complex

For each query, they select relevant web pages. These are provided as sources to a generative model, which then produces an answer. They then measure how prominently each source is cited in that answer.

How they measure ‘visibility’

This is one of the most interesting parts of the paper, because here lies a fundamental difference from SEO.

In traditional SEO, visibility is tied to a single rank: you are in position 1, or you are lower. In GEO, visibility is a matter of degree: your content can make up 30% or 5% of the generated answer.

The researchers introduce two metrics:

Position-Adjusted Word Count counts how many words from your source appear in the answer, weighted by position. A citation early in the answer counts more than one at the end, based on research into how users scan longer texts.

Subjective Impression uses a language model to evaluate seven subjective factors, including how relevant the citation is, how much the answer depends on it, how unique the material is, and how likely a user is to click through.

The baseline (unoptimised content in its original state) scores an average of 19.5 on the Position-Adjusted Word Count.
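
To make the first metric concrete, here is a minimal Python sketch of a position-adjusted word count. It rests on several assumptions that are not stated in this article: the answer is already split into sentences, each sentence is tagged with the sources it cites, words in a sentence with several citations are shared equally, and the position weight decays exponentially with the sentence index. The function name and input format are hypothetical.

```python
import math


def position_adjusted_word_count(answer_sentences, source_id):
    """Return the position-weighted share of the answer credited to source_id.

    answer_sentences is a list of (sentence_text, cited_source_ids) pairs,
    in the order the sentences appear in the generated answer.
    """
    n_sentences = len(answer_sentences)
    total_words = sum(len(text.split()) for text, _ in answer_sentences)
    if total_words == 0:
        return 0.0

    weighted_words = 0.0
    for position, (text, cited) in enumerate(answer_sentences):
        if source_id not in cited:
            continue
        # Assumption: words in a sentence are split equally among its citations.
        words = len(text.split()) / len(cited)
        # Assumption: exponential decay, so earlier sentences weigh more.
        weighted_words += words * math.exp(-position / n_sentences)
    return weighted_words / total_words


# Hypothetical three-sentence answer citing two sources.
answer = [
    ("Solar panels convert sunlight directly into electricity.", {"site_a"}),
    ("Prices have dropped a lot in recent years.", {"site_b"}),
    ("Many households now recover the cost over time.", {"site_a", "site_b"}),
]
for source in ("site_a", "site_b"):
    print(source, round(position_adjusted_word_count(answer, source), 3))
```

With equal word counts, a source cited in the first sentence scores higher than one cited in the last, which is the behaviour the metric is meant to capture.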

The nine strategies, tested

The researchers test nine ways to optimise a web page, and measure what happens to visibility:

Strategy	What it involves	Improvement
Quotation Addition	Add relevant quotes from authoritative sources	+41%
Statistics Addition	Replace qualitative claims with quantitative data	+33%
Fluency Optimization	Improve flow and readability	+29%
Cite Sources	Add references to reliable sources	+28%
Technical Terms	Add technical terminology	+18%
Easy-to-Understand	Simplify language for a broader audience	+14%
Authoritative	Write in a more persuasive, authoritative tone	+12%
Unique Words	Use more varied vocabulary	+6%
Keyword Stuffing	Add more query-relevant keywords	-9%

The last point deserves extra attention. Keyword stuffing, a classic SEO technique, makes you less visible in AI answers. Not neutral: actively worse. Generative models understand text contextually and are not misled by keyword density.
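
To show what such percentages mean in absolute terms, the sketch below converts a relative improvement into an implied score and back. It assumes that the percentages in the table are relative changes against the 19.5 baseline on Position-Adjusted Word Count; the table itself does not say which metric they refer to, so the printed scores are illustrative only. Both function names are hypothetical.

```python
def relative_improvement(baseline: float, optimised: float) -> float:
    """Percentage change of an optimised score against the baseline score."""
    return (optimised - baseline) / baseline * 100.0


def implied_score(baseline: float, improvement_pct: float) -> float:
    """Score implied by a relative improvement over the baseline."""
    return baseline * (1.0 + improvement_pct / 100.0)


BASELINE = 19.5  # baseline Position-Adjusted Word Count quoted in the article

# Assumption: the table percentages are relative to BASELINE on this metric.
for strategy, pct in [("Quotation Addition", 41), ("Keyword Stuffing", -9)]:
    score = implied_score(BASELINE, pct)
    print(f"{strategy}: {pct:+d}% -> implied score {score:.1f}")
```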

What works best and why

The top three (quotations, statistics, and fluent writing) have one thing in common: they increase the credibility and information density of your content. An AI model generating an answer selects sources that strengthen it. Content with concrete figures and verifiable claims is more useful for that purpose than vague, qualitative statements.

Fluency Optimization scores surprisingly high in the ranking. The researchers' explanation is that language models can process and paraphrase text more easily when the reasoning is logical and the sentence structure is clear. Poorly written text is cited less, not because the model fails to understand it, but because it is harder to integrate into an answer.

Keyword Stuffing stands out because it does the exact opposite of what the other strategies do: it lowers the quality and information density of the text. And low quality equals low visibility.

Domain-specific patterns

One general strategy does not work equally well everywhere. The researchers find clear patterns per domain:

Quotation Addition works best for People & Society, History, and explanatory queries
Statistics Addition is most effective for Law & Government, debate and opinion queries
Cite Sources has the greatest effect on factual queries and legal content
Authoritative performs above average for debate queries and historical content
Fluency Optimization works particularly well in Business and Science

The conclusion: knowing which domain your content falls into allows for more targeted optimisation than a generic approach.
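
One way to put these patterns to work is a simple lookup table. The sketch below encodes only the pairings listed above; the labels are lower-cased versions of the article's wording, and the names `STRATEGY_FITS` and `strategies_for` are hypothetical. It is a starting point for prioritising, not a rule derived from the paper.

```python
# Pairings copied from the list above; keys are strategies, values are the
# domains or query types where the article says the strategy works well.
STRATEGY_FITS = {
    "Quotation Addition": {"people & society", "history", "explanatory"},
    "Statistics Addition": {"law & government", "debate", "opinion"},
    "Cite Sources": {"factual", "legal"},
    "Authoritative": {"debate", "historical"},
    "Fluency Optimization": {"business", "science"},
}


def strategies_for(*labels):
    """Return the strategies whose listed domains or query types match any label."""
    wanted = {label.lower() for label in labels}
    return [name for name, fits in STRATEGY_FITS.items() if fits & wanted]


print(strategies_for("Debate"))
print(strategies_for("Science", "Factual"))
```

A label that appears in none of the pairings returns an empty list, which is a cue to fall back on the strategies that performed well on average.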

The democratising effect

This is the most striking result in the paper, and the least expected.

The researchers look at what happens when all websites in the experiment apply the same GEO strategy. Who gains the most? Not the website already at the top of Google, but the lower-ranked websites, with position 5 gaining by far the most.

Google position	Change in AI visibility
Position 1	-30.3%
Position 2	+2.5%
Position 3	+20.4%
Position 4	+15.5%
Position 5	+115.1%

The position-1 website loses almost a third of its visibility in AI answers. The position-5 website more than doubles its visibility.

That sounds counter-intuitive, but the explanation is simple. Traditional search engines reward authority: whoever has the most backlinks and has been at the top for a long time stays there. Generative search engines read your content and assess whether it gives the best answer to the question. A smaller player with stronger, better-written content can beat a large competitor that has held position 1 for years.

For businesses that have never achieved top positions in Google, this matters. AI search reshuffles the playing field based on content quality, not on who has been around longest.

How GEO compares to traditional SEO

Aspect	Traditional SEO	GEO
Visibility model	All or nothing per position	Gradual, from 0% to X% of the answer
Ranking factors	Backlinks, authority, technical	Content quality, information density, source citation
Metric	Average ranking	Word count, position in answer, relevance
Impact of keyword stuffing	Historically positive	Actively negative

GEO does not replace SEO. A generative search engine gets its sources from somewhere, and those sources are largely determined by what is findable. Traditional SEO remains relevant for that first selection step. GEO determines what happens to those sources next.

What this means for your content strategy

The practical translation of this research is more concrete than you might expect:

Replace qualitative claims with quantitative data. “Many companies use X” becomes “67% of Fortune 500 companies use X (Gartner, 2024).” That is precisely what Statistics Addition does, and it delivers an average of 33% more visibility.

Add direct quotes from authoritative sources. Not as decoration, but as a substantive part of the text. Quotation Addition is the best-performing strategy in the research.

Write for people, not search engines. Fluency Optimization (clearer, more fluent writing) outperforms Technical Terms and Unique Words. The irony is that optimising for human readability turns out to also mean optimising for AI citation.

Reference your sources. If you make a claim, link to the source. Cite Sources performs an average of 28% better than unoptimised content, and it also strengthens credibility for human readers.

Stop keyword stuffing. It no longer works for traditional SEO, and it actively harms your visibility in AI search results.

This paper is the theoretical foundation of the field. Much of what has been researched since builds on it. Want to know how your content scores on the GEO criteria that proved most effective in this research? Get in touch with us. We are happy to take a concrete look at your situation.
