SEO · 26 February 2026 · Koen Bol

How Google Discover really works: research reveals the nine steps behind your feed

An independent study of Google Discover's internal architecture shows for the first time how content is selected, ranked, and blocked. What does this mean for publishers?

Google Discover · Content · Technical

For most publishers, Google Discover is a black box. You publish something, and it either gets picked up or it doesn’t. Sometimes it works fantastically, sometimes your traffic disappears without any clear cause. Google itself tells you little more than to make “high-quality content” and use good images.

But there is now far less guesswork involved. An independent researcher analysed the Discover app via SDK telemetry and reconstructed the nine steps your content goes through before it appears in a feed - or doesn’t. Search Engine Land wrote about it too. The technical details are in the original research by Metehan Yesilyurt.

Here is the core of what the research reveals.

Nine steps, not one decision

Discover doesn’t work as one big model that decides what you see. It is a pipeline of nine consecutive steps, where content can drop out at each stage:

  1. Content ingestion: your content is crawled and entities are recognised via the Knowledge Graph
  2. Open Graph tag parsing: six specific OG tags are parsed
  3. Content classification: content is categorised into cluster types
  4. Publisher blocking: a binary check - is this publisher allowed through at all?
  5. User interest matching: is your content matched to the user’s interests?
  6. pCTR ranking: a prediction model estimates the probability of a click
  7. Feed assembly: results are placed into a hierarchical structure
  8. Delivery: content is pushed to the device via various technical channels
  9. Feedback loop: user behaviour (clicks, swipes, saves) is fed back into the system

Those are nine moments where your content can be stopped. And the order is not arbitrary: some steps decide earlier than you might think.
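The drop-out behaviour can be sketched as a list of checks that run in order and stop at the first failure. This is a minimal illustration, not Google's implementation: it covers only three of the nine steps, and the `Candidate` class, the stage functions, and the example domains and interests are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set, Tuple


@dataclass
class Candidate:
    """A hypothetical article record moving through the pipeline."""
    url: str
    domain: str
    og_image: Optional[str] = None
    topics: Set[str] = field(default_factory=set)


# A stage returns True to pass the candidate on and False to drop it.
Stage = Tuple[str, Callable[[Candidate], bool]]

# Hypothetical example data, not taken from the research.
BLOCKED_DOMAINS = {"blocked.example"}
USER_INTERESTS = {"cycling", "seo"}

STAGES: List[Stage] = [
    ("open_graph_parsing", lambda c: c.og_image is not None),
    ("publisher_blocking", lambda c: c.domain not in BLOCKED_DOMAINS),
    ("interest_matching", lambda c: bool(c.topics & USER_INTERESTS)),
]


def run_pipeline(candidate: Candidate, stages: List[Stage]) -> Optional[str]:
    """Return the name of the stage that dropped the candidate, or None."""
    for name, check in stages:
        if not check(candidate):
            return name  # later stages never see this candidate
    return None


article = Candidate(
    url="https://blocked.example/story",
    domain="blocked.example",
    og_image="https://blocked.example/story.jpg",
    topics={"seo"},
)
print(run_pipeline(article, STAGES))  # publisher_blocking
```

The article has a valid image and matches the user's interests, yet it is dropped at the blocking check, so interest matching never runs for it.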

The most underestimated step: publisher-level blocking

This is the part of the research you really need to remember.

Step four, publisher blocking, takes place before interest matching and ranking. That means: if users have actively blocked your domain, or if your domain is suppressed as a whole, your content never reaches the ranking phase. There is no point thinking about titles or images. You are filtered out earlier.

What makes it even more frustrating: there is nothing comparable on the other side. There is no “boost collection” flag. Suppression works; boosting does not. The system is asymmetric.

And this is cumulative. If enough users swipe away a single article, that can affect the entire domain. There are three layers of exclusion: an initial swipe, a filter status, and a permanent block. Once in that third layer, that URL never comes back for that user.
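One way to picture the three layers is as an ordered state per user and URL, where the last state is terminal. The sketch below is an assumption-heavy illustration: the research names the layers, but the rule that each negative signal moves a URL up exactly one layer is invented here, as are all the names.

```python
from enum import IntEnum


class Exclusion(IntEnum):
    """Exclusion layers for one URL and one user, in escalating order."""
    NONE = 0
    SWIPED = 1     # initial swipe
    FILTERED = 2   # filter status
    BLOCKED = 3    # permanent block


def escalate(state: Exclusion) -> Exclusion:
    """Move one layer up (an assumed rule); BLOCKED is terminal."""
    return Exclusion(min(state + 1, Exclusion.BLOCKED))


def is_permanently_excluded(state: Exclusion) -> bool:
    """True once the URL can no longer come back for this user."""
    return state == Exclusion.BLOCKED


state = Exclusion.NONE
for _ in range(5):  # five negative signals in a row
    state = escalate(state)
print(state.name)  # BLOCKED
```

Note that the model has no transition downwards, which mirrors the asymmetry described above.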

What Open Graph actually does in Discover

Of all the meta information you can provide, Discover picks exactly six:

  • og:image
  • og:title
  • og:site_name
  • og:locale
  • og:image:secure_url
  • article:content_tier

Missing og:image? No card will be rendered. Without it your content simply does not exist in Discover.

Images need to be at least 1200 pixels wide for a prominent hero card. Smaller images result in a thumbnail display with demonstrably fewer clicks. The 1200-pixel threshold is not a recommendation - it is the hard cut-off between hero card and thumbnail.

There is also a fallback chain: if OG tags are missing, Discover will try to use Twitter Card tags. But you should not rely on this. It is an emergency exit, not a strategy.
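To audit a page against this list, a small script can collect the meta tags and report which of the six are present. The sketch below uses only Python's standard library. The mapping from OG tags to Twitter Card tags is an assumption (the research describes a fallback but this article does not spell out which tags map to which), and `audit_card_tags` and `is_hero_eligible` are hypothetical helper names.

```python
from html.parser import HTMLParser
from typing import Dict, Optional

# The six tags listed above.
DISCOVER_OG_TAGS = (
    "og:image", "og:title", "og:site_name",
    "og:locale", "og:image:secure_url", "article:content_tier",
)

# Assumed fallback mapping; the exact chain is not documented here.
TWITTER_FALLBACK = {"og:image": "twitter:image", "og:title": "twitter:title"}


class MetaCollector(HTMLParser):
    """Collects <meta property=...> and <meta name=...> values."""

    def __init__(self) -> None:
        super().__init__()
        self.meta: Dict[str, str] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        key = attributes.get("property") or attributes.get("name")
        content = attributes.get("content")
        if key and content:
            # Keep the first occurrence of each tag.
            self.meta.setdefault(key.lower(), content)


def audit_card_tags(html: str) -> Dict[str, Optional[str]]:
    """Return each of the six tags with its value, or None if missing."""
    collector = MetaCollector()
    collector.feed(html)
    result: Dict[str, Optional[str]] = {}
    for tag in DISCOVER_OG_TAGS:
        value = collector.meta.get(tag)
        if value is None and tag in TWITTER_FALLBACK:
            value = collector.meta.get(TWITTER_FALLBACK[tag])
        result[tag] = value
    return result


def is_hero_eligible(image_width_px: int) -> bool:
    """Apply the 1200-pixel width threshold for a hero card."""
    return image_width_px >= 1200
```

Calling `audit_card_tags(page_html)` on your own templates quickly shows whether `og:image` is missing or only covered by a Twitter Card tag. The image width has to come from the image file itself, for example from your CMS or an image library.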

How ranking works: pCTR and og:title

The actual ranking runs on a model that predicts the expected click probability: the predicted click-through rate, or pCTR. This model runs server-side, so you cannot measure or reverse-engineer it from the outside.

What the research does show: the payload sent to the server before the ranking decision contains the og:title and image metrics. That is not proof that your title is literally scored, but it is the strongest indication yet that your title and image directly factor into ranking.

In practice: write titles that align with genuine interest, rather than chasing clicks. The system optimises on predicted click behaviour, combined with a feedback loop on dwell time and engagement after the fact.

Freshness: a precise system

Discover does not have a vague preference for recent content. There are three weighted buckets, followed by a continuous decline:

  • 1–7 days: strongest boost
  • 8–14 days: moderate visibility
  • 15–30 days: limited visibility
  • 30+ days: continuous decline

Evergreen content receives a separate classification and falls outside this ageing logic. If you publish content that is clearly timeless - think a guide, an explanation, or a comparison - it is worth making that clear through structure and signals.
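The buckets translate directly into a lookup you can run over your own archive. This is a sketch under stated assumptions: content published less than a day ago is counted in the first bucket, an age of exactly 30 days is counted in the 15–30 bucket, and the `evergreen` flag stands in for a classification whose actual signals are not described here. The function name is hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def freshness_bucket(
    published: datetime,
    now: Optional[datetime] = None,
    evergreen: bool = False,
) -> str:
    """Map an article's age onto the freshness buckets listed above."""
    if evergreen:
        return "evergreen"  # falls outside the ageing logic
    now = now or datetime.now(timezone.utc)
    age_days = (now - published).days
    if age_days < 0:
        raise ValueError("published date lies in the future")
    if age_days <= 7:   # assumption: day 0 counts as the first bucket
        return "strongest boost"
    if age_days <= 14:
        return "moderate visibility"
    if age_days <= 30:  # assumption: day 30 belongs to this bucket
        return "limited visibility"
    return "continuous decline"


today = datetime(2026, 2, 26, tzinfo=timezone.utc)
published = datetime(2026, 2, 16, tzinfo=timezone.utc)
print(freshness_bucket(published, now=today))  # moderate visibility
```

Both datetimes must be timezone-aware (or both naive), otherwise Python refuses to subtract them.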

150 simultaneous experiments

One detail that explains why Discover can feel so unpredictable: the system runs approximately 150 parallel A/B experiments at any given moment, plus more than 50 feature flags steering behaviour across 15 categories.

Two users with nearly identical interests can therefore see a completely different feed, purely because they are in different experiment groups. That makes Discover extra difficult to debug. If your traffic fluctuates, it might be because of your content - but it could just as easily be the rollout of an experiment.
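Why two similar users end up with different feeds is easier to see with a generic A/B bucketing sketch. The code below uses deterministic hashing, a common technique for assigning users to experiment arms. It is not taken from the research and says nothing about how Google actually assigns users; the function names, the experiment names, and the two-arm split are all assumptions.

```python
import hashlib
from typing import Dict, Iterable


def experiment_arm(user_id: str, experiment: str, arms: int = 2) -> int:
    """Deterministically assign a user to one arm of an experiment."""
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % arms


def experiment_profile(user_id: str, experiments: Iterable[str]) -> Dict[str, int]:
    """Return the arm a user falls into for every running experiment."""
    return {name: experiment_arm(user_id, name) for name in experiments}


# 150 hypothetical experiments, matching the count in the research.
experiments = [f"exp_{i}" for i in range(150)]
profile_a = experiment_profile("user-a", experiments)
profile_b = experiment_profile("user-b", experiments)
differing = sum(profile_a[name] != profile_b[name] for name in experiments)
print(f"{differing} of {len(experiments)} experiments differ between the two users")
```

With this many independent assignments, two users almost never share the same combination of arms, even when their interests are identical.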

Two meta tags that completely exclude your content

Finally, something almost nobody knows: there are two specific meta tags that completely block Discover inclusion:

<meta name="robots" content="nopagereadaloud">
<meta name="robots" content="notranslate">

Using either of these tags? Then your content can never reach Discover. Check this for your site, especially if you use themes or CMS plugins that may add these automatically.
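A quick way to run that check is to scan a page's HTML for the two values. The sketch below uses Python's standard-library parser. It looks at meta tags named `robots`, as shown above, and also at meta tags named `google`, because Google's documentation describes these values under that name as well; treat the second name as an assumption relative to the research. The helper names are hypothetical.

```python
from html.parser import HTMLParser
from typing import List, Set

BLOCKING_DIRECTIVES = {"nopagereadaloud", "notranslate"}
# "robots" follows the snippet above; "google" is an added assumption.
CHECKED_META_NAMES = {"robots", "google"}


class DirectiveFinder(HTMLParser):
    """Records blocking directives found in relevant meta tags."""

    def __init__(self) -> None:
        super().__init__()
        self.found: Set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name not in CHECKED_META_NAMES:
            return
        # A content attribute may hold several comma-separated values.
        for directive in (attributes.get("content") or "").lower().split(","):
            directive = directive.strip()
            if directive in BLOCKING_DIRECTIVES:
                self.found.add(directive)


def find_blocking_directives(html: str) -> List[str]:
    """Return the blocking directives present in the page, sorted."""
    finder = DirectiveFinder()
    finder.feed(html)
    return sorted(finder.found)


page = '<head><meta name="robots" content="noarchive, notranslate"></head>'
print(find_blocking_directives(page))  # ['notranslate']
```

Run it against the rendered HTML of a few templates rather than your source files, since plugins often inject these tags at render time.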

What does this mean for you as a publisher?

This research explains why the well-known advice holds up - and what specifically lies behind it.

Your og:image is not a nice-to-have. Without an og:image, no card is rendered and your content does not exist in Discover. Without an image at least 1200 pixels wide, you lose the prominent hero card and fall back to a thumbnail.

Write titles that earn clicks, not demand them. The pCTR model optimises on predicted click behaviour, but the feedback loop corrects for real engagement. A misleading title generates clicks in the short term, but undermines the long-term position of your domain.

Protect your domain reputation. Publisher blocking works asymmetrically and cumulatively. Content that performs poorly drags the rest down with it. That argues for quality control, not just volume.

Publish on a consistent schedule. Discover strongly favours fresh content: content that is one day old has a much better chance than content that is three weeks old. Plan your publishing schedule around that.

Not everything is in your hands. One hundred and fifty simultaneous experiments means that part of your Discover variation is outside your control. Focus on the factors you can actually influence.


Want to know how your site scores on these technical requirements, or how your Discover strategy is holding up? Get in touch with us - we’re happy to take a concrete look together.