SEO has always had a bit of a split personality. On one side, you’ve got the spreadsheets, ranking factors, backlinks, and audits. On the other, you’ve got this almost philosophical advice: “Add value. Be original. Say something new.”
For years, that second part sounded nice, but vague. Then Google filed the “Contextual estimation of link information gain” patent, and suddenly that advice got a lot more concrete.
This patent tries to measure originality, and if you’re creating content for search, that changes the game.
What is the Information Gain Patent?
At its core, the “Contextual estimation of link information gain” patent describes a system that asks one deceptively simple question:
“If a user has already consumed some documents on a topic, how much new information will they gain from the next one?”
Information gain is novelty in context. Not relevance, authority, or freshness. According to the patent, the system tracks:
- What documents a user has already seen (or heard, via Assistant).
- What semantic information those documents contain.
- How much overlap exists between previously consumed documents and new ones.
It then assigns an information gain score to each new candidate document. The more new information a document adds, the higher its score.
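To make "novelty in context" tangible, here is a toy sketch in Python. It is not the patent's method: the patent describes a machine-learning model, while this version simply counts how many of a candidate's distinct words the user has not seen yet. The function names `terms` and `novelty_score` are made up for illustration.

```python
import re

def terms(text: str) -> set[str]:
    """Lowercase word tokens: a crude stand-in for 'semantic information'."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def novelty_score(candidate: str, seen_docs: list[str]) -> float:
    """Fraction of the candidate's distinct terms absent from everything already seen.

    Assumption for illustration only: word overlap approximates information overlap.
    """
    candidate_terms = terms(candidate)
    if not candidate_terms:
        return 0.0
    seen_terms = set().union(*(terms(doc) for doc in seen_docs))
    return len(candidate_terms - seen_terms) / len(candidate_terms)

seen = ["reset your router by unplugging it for ten seconds"]
repeat = "unplugging your router for ten seconds"
fresh = "update the firmware and change the wifi channel"

print(novelty_score(repeat, seen))  # 0.0: every word was already seen
print(novelty_score(fresh, seen))   # 1.0: nothing overlaps
```

The same page gets a different score depending on what came before it, which is the whole point: the score belongs to the document plus the user's history, not to the document alone.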
Why Information Gain Matters for SEO
Classic SEO ranking models assume a mostly static world:
- User searches a query
- Google ranks documents
- User clicks one
- Session ends
The Information Gain model assumes something much more realistic:
- Users consume multiple results
- They refine queries
- They return to SERPs
- They ask follow-ups
- They get bored when results repeat themselves
Google wants to optimize entire information journeys, not just first clicks.
That means ranking as result #1 isn’t enough: being different from result #1 matters, and redundancy becomes a liability.
How Information Gain Is Calculated
The patent outlines a machine-learning approach that works roughly like this:
- Documents are converted into semantic representations
  - Vector embeddings
  - Bag-of-words or learned representations (think Word2Vec-style models)
- Previously viewed documents are grouped
  - A “first set” of documents already consumed by the user
- New candidate documents are evaluated
  - A “second set” of documents not yet viewed
- A model compares semantic overlap
  - If document B mostly repeats document A – low information gain
  - If document B introduces new entities, causes, steps, perspectives – higher information gain
- Documents are re-ranked dynamically
  - Rankings can change after a user clicks something
  - Documents can be demoted, promoted, or even excluded if they add nothing new
In other words, Google is ranking what the user hasn’t learned yet. And that’s beautiful.
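The steps above can be sketched in a few lines of Python. Treat this as a rough illustration under stated assumptions, not Google's implementation: it uses plain bag-of-words vectors and cosine similarity where the patent describes a trained model, and the names `bow`, `information_gain`, and `rerank` are hypothetical.

```python
import math
import re
from collections import Counter

def bow(text: str) -> Counter:
    """Bag-of-words vector: term -> count."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def information_gain(candidate: str, first_set: list[str]) -> float:
    """Hypothetical proxy: 1 minus the highest similarity to any consumed document."""
    if not first_set:
        return 1.0
    cand = bow(candidate)
    return 1.0 - max(cosine(cand, bow(doc)) for doc in first_set)

def rerank(second_set: dict[str, str], first_set: list[str]) -> list[str]:
    """Order not-yet-viewed documents by descending proxy score."""
    return sorted(
        second_set,
        key=lambda name: information_gain(second_set[name], first_set),
        reverse=True,
    )

# "First set": what the user has already read.
first_set = ["how to brew coffee with a french press"]
# "Second set": candidates not yet viewed.
second_set = {
    "repeat": "brew coffee with a french press step by step",
    "new": "why grind size changes extraction and bitterness",
}

print(rerank(second_set, first_set))  # ['new', 'repeat']
```

To mimic the dynamic part, move a clicked document from `second_set` into `first_set` and call `rerank` again: anything that mostly repeats the clicked page slides down.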

Focus on Assistants
One of the most revealing parts of the patent is how heavily it focuses on automated assistants and text-to-speech.
Why? Because spoken answers are linear: you can’t skim audio. So Google is strongly motivated to avoid repeating information, strip redundancy, and deliver maximum insight per second.
The patent explicitly describes:
- Removing already-heard information from later answers.
- Extracting only novel sections from new documents.
- Shortening dialog sessions by prioritizing high information gain.
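Here is a minimal sketch of what "extracting only novel sections" could look like, assuming sentences are the unit and word overlap (Jaccard similarity) decides what counts as already heard. The patent does not specify this mechanism; the function `novel_sentences` and the 0.5 threshold are illustrative choices.

```python
import re

def words(text: str) -> set[str]:
    """Lowercase word tokens of a sentence."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a: set[str], b: set[str]) -> float:
    """Shared words divided by all words across both sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def novel_sentences(document: str, already_heard: list[str], threshold: float = 0.5) -> list[str]:
    """Keep sentences whose overlap with every heard sentence stays below the threshold."""
    heard = [words(s) for s in already_heard]
    kept = []
    # Split after sentence-ending punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", document.strip()):
        w = words(sentence)
        if w and all(jaccard(w, h) < threshold for h in heard):
            kept.append(sentence)
    return kept

heard = ["Sourdough needs a starter made from flour and water."]
doc = (
    "Sourdough needs a starter made from flour and water. "
    "Feed the starter daily at room temperature."
)

print(novel_sentences(doc, heard))
# ['Feed the starter daily at room temperature.']
```

For a spoken answer, only the second sentence would be read aloud, which is exactly the "maximum insight per second" idea.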
Here, information gain is operational rather than theoretical, and what works for Assistant tends to bleed into Search sooner or later.
The SEO Takeaways
Let’s translate patent language into action.
1. “Comprehensive” Is No Longer Enough
If your article is just a cleaner version of the top 5 results, a remix of common steps, or the same headings, same flow, same examples – you might still rank, but you’re vulnerable.
The moment Google understands that your page adds no new information, your information gain score drops. Real depth now means differentiation.
2. Novelty Can Be Structural, Not Just Factual
You don’t always need brand-new facts. Information gain can come from:
- A new framework
- A better mental model
- A clearer causal explanation
- A different sequencing of ideas
- Combining concepts that are usually explained separately
If everyone answers “what”, answer “why” or “when this fails”.
3. Follow-Up Content Is a Ranking Opportunity
Because rankings can be recalculated after a click, think in sequences:
- Intro article – overview
- Follow-up article – edge cases
- Advanced article – tradeoffs and limitations
If your content is positioned as “Here’s what you haven’t seen yet”, you align perfectly with information gain logic.
4. Redundancy Is Now an SEO Risk
Historically, repeating key phrases and steps felt “safe.”
In an information gain world, repetition without expansion has low marginal value.
Similar subheadings across articles create semantic overlap. Last but not least, copycat content carries a real demotion risk.
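If you want a quick way to spot this on your own site, a small audit script can help. The sketch below is one possible approach, not anything described in the patent: it compares the subheadings of every pair of articles and flags pairs that share too many. The article names, the helper functions, and the 0.5 threshold are all made up for illustration.

```python
from itertools import combinations

def normalize(heading: str) -> str:
    """Lowercase and collapse whitespace so trivial variations match."""
    return " ".join(heading.lower().split())

def heading_overlap(a: list[str], b: list[str]) -> float:
    """Share of distinct headings two articles have in common (Jaccard)."""
    set_a = {normalize(h) for h in a}
    set_b = {normalize(h) for h in b}
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0

def flag_redundant(articles: dict[str, list[str]], threshold: float = 0.5):
    """Return (article, article, score) for every pair at or above the threshold."""
    flagged = []
    for (name_a, heads_a), (name_b, heads_b) in combinations(articles.items(), 2):
        score = heading_overlap(heads_a, heads_b)
        if score >= threshold:
            flagged.append((name_a, name_b, score))
    return flagged

articles = {
    "guide-a": ["What Is Link Building", "Why It Matters", "Best Practices"],
    "guide-b": ["What is link building", "Why it matters", "Common Mistakes"],
    "guide-c": ["Edge Cases", "Tradeoffs", "When It Fails"],
}

print(flag_redundant(articles))  # [('guide-a', 'guide-b', 0.5)]
```

Exact heading matches are a blunt signal, so a flagged pair is a prompt to reread both pieces, not proof that one of them will be demoted.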

Quiet Shift Toward Meaning-Based Search
Zooming out, this patent fits into a much bigger world of semantic search, passage ranking, helpful content systems, conversational search, and multi-turn queries.
Google is increasingly asking, “Is this page useful right now, given what the user already knows?”
This change rewards original thinking, clear explanations, honest expertise, and content that genuinely helps users make progress.
Final Thought
The Information Gain patent quietly reframes SEO as something closer to teaching. A good teacher doesn’t repeat the same lesson. Instead, they build on what you already understand.
Information gain is all about moving the user forward, and that’s the kind of optimization worth doing.

