Google Deduplication Of Top Stories From Web Search Results – Search Engine Roundtable


We know Google deduplicates results when it displays a featured snippet: the same URL may be removed from the main web results. In some cases, Google may do the same when a URL is listed in the top stories section, dropping that result from the main web results.
Update: Here is how deduping works for top stories:


Just to cap off with the further clarification I promised, we deduplicate a link from web results if a link appears as the first link in Top Stories and if the Top Stories box appears before web results. If it comes after, we don't. And again, it's something we're reviewing.
The rest of the original story continues below, but the update is in the tweet above.
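To make the rule in Danny's update concrete, here is a minimal Python sketch of it. This is only the stated rule written out, not Google's code: the `Serp` class, its field names, the `dedupe_web_results` function and the example.com URLs are all hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Serp:
    """Hypothetical model of a results page (not a real Google structure)."""
    top_stories: List[str]        # URLs in the Top Stories box, in order
    web_results: List[str]        # URLs in the regular web results, in order
    top_stories_before_web: bool  # True if the box sits above the web results


def dedupe_web_results(serp: Serp) -> List[str]:
    """Apply the rule as Danny described it: drop a URL from the web results
    only if it is the first link in Top Stories and the Top Stories box
    appears before the web results."""
    if not serp.top_stories or not serp.top_stories_before_web:
        # No box, or the box comes after the web results: nothing is removed.
        return list(serp.web_results)
    first_story = serp.top_stories[0]
    return [url for url in serp.web_results if url != first_story]


# Made-up example: only the first Top Stories link is removed.
page = Serp(
    top_stories=["https://example.com/a", "https://example.com/b"],
    web_results=["https://example.com/a", "https://example.com/c", "https://example.com/b"],
    top_stories_before_web=True,
)
print(dedupe_web_results(page))
```

Under this reading, the second Top Stories link stays in the web results, which would fit the examples where a URL shows up in both places.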
Danny Sullivan of Google said this on Twitter after a complaint from The Verge. Danny said “it’s showing in Top Stories, it is getting deduplicated from the rest of the page. Deduplication can often be useful. Doing this search in the way that user might by using solution-seeking terms rather than unusual terms in the headline, there you are at the top in Top Stories plus deduplicating means there’s more variety from other publications. In searches like that, our systems also are going to generally seek to show the most helpful, reliable info they can. That’s why you don’t see a lot of duplicates of your article showing. Duplicates certainly exist, but it isn’t that helpful to show them. That leads to headline-oriented searches. As I said before, that’s super common among authors. I used to do it all the time, myself. But headline searches typically contain a lot of terms, so our systems shift to return pages that have those terms. This means authors are more likely to find duplicates, even though for typical searches that readers would do, these are unlikely to appear. But our deduplication feature may still kick in even for these, as was happening in this case. As I said, deduplication can be helpful. But we also understand the concern this might be raising. We’ve been doing this with Top Stories since last May, but we’re going to revisit this to see if we should continue or perhaps make other changes. Also, I’m still checking, but I believe this deduplication is especially unique in that it only happens with Top Stories if there’s a single story shown or perhaps only for the very first story shown.”
So you can see, Danny is giving this as an excuse for why other publications are in the web results and not The Verge for that query. But as you can also see, he seems to explain that sometimes it does not work this way.


Deduplication can often be useful. Doing this search in the way that user might by using solution-seeking terms rather than unusual terms in the headline, there you are at the top in Top Stories plus deduplicating means there’s more variety from other publications…. pic.twitter.com/638IAZLWIV
The interesting part is it does seem that Google is indeed deduping here when it comes to top stories.
The funny thing is that, hours before this, news SEO Barry Adams did a whole Twitter thread on this:


This is not the case. It can be easily disproven; look for recent articles published on a major news outlet (BBC, The Guardian, NYT, etc). Search for relevant keywords relating to that article.

Chances are, you’ll see the article in the news box as well as on the regular SERP. pic.twitter.com/3u7HH7jppq


So a newly published article has had very little opportunity to rank in regular SERPs. It may show in Top Stories boxes, but ranking in regular SERPs could take days or weeks or even months.

Plus, authoritative publishers have an edge here, because, well, they’re authoritative.


In summary; there is no ‘news’ filter on Google SERPs. Articles can and do show up in both Top Stories and regular SERPs.

But often this doesn’t happen due to issues relating to speed, intent, and competition.

/end
So I pointed out Danny's tweet to Barry Adams, and here is his response:


Colour me shocked, Danny being wrong on something. 🤪
So I did some sample searches, and Google does seem to remove a recent story from this site from top stories and show it in the web results, while at the same time removing the stories that are in top stories from the web results. Notice that SER is not in top stories here but is in the web results, while SEL and SEJ are in top stories but not in the web results:
[Screenshot of the search results]
But yes, the story does show in the news tab:
[Screenshot of the news tab results]
But sometimes Google does not deduplicate these top stories. A query for [the hidden resignation] shows the Business Insider story twice in some browsers (in top stories and again in the web results) and only once in others (thanks to Glenn Gabe for the query and for discussing it with me):
Deduped: [screenshot]
Not Deduped: [screenshot]
Here is another example from Glenn:


Another good example of a url showing up in Top Stories that also ranks in the 10 blue links. But from what Danny explained yesterday, Google’s deduplication system for Top Stories is nuanced. So it’s not going to happen all of the time (which makes sense based on other queries). pic.twitter.com/ywiwiQnEfw
So it is not clear when and why Google might deduplicate a URL from showing in web search when it also shows in top stories. It might be a timing issue or something else.
Danny Sullivan did say “Also, I’m still checking, but I believe this deduplication is especially unique in that it only happens with Top Stories if there’s a single story shown or perhaps only for the very first story shown.”


We hope to get some clarity soon.
Forum discussion at Twitter.
The content at the Search Engine Roundtable are the sole opinion of the authors and in no way reflect views of RustyBrick ®, Inc
Copyright © 1994-2022 RustyBrick ®, Inc. Web Development All Rights Reserved.
This work by Search Engine Roundtable is licensed under a Creative Commons Attribution 3.0 United States License. Creative Commons License and YouTube videos under YouTube’s ToS.
