How to deal with index bloat

Marcus Tandler

Marcus highlights the importance of managing the content that you want Google to discover - and getting rid of the content that doesn't cut the mustard.

@mediadonis  
From the SEO in 2022 series.

Marcus says: "Dealing with index bloat is very actionable. A preventive Panda diet is a common recommendation for websites that have gained a lot of fat throughout the years and are now struggling to increase visibility in Google. A large number of irrelevant pages is the equivalent of empty yoghurt containers being kept and stored because you might still need them in the future. Don't get carried away with this kind of digital compulsive hoarding. Instead, keep your website fresh, tidy, and clean.

You should always ask yourself three simple questions for all your pages: Do I need this page for my users? Does this page also need to be indexed by Google? And if yes, what should this page ideally rank for? Of course, you don't need to delete pages that are not relevant for ranking purposes but still have value for your users. Users can still navigate to these pages, but you should be more rigorous about what gets indexed.

For example, you might have a faceted navigation for an online shop for shoes, and the targeted keyword is 'Adidas Superstar'. These trainers are available in ten different colours and ten different sizes, so an interested user can navigate to any variation of these 100 different pages to put them in their virtual shopping basket. However, you don't necessarily want to let those 100 product variations get indexed - only the ones which are explicitly being searched for.

You might end up with a catch-all 'Adidas Superstar' page that displays the trainers in the best-selling colour and size combination. If there's a sufficient number of Google searches for a specific combination, such as 'black Adidas Superstars', it also makes sense to grant this option its very own indexable page. This way, you can ensure a good ratio of relevant to irrelevant pages, reduce the possibility of cannibalization, and therefore stay safe from Google's Panda.
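As an editorial illustration of the rule described above (not part of Marcus's remarks), the indexing decision for faceted pages can be sketched as a simple threshold check. The URLs, search volumes, and the `min_searches` threshold are all made-up assumptions, and a `noindex` robots directive is only one possible way to keep a variation out of the index.

```python
from dataclasses import dataclass


@dataclass
class Variation:
    url: str               # hypothetical URL of the faceted page
    query: str             # search query the page would target
    monthly_searches: int  # illustrative figure, not real data


def robots_directive(variation, min_searches=100):
    """Return a robots meta value: index only variations people search for."""
    if variation.monthly_searches >= min_searches:
        return "index, follow"
    # Users can still reach the page, but it stays out of the index.
    return "noindex, follow"


variations = [
    Variation("/adidas-superstar", "adidas superstar", 5000),
    Variation("/adidas-superstar/black", "black adidas superstars", 900),
    Variation("/adidas-superstar/green/size-47", "green adidas superstar size 47", 0),
]

directives = {v.url: robots_directive(v) for v in variations}
for url, directive in directives.items():
    print(url, "->", directive)
```

In this sketch the catch-all page and the black variation stay indexable, while the rarely searched size and colour combination remains available to users only.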

If your website is already full of these empty yoghurt containers, it's advisable to preventively take out the trash. Ideally, use the status code 410 Gone instead of the more common 404, since Google revisits 404 pages a couple more times, while with a 410 status code, Google will only revisit the page one more time to make sure it is really gone. Of course, you will also need to remove any internal links and/or references to these 410 pages.
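As an editorial illustration (not part of Marcus's remarks), here is a minimal sketch of answering removed URLs with 410 Gone, written as a plain WSGI application. The paths in `REMOVED_PATHS` are hypothetical, and a real site would more likely configure this in its web server or CMS.

```python
from http import HTTPStatus

# Hypothetical list of pages that were deliberately removed.
REMOVED_PATHS = {"/old-product", "/tag/misc"}


def app(environ, start_response):
    """Minimal WSGI app: 410 Gone for removed pages, 200 OK otherwise."""
    path = environ.get("PATH_INFO", "/")
    if path in REMOVED_PATHS:
        status = HTTPStatus.GONE  # 410: the page is gone on purpose
        body = b"This page has been permanently removed."
    else:
        status = HTTPStatus.OK
        body = b"Regular page content."
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response(f"{status.value} {status.phrase}", headers)
    return [body]
```

To try it locally, the app can be served with the standard library, for example `wsgiref.simple_server.make_server("", 8000, app).serve_forever()`.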

If you have amassed loads of these empty yoghurt containers, which now return a 410, and you want to get them out of the index as fast as possible, you can consider using a negative sitemap, which contains only pages that are 410. This way, you can remove large numbers of irrelevant pages from Google's index very effectively."
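A negative sitemap is an ordinary XML sitemap that happens to list only removed URLs. As a hedged sketch, assuming the standard sitemaps.org format and hypothetical example.com URLs, it could be generated like this:

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Build a sitemap document listing the given URLs, one <url> entry each."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url  # special characters are escaped
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical pages that now answer with 410 Gone.
gone_urls = [
    "https://example.com/old-product",
    "https://example.com/tag/misc",
]
negative_sitemap = build_sitemap(gone_urls)
print(negative_sitemap)
```

The resulting file would be submitted like any other sitemap and taken down again once the listed pages have dropped out of the index.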

How does an SEO find these empty yoghurt containers on their website and define what is useful for users, useful for Google, and a good page to rank for?

"Start by taking Google Search Console data and filter for all pages with clicks=0 within the 12 months timeframe. Then you take this list to Google Analytics and see if those pages are getting traffic besides from organic search. Ideally, as a third step, you should also compare this list with Googlebot behaviour using LockFile data to see if Google is still regularly crawling these pages. Since the scheduler which sends off Googlebot to fetch pages prioritises by importance, you can get a good sense of whether Google is finding any of these pages still important - and therefore relevant. If any of these pages receive no organic traffic, or any other natural or paid traffic, and if Google isn't even crawling these anymore, you can just get rid of them to thin out your website easily."

What is a good ratio of useful pages to these irrelevant pages?

"This always depends on your website. There isn't an ideal ratio you should aim for. If something's relevant - it's relevant. It's about sending out everything which is not relevant. Just let the data talk."

How do you define what a user is likely to like? Is it based upon user behaviour, or do you have to have a group of people analyse that page to ensure it's a relevant and appropriate part of the buying cycle?

"What you're referring to is the long click, and this is Google's main goal. They want somebody to click through to your page and stay on your site - not just be a one-hit-wonder on one page. They want you to surf through to your other pages on your site and not go back to the SERP.

A search completion tells Google that the page really fulfilled the intent. But you've got to start much earlier, with the pages that Google promotes in the top ten. To see if users find a page relevant, and before you can even evaluate the long click behaviour, you need to understand if they are clicking through to your site - this is the first sign of relevance. You might think you have the perfect page, and you're getting into the top ten, but if nobody's clicking through to your site, it is a clear indication there might be a different intent here. Your content may be great, but it's just not great content for this intent.

You need to be thinking of how you can make the best possible page to fulfil this intent. The most important aspect of modern SEO is not just aspiring to rank in the top ten but being the best possible result for that specific query."

What is the most common cause of index bloat?

"The worst is always online shops. They have all these products going in and out of stock, multiple categories, and tag pages. Just using an out-of-the-box CMS might create index bloat problems by introducing categories, as well as tags. And this can happen even to a blog or any site."

You mentioned having a negative sitemap for 410s, but also removing all links to these pages. If you're struggling to remove so many different links from your website, is it enough to redirect the links to something else?

"If I have an out-of-stock product, the standard recommendation is to redirect users to the corresponding category, so they still have an inventory to choose from. I don't think this will help you in the long run because it's just not as relevant as other online shops with the product in stock. It really won't help you with ranking per se. Instead, I'd just get rid of it.

This is exactly the empty yoghurt container problem. You're keeping something because you might need it in the future. It's better to have a crisp, clean, and compact site. The negative sitemap speeds everything up and gives Google an indication that this is everything you want out. If you keep these links, Google might be inclined to go to these pages, so I'd remove all references."

If you're training other marketing professionals on the value of dealing with index bloat, would you typically use your empty yoghurt containers analogy?

"I'm always explaining it this way. We tend to keep a lot of stuff on our websites because we think we might still need it. I think SEOs have created this problem. Fifteen years ago, if somebody came to us and said, my site is 1000 pages, what should I do? A lot of SEOs (me included) would have recommended making it 100,000 pages for every possible keyword combination. That's what you needed back then because Google wasn't that good.

Now that Google is much smarter, you don't need this anymore. You are also bleeding your link juice because you've distributed it across so many pages. These days, it's much more advisable to have a more compact structure. Also, Google had a different objective back then. If you remember, Google was always promoting how many billion pages it had indexed because it was in a rat race with Microsoft and Yahoo.

It's not about the quantity anymore. With social media, they simply can't keep up with the volume of URLs created every day. So now the focus is on quality - and this is why Panda came out."

Suppose an SEO is involved in the original design of a website. Is it better for blog posts and product pages just to have a single category and for any tags associated with each of those pages to be completely non-indexable by Google?

"I would always opt for one indexable structure with categories, but then I would opt out of tags or a structure by dates, such as months. If you also use these things to maximise the internal links to all of your pages, you basically end up linking all over the place, and you don't direct Google to what's really important for you.

As an SEO, I always want to be in control as much as possible. I don't want Google to have to do lots of heavy lifting. I want a structure that tells them what I'm good at and where I provide the most value to the users."

What's one thing an SEO needs to stop doing to spend more time focusing on index bloat?

"Focusing a lot of the time on link building - especially if you are already a big brand with lots of links, and Google already likes you. Any new links are unlikely to have maximum benefit if you have a sub-optimal structure. You don't build a house in the swamp - you need a solid foundation. So always fix the structure before spending a lot of money and time on acquiring new links. With a great structure, these links will have so much more power."

You can find Marcus Tandler over at Ryte.com.


