Most SEO problems don’t start with bad content or weak backlinks. They start with pages Google simply refuses to index. The GSC Page Indexing report is where those problems surface — and if you’re not working through it systematically, you’re leaving compounding organic equity on the table.
This checklist maps every status in Google Search Console’s Page Indexing report to a diagnostic action and a fix. Whether you’re running a site audit for a client or doing a quarterly health check on your own property, use this as your operational framework.
What the GSC Page Indexing Report Actually Shows
The Page Indexing report (found under Indexing → Pages in GSC) categorizes every URL Google has encountered into two buckets: indexed and not indexed. Within the not-indexed bucket, Google gives you specific reasons — and those reasons are the entire point.
Each status in this report is a signal from Google’s systems about how it perceives your site’s crawlability, content quality, and information architecture. Treating these statuses as individual bugs to patch, rather than symptoms of broader structural issues, is the fastest way to waste time.
The report is most valuable for sites with 500+ pages. For smaller sites, spot-check key URLs with the URL Inspection tool first — if important pages aren’t appearing in site:yourdomain.com searches, then open the full report.
The Full GSC Page Indexing Checklist
Work through this checklist in order. Crawl and server issues must be diagnosed before content quality problems — there’s no point improving a page Google can’t reach.
1. Search Console Indexing Summary
Start here. Before diving into individual statuses, get the macro picture.
- Open the Pages report and record the total indexed vs. not-indexed count
- Calculate what percentage of your site’s pages are not indexed — if more than 5% of pages you want indexed are excluded, treat this as a site-wide signal, not a page-level problem
- Review the trend graph: a declining indexed count or a rising not-indexed count needs immediate investigation
- Cross-reference the indexed count against a site:yourdomain.com search to validate the numbers
2. Server Errors (5XX)
Priority: Critical. Server errors during Googlebot’s crawl window actively suppress crawl frequency. A spike in 5xx responses causes Googlebot to throttle visits across the entire domain, which cascades into a growing “Discovered — Currently Not Indexed” backlog.
How to diagnose:
- Go to Settings → Crawl Stats → By Response in GSC and look for any nonzero share of 5xx responses
- Check if spikes in average server response time correlate with drops in total crawl requests — this inverse pattern confirms crawl throttling
- Review your server logs for Googlebot-specific 5xx patterns that may not appear in aggregated GSC data
Fix: Resolve hosting or infrastructure issues causing server strain. Even sporadic 5xx errors during Googlebot’s visit windows can trigger aggressive crawl rate reduction.
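If GSC's aggregated data is inconclusive, server logs give the ground truth. Here's a minimal log-scan sketch, assuming a combined-format Apache/Nginx access log; the log path and regex are placeholders to adapt to your setup:

```python
# Sketch: scan an access log for 5xx responses served to Googlebot.
# Assumes the combined log format; adjust LOG_PATH and the regex to your server.
import re
from collections import Counter

LOG_PATH = "/var/log/nginx/access.log"  # hypothetical path
# Combined format: IP - - [time] "METHOD /path HTTP/x" status size "referrer" "UA"
LINE_RE = re.compile(
    r'\S+ \S+ \S+ \[([^\]]+)\] "(?:\S+) (\S+) [^"]*" (\d{3}) \S+ "[^"]*" "([^"]*)"'
)

status_counts = Counter()
with open(LOG_PATH) as log:
    for line in log:
        m = LINE_RE.match(line)
        if not m:
            continue
        timestamp, path, status, user_agent = m.groups()
        # Note: UA strings can be spoofed; verify source IPs separately (see section 9)
        if "Googlebot" in user_agent and status.startswith("5"):
            status_counts[(status, path)] += 1
            print(f"{timestamp}  {status}  {path}")

print(status_counts.most_common(10))
```

Clusters of the same path or the same time window usually point to a specific deploy, cron job, or capacity limit rather than random failures.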
3. Redirect Errors
Redirect errors in the Page Indexing report indicate chains or loops that Googlebot couldn’t resolve. Every redirect hop costs crawl budget. Chains longer than two hops should be resolved to direct 301s.
- Identify all redirect chains using a crawl tool
- Check that sitemaps contain only final destination URLs — not intermediate redirect steps
- Verify no redirect loops exist (URL A → B → A)
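A crawl tool handles this at scale, but for a targeted URL list a short script can trace each chain hop by hop. A sketch using the requests library (the starting URL is a placeholder):

```python
# Sketch: trace redirect chains, flagging chains longer than two hops and loops.
import requests

def trace_chain(url: str, max_hops: int = 10) -> list[str]:
    """Follow redirects manually so every hop is recorded."""
    chain = [url]
    seen = {url}
    for _ in range(max_hops):
        # Some servers reject HEAD; fall back to GET if you see 405s
        resp = requests.head(url, allow_redirects=False, timeout=10)
        if resp.status_code not in (301, 302, 303, 307, 308):
            break
        url = requests.compat.urljoin(url, resp.headers["Location"])
        if url in seen:  # loop: A -> B -> A
            chain.append(url + "  <-- LOOP")
            break
        seen.add(url)
        chain.append(url)
    return chain

for start in ["https://example.com/old-page"]:  # replace with your URL list
    chain = trace_chain(start)
    if len(chain) > 2:
        print(f"{len(chain) - 1} hops: " + " -> ".join(chain))
```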
4. Blocked by Robots.txt
This status means Googlebot tried to access the URL and was blocked by your robots.txt file. This is frequently the result of misconfigured rules during site migrations or CMS updates.
- Verify the live rules at yourdomain.com/robots.txt
- Use the robots.txt report in GSC (Settings → robots.txt) to validate rules against your key URLs
- Check whether any important page templates (product pages, category pages, blog posts) are accidentally disallowed
- Note the separate status: “Indexed though blocked by robots.txt” — this means Google indexed the page despite a block signal, usually because it was linked from an external source. Decide whether to remove the block or apply a noindex
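For a quick programmatic pass over key templates, Python's standard-library parser can test URLs against your live robots.txt. Treat it as a first pass: urllib.robotparser does not implement Google's full wildcard matching, so confirm edge cases in the GSC report. The URLs below are placeholders:

```python
# Sketch: validate representative URLs per template against live robots.txt rules.
from urllib.robotparser import RobotFileParser

rp = RobotFileParser("https://yourdomain.com/robots.txt")
rp.read()

key_urls = [  # one representative URL per important page template
    "https://yourdomain.com/products/sample-product",
    "https://yourdomain.com/category/sample-category",
    "https://yourdomain.com/blog/sample-post",
]
for url in key_urls:
    allowed = rp.can_fetch("Googlebot", url)
    print(("OK     " if allowed else "BLOCKED") + f"  {url}")
```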
5. URL Marked Noindex
A noindex directive is an explicit instruction to Google not to include the page in the index. When this appears on URLs that should be indexed, it’s almost always a CMS setting, plugin configuration, or code deployment error.
- Use the URL Inspection tool to confirm the noindex source (meta robots tag vs. X-Robots-Tag HTTP header)
- Check your CMS’s global settings — WordPress, Shopify, and Wix all have site-wide indexing toggles that can accidentally flip during updates
- Audit your SEO plugin settings for accidental noindex rules applied to taxonomies, archives, or post types
- X-Robots-Tag headers applied server-side are particularly dangerous: they don’t appear in page source and are invisible unless you check HTTP response headers directly in your browser’s DevTools
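A short script can check both sources at once, which matters because the header variant never appears in page source. A sketch with requests (the URL is a placeholder; a production audit would use a proper HTML parser rather than a regex):

```python
# Sketch: detect noindex from both the meta robots tag and the
# X-Robots-Tag response header.
import re
import requests

def noindex_sources(url: str) -> list[str]:
    resp = requests.get(url, timeout=10)
    found = []
    # HTTP header check: server-side directives invisible in the HTML
    header = resp.headers.get("X-Robots-Tag", "")
    if "noindex" in header.lower():
        found.append(f"X-Robots-Tag header: {header}")
    # Meta robots tag check (crude regex for illustration)
    for m in re.finditer(
        r'<meta[^>]+name=["\'](?:robots|googlebot)["\'][^>]*>', resp.text, re.I
    ):
        if "noindex" in m.group(0).lower():
            found.append(f"meta tag: {m.group(0)}")
    return found

print(noindex_sources("https://yourdomain.com/"))  # placeholder URL
```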
6. Soft 404
A soft 404 occurs when a page returns a 200 (OK) HTTP status code but displays no meaningful content — an empty search results page, a “no products found” message, or a template with missing data. Google treats these as poor-quality signals.
- Review each URL in this category and assess whether it contains substantive content
- Pages with no meaningful content should either be 301 redirected to a relevant page or return a proper 404 status code
- For dynamic pages (filtered categories, search results), evaluate whether any should be in the index at all — most filtered URLs have no standalone search value
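To triage a large batch, one rough heuristic is to flag URLs that return 200 with almost no visible text. A sketch where the threshold and URL are arbitrary placeholders; calibrate the cutoff against known-good pages on your own site:

```python
# Sketch: rough soft-404 heuristic -- 200 status but near-empty visible text.
import re
import requests

def visible_text_length(html: str) -> int:
    html = re.sub(r"<(script|style)[^>]*>.*?</\1>", " ", html, flags=re.S | re.I)
    text = re.sub(r"<[^>]+>", " ", html)           # strip remaining tags
    return len(re.sub(r"\s+", " ", text).strip())  # collapse whitespace

for url in ["https://yourdomain.com/search?q=zzzz"]:  # placeholder URL list
    resp = requests.get(url, timeout=10)
    length = visible_text_length(resp.text)
    if resp.status_code == 200 and length < 500:  # arbitrary threshold
        print(f"possible soft 404 ({length} chars of text): {url}")
```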
7. Blocked Due to Unauthorized Request (401)
Pages returning 401 errors require authentication that Googlebot cannot provide. These URLs should not be in your sitemap.
- Remove all gated/authenticated URLs from your XML sitemap
- If the pages should be publicly accessible, investigate the authentication layer (CDN rules, .htaccess, middleware) that’s causing the 401
8. Not Found (404)
404s in the Page Indexing report represent URLs Google has encountered but found to be dead ends. A controlled number of 404s is normal. A spike indicates a structural problem.
- Identify the source of the 404 URLs — are they linked internally, externally, or in a sitemap?
- Remove dead URLs from XML sitemaps immediately
- For important pages that were moved or deleted, implement 301 redirects to the most relevant live equivalent
- Clean up internal links pointing to 404 destinations
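Checking a sitemap for dead entries is easy to script. A sketch assuming a plain urlset sitemap (the sitemap URL is a placeholder); it flags any non-200 status, so it also catches the redirecting entries covered in sections 3 and 17:

```python
# Sketch: fetch an XML sitemap and flag every URL that does not return 200.
import xml.etree.ElementTree as ET
import requests

SITEMAP = "https://yourdomain.com/sitemap.xml"  # placeholder
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

root = ET.fromstring(requests.get(SITEMAP, timeout=10).content)
urls = [loc.text.strip() for loc in root.findall(".//sm:loc", NS)]

for url in urls:
    status = requests.head(url, allow_redirects=False, timeout=10).status_code
    if status != 200:
        # 404s should be removed; 3xx entries should point to final URLs instead
        print(f"{status}  {url}")
```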
9. Blocked Due to Access Forbidden (403)
403 errors indicate the server is actively refusing Googlebot access. Common causes include IP-based blocking rules or misconfigured CDN/firewall settings that mistakenly flag Googlebot as a bot to block.
- Review CDN and firewall rules to confirm Googlebot’s IP ranges are not being blocked
- Check server-level access controls (.htaccess, Nginx config)
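Google publishes Googlebot's IP ranges as a JSON file, so you can verify whether a blocked IP actually belongs to Googlebot before loosening firewall rules (reverse DNS lookup is the other documented verification method). A sketch:

```python
# Sketch: check whether an IP from your firewall/CDN block list falls
# inside Google's published Googlebot IP ranges.
import ipaddress
import requests

RANGES_URL = "https://developers.google.com/search/apis/ipranges/googlebot.json"

prefixes = requests.get(RANGES_URL, timeout=10).json()["prefixes"]
networks = [
    ipaddress.ip_network(p.get("ipv4Prefix") or p.get("ipv6Prefix"))
    for p in prefixes
]

def is_googlebot_ip(ip: str) -> bool:
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks if net.version == addr.version)

print(is_googlebot_ip("66.249.66.1"))  # an IP in a known Googlebot range
```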
10. URL Blocked by Other 4XX Issue
This catch-all covers 4xx errors outside the common 401, 403, and 404 statuses. These are usually application-level errors.
- Use the URL Inspection tool on affected URLs to get the specific response code
- Investigate at the application layer — routing errors, API gateway misconfiguration, and legacy redirect rules are common culprits
11. Blocked by Page Removal Tool
This status means someone used GSC’s URL Removal Tool to temporarily suppress these URLs from search. Removal requests are temporary (Google documents roughly six months), after which pages will reappear in search results unless a permanent signal (noindex or 301) has been implemented.
- Audit all active removal requests in GSC (Removals section)
- Confirm whether each suppression was intentional
- For pages that should permanently stay out of the index, implement noindex or remove the content entirely
12. Crawled — Currently Not Indexed
This is one of the most diagnostically significant statuses. Google visited the page, evaluated it, and decided it wasn’t worth indexing. The decision happened after the crawl, which means the problem is content quality or relevance signals — not crawlability.
Common causes:
- Thin content: Pages with minimal unique information relative to other pages on the site or SERPs
- Duplicate content: Pages with high content overlap with other URLs — Google filters these to improve search quality
- Poor internal linking: Pages with no or few internal links lack the contextual signals Google uses to assess value
- Search intent mismatch: Pages that don’t match any coherent query intent
Fix protocol:
- Identify whether the affected pages serve a genuine search intent that isn’t already satisfied by another URL on your site
- If pages are thin: expand content depth or consolidate with a stronger page via 301
- If duplicate: implement canonical tags pointing to the authoritative version; ensure sitemaps only include canonical URLs
- Strengthen internal linking to affected pages from topically relevant, well-indexed pages
- After fixing: use the URL Inspection tool to request indexing, then track status weekly — expect 2–4 weeks for changes to take effect
If more than 5% of your target pages carry this status, treat it as a topical authority and information architecture problem, not a page-level fix.
13. Discovered — Currently Not Indexed
Here, Google knows the URL exists but hasn’t crawled it. The distinction from “Crawled — Currently Not Indexed” is important: Google chose not to expend crawl budget visiting these pages. That’s a stronger negative signal.
For sites under 10,000 pages, this status often resolves automatically as Google works through its queue — but if it persists for new content, investigate:
- Crawl budget exhaustion: A large volume of low-quality, duplicate, or thin pages is consuming Googlebot’s allowed connections. Sites with 10,000+ pages can lose up to 30% of crawl coverage to URLs that shouldn’t be indexed.
- Poor internal link structure: Google isn’t finding enough entry points to these pages. Internal links from well-indexed, high-authority pages accelerate discovery.
- Content overload: Publishing velocity outpacing crawl budget — reduce content output and prioritize quality over volume
Fix protocol:
- Audit and prune thin, duplicate, or low-value content to reclaim crawl budget
- Improve internal linking from indexed, high-traffic pages to the affected URLs
- Review Crawl Stats for server response time patterns that indicate Googlebot backing off
14. Alternate Page with Proper Canonical Tag
This is not an error. These pages have been intentionally canonicalized to another URL, and Google has accepted that signal. No action is required unless you believe the wrong URL has been designated as canonical.
- Review a sample to confirm the canonical targets are the intended preferred URLs
- If a canonical is pointing to a 404 or redirect, fix the canonical tag
15. Duplicate Without User-Selected Canonical
Google found duplicate or near-duplicate content across multiple URLs, but no canonical tag was implemented to indicate the preferred version. Google is deciding for you — and it may not choose the URL you’d prefer.
- Implement canonical tags on all duplicate URL groups pointing to the single preferred version
- Ensure XML sitemaps include only the canonical URL, not the duplicates
- Review URL parameter handling in GSC if faceted navigation or session IDs are generating duplicate URLs
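To map duplicate groups before writing canonical tags, a script can pull the rel=canonical target from each URL and bucket pages by it, surfacing both shared-canonical clusters and pages with no canonical at all. A sketch (the URL list is a placeholder, and the regex assumes rel appears before href; an HTML parser is more robust):

```python
# Sketch: extract rel=canonical targets and group URLs by canonical.
import re
from collections import defaultdict
import requests

CANONICAL_RE = re.compile(
    r'<link[^>]+rel=["\']canonical["\'][^>]+href=["\']([^"\']+)["\']', re.I
)

groups: dict[str, list[str]] = defaultdict(list)
for url in ["https://yourdomain.com/page?color=red"]:  # placeholder URL list
    m = CANONICAL_RE.search(requests.get(url, timeout=10).text)
    groups[m.group(1) if m else "(no canonical)"].append(url)

for canonical, members in groups.items():
    print(f"{canonical}: {len(members)} page(s)")
```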
16. Duplicate — Google Chose Different Canonical Than User
This is a direct conflict signal: you’ve set a canonical tag, but Google has overridden it and indexed a different URL as authoritative. Google treats canonical tags as hints, not directives.
When Google overrides your canonical, it usually means:
- The page you designated as canonical has weaker signals than the page Google preferred (fewer internal links, lower content quality, less external authority)
- There’s a structural inconsistency — the non-canonical version receives more internal links or external backlinks than the canonical version
Fix: Align all signals toward the intended canonical URL — internal links, sitemap inclusion, external link pointing, and content depth should all favor the canonical version.
17. Page with Redirect
Pages flagged here were crawled and found to redirect to another URL. These shouldn’t appear in your sitemap or be linked internally.
- Remove redirecting URLs from XML sitemaps
- Update all internal links to point directly to the final destination URL
- Audit for redirect chains that can be collapsed into a single hop
18. Indexed, Though Blocked by Robots.txt
As noted above: Google indexed this page despite a robots.txt block, almost always because an external site linked to it. Decide: should this page be indexed?
- If yes: remove the robots.txt disallow rule
- If no: remove the disallow rule and add a noindex directive instead — Googlebot must be able to crawl the page to see the noindex (meta tag or X-Robots-Tag header); a robots.txt block alone is not a reliable way to keep a page out of the index
19. Page Indexed Without Content
Google indexed the page, but found little or no meaningful content during rendering. This can happen when content is loaded via JavaScript that Googlebot fails to render.
- Use the URL Inspection tool’s Live Test to see what Google actually renders — compare the rendered snapshot to what a human sees in a browser
- If JavaScript content isn’t rendering: evaluate server-side rendering (SSR) or pre-rendering as an alternative
- Pages with genuinely empty content should be noindexed or removed
Crawl Stats Review
The Crawl Stats report (Settings → Crawl Stats) provides server-level data that the Page Indexing report doesn’t surface directly. Review:
- Average response time: High response times cause Googlebot to throttle crawl rate, directly creating “Discovered — Currently Not Indexed” backlogs
- Crawl requests by response: Monitor the percentage of 5xx responses — even a small share of server errors during Googlebot visits triggers crawl rate reduction
- Host status: Review the host status panel for DNS, robots.txt, or server connectivity issues
URL Inspection and Render Evaluation
For any URL that appears in a concerning status, validate with the URL Inspection tool before acting:
- Inspect URL → Live Test: Shows what Google currently sees when it crawls the page, including rendered JavaScript output
- Crawled page comparison: Compare the last crawled version (View Crawled Page) to the current live state; discrepancies indicate recent changes that haven’t been re-crawled
- Resource Accessibility: Confirm that all resources (CSS, JavaScript, images) required to render the page are accessible to Googlebot and not blocked by robots.txt
Sample at least 10–20 URLs from each problem status category before drawing conclusions about root cause.
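Sampling by hand gets tedious past a handful of URLs. The Search Console URL Inspection API exposes the same data programmatically; here's a sketch assuming a service account with access to the property and the google-api-python-client package. The credentials file and URLs are placeholders, and quota is limited (roughly 2,000 calls per day per property), so sample rather than sweep:

```python
# Sketch: inspect a URL's index status via the Search Console API.
from google.oauth2 import service_account
from googleapiclient.discovery import build

creds = service_account.Credentials.from_service_account_file(
    "service-account.json",  # hypothetical credentials file
    scopes=["https://www.googleapis.com/auth/webmasters.readonly"],
)
service = build("searchconsole", "v1", credentials=creds)

body = {
    "inspectionUrl": "https://yourdomain.com/some-page",  # placeholder
    "siteUrl": "sc-domain:yourdomain.com",                # your GSC property
}
result = service.urlInspection().index().inspect(body=body).execute()
status = result["inspectionResult"]["indexStatusResult"]
print(status.get("coverageState"), "|", status.get("googleCanonical"))
```

Logging coverageState and googleCanonical for a sample from each problem category makes the root-cause patterns much easier to see than clicking through the UI.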
Security and Manual Actions
Before or alongside indexing work, check:
- Manual Actions (Security & Manual Actions → Manual Actions): Any manual penalty from Google’s webspam team will suppress affected URLs from rankings or remove them entirely. Manual actions require submitting a reconsideration request after the issue is fixed.
- Security Issues (Security & Manual Actions → Security Issues): Hack-based spam injections can flood your index with spammy pages, dilute topical authority, and contaminate your site’s relationship with Google. Check for URL structures you don’t recognize.
For spam and hack detection, also run a manual site:yourdomain.com search and look for URLs that don’t belong to your site’s content structure.
Enhancements: Structured Data
The Enhancements section reports on structured data implementation errors. While structured data doesn’t directly determine whether a page is indexed, errors in schema markup create ambiguity about page purpose and reduce eligibility for rich results.
Review errors for:
- Breadcrumbs: Critical for site architecture legibility in search results
- FAQ schema: Affects rich result eligibility (note: FAQ rich results have reduced display frequency since 2023, but valid schema still supports entity-based optimization)
- Videos: Errors here prevent video carousels from appearing
- Unparsable Structured Data: Malformed JSON-LD that Google cannot interpret — fix syntax errors using Google’s Rich Results Test
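Unparsable JSON-LD is also easy to catch before Google does: extract each ld+json block from a page and attempt to parse it. A sketch (the page URL is a placeholder):

```python
# Sketch: find "Unparsable structured data" candidates by parsing
# every JSON-LD block on a page.
import json
import re
import requests

JSONLD_RE = re.compile(
    r'<script[^>]+type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.S | re.I,
)

html = requests.get("https://yourdomain.com/", timeout=10).text  # placeholder
for i, block in enumerate(JSONLD_RE.findall(html), start=1):
    try:
        data = json.loads(block)
        kind = data.get("@type", "no @type") if isinstance(data, dict) else "list"
        print(f"block {i}: OK ({kind})")
    except json.JSONDecodeError as e:
        print(f"block {i}: UNPARSABLE -- {e}")
```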
Only 17% of the top 10 million websites implement any form of schema markup, which means correct structured data implementation remains a measurable competitive advantage.
Frequently Asked Questions
Q: How often should I review the GSC Page Indexing report? Weekly monitoring is the appropriate cadence for most sites — specifically checking for new errors or spikes in excluded pages. After any significant site change (migration, redesign, CMS update, bulk content publish), check the report within 48–72 hours. Monthly deep audits using this full checklist are appropriate for sites with 1,000+ pages.
Q: What’s the difference between “Crawled — Currently Not Indexed” and “Discovered — Currently Not Indexed”? The distinction is where in the process Google stopped. “Discovered” means Google knows the URL exists but chose not to spend crawl budget visiting it — a stronger signal of perceived low value or crawl budget exhaustion. “Crawled” means Google visited the page, rendered it, evaluated the content, and then decided it wasn’t worth indexing. Both statuses indicate a content quality or site architecture problem, but “Discovered” also requires investigating crawl budget and server performance.
Q: My canonical tags are set correctly, but Google keeps choosing a different canonical. Why? Google treats canonical tags as hints, not mandatory instructions. When Google overrides a canonical, it means the structural signals on your site — internal links, sitemap inclusion, external backlinks — are pointing more strongly toward a different URL than the one you’ve designated. Align all of these signals toward your preferred canonical to increase the likelihood Google accepts it.
Q: Can fixing indexing issues directly improve rankings? Pages that aren’t indexed cannot rank for anything, so resolving indexing errors is a prerequisite for organic visibility, not a direct ranking factor. Fixing indexing issues ensures that pages with strong content and backlink signals can actually appear in search results. Pages that were previously “Crawled — Currently Not Indexed” due to thin content won’t automatically rank well after indexing — content quality and search intent alignment still determine ranking position.
Q: How long does it take for fixed indexing issues to be reflected in GSC? Expect 2–4 weeks after implementing fixes and requesting re-indexing via the URL Inspection tool. Validation through GSC’s “Validate Fix” function typically takes up to two weeks to complete. GSC reporting itself can lag behind actual crawl behavior — always use the URL Inspection tool’s Live Test to get the current real-time status of a specific page, rather than relying solely on the aggregated report counts.
Next Steps
Run this checklist against your site’s current Page Indexing report and prioritize fixes in order of crawl severity: server errors and access blocks first, content quality issues second, canonical conflicts third. If you’re finding large volumes of “Discovered — Currently Not Indexed” pages, that’s the signal to start a full content audit before fixing individual URLs.
For sites dealing with persistent indexing issues despite clean technical fundamentals, the root cause is almost always information architecture — too many thin pages diluting crawl budget, or a site structure that doesn’t signal topical authority clearly enough to earn Googlebot’s attention.