Sprout SEO
Version 4.4.6 Bug fixes

4.4.6 – Crawl by the Rules

June 17, 2026

Bug fixes

  • Fixed: The robots.txt checker could report a URL as ALLOWED while Google Search Console showed it as blocked by robots.txt. The parser now follows the way Google reads robots.txt (RFC 9309): consecutive User-agent lines separated only by ignored directives (such as Crawl-delay) stay in one group, so a trailing Disallow applies to every crawler named in that group, including *. Thanks to Kevin Vandeplassche for flagging this on LinkedIn.
  • Fixed: Rules that target query strings (e.g. Disallow: /*?, /*sid=, /*.php?) were never matched because we only checked the path. Matching now includes the query string, same as Google.
  • Fixed: Wildcard (*) and end-of-URL ($) patterns and longest-match precedence were not applied correctly. Conflicting Allow and Disallow rules now resolve the way Google documents: the longest matching rule wins; on a tie, Allow beats Disallow.
  • Fixed: Googlebot-specific groups were ignored when a generic * group existed. A dedicated Googlebot block can now override *, and an empty Googlebot group (crawl-delay only) correctly means crawlable even when * disallows everything.
  • Fixed: A missing robots.txt (HTTP 404) was shown as “NOT FOUND” instead of being treated as no restrictions. 404 and other 4xx responses (except 429) now correctly mean crawling is allowed; 429 and 5xx show an unavailable warning, matching Googlebot behavior.
  • Fixed: Switching tabs or URLs quickly could leave the robots.txt status showing the previous page. A race guard ensures only the latest fetch updates the UI.
  • Fixed: Network and unexpected errors were lumped together as “NOT FOUND”. CORS/DNS failures are now distinguished from server errors, and the copy-robots button no longer serves stale content from another page after an error.
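
For readers who want to see how the grouping and matching rules in the fixes above fit together, here is a minimal Python sketch. It is not Sprout SEO's code: the function names (parse_groups, rules_for, is_allowed) are hypothetical, user-agent matching is simplified to an exact lowercase token, and only Allow and Disallow are treated as rules. It illustrates shared User-agent groups, the Googlebot-over-* fallback, query-string matching, * and $ patterns, and longest-match precedence with Allow winning ties.

```python
import re
from urllib.parse import urlsplit


def parse_groups(text):
    """Split robots.txt text into (agents, rules) groups.

    Consecutive User-agent lines share one group. Lines that are not
    Allow/Disallow (for example Crawl-delay) do not close the agent list.
    """
    groups = []
    agents, rules = [], []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        key = key.lower()
        if key == "user-agent":
            if rules:  # a rule was already seen, so this starts a new group
                groups.append((agents, rules))
                agents, rules = [], []
            agents.append(value.lower())
        elif key in ("allow", "disallow") and agents:
            rules.append((key, value))
    if agents:
        groups.append((agents, rules))
    return groups


def rules_for(groups, agent="googlebot"):
    """Merge the rules of groups naming the agent; fall back to * if none do.

    Simplification: agents are compared as exact lowercase tokens.
    """
    for name in (agent, "*"):
        matched = [rules for agents, rules in groups if name in agents]
        if matched:  # an empty dedicated group still overrides *
            return [rule for rules in matched for rule in rules]
    return []


def pattern_to_regex(pattern):
    """Translate a pattern: * is any run of characters, a trailing $ anchors the end."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(rules, url):
    """Apply longest-match precedence to path plus query; Allow wins ties."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    best = ("allow", "")  # no matching rule means allowed
    for kind, pattern in rules:
        if not pattern:  # an empty value matches nothing
            continue
        if pattern_to_regex(pattern).match(target):
            longer = len(pattern) > len(best[1])
            tie_for_allow = len(pattern) == len(best[1]) and kind == "allow"
            if longer or tie_for_allow:
                best = (kind, pattern)
    return best[0] == "allow"


ROBOTS = """\
User-agent: Googlebot
Crawl-delay: 5
User-agent: *
Disallow: /*?
Allow: /search
"""

rules = rules_for(parse_groups(ROBOTS))
print(is_allowed(rules, "https://example.com/search?q=1"))  # True: Allow /search is the longer match
print(is_allowed(rules, "https://example.com/cart?sid=9"))  # False: only Disallow /*? matches
```

In this sketch the Crawl-delay line between the two User-agent lines does not split the group, so both Googlebot and * receive the same Allow and Disallow rules.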

Improvements

  • Clearer status details — Blocked/allowed subtitles now show the HTTP status, which user-agent group matched (Googlebot, with fallback to *), and the specific rule that decided the verdict.
  • Rules preview — The matched group’s Allow and Disallow lines are both shown, so an allowed path no longer looks blocked in the preview. Values are safely escaped in the UI.
  • Sitemaps — Relative or malformed Sitemap: entries are no longer silently dropped. They appear with an orange warning and an RFC 9309 reference; valid absolute URLs still link through as before.
  • AI bots section — Uses the same parsed robots.txt as the main checker (one parse, consistent results across the Page Info tab).
  • Translations — New and corrected robots.txt status strings across all seven locales, including previously missing “not found” / “not accessible” labels and proper BLOCKED/ALLOWED wording in Spanish and Dutch.
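
As an illustration of the Sitemaps item above, the following Python sketch separates absolute Sitemap: entries from relative or malformed ones. The function name classify_sitemap_entries is hypothetical, and the check used here (an http or https scheme plus a host) is an assumption about what counts as a valid absolute URL, not a description of the extension's own validation.

```python
from urllib.parse import urlsplit


def classify_sitemap_entries(robots_text):
    """Return (valid, flagged) lists of Sitemap: values found in robots.txt text."""
    valid, flagged = [], []
    for raw in robots_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        key, sep, value = line.partition(":")
        if not sep or key.strip().lower() != "sitemap":
            continue
        value = value.strip()
        try:
            parts = urlsplit(value)
            absolute = parts.scheme in ("http", "https") and bool(parts.netloc)
        except ValueError:  # urlsplit rejects some malformed hosts
            absolute = False
        (valid if absolute else flagged).append(value)
    return valid, flagged
```

Entries in the flagged list are the ones a checker would surface with a warning rather than drop.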