Is your robots.txt accidentally blocking Google from crawling your pages?
Find syntax errors, check Googlebot access, and discover missing sitemaps.
Type any domain name. We fetch and parse the robots.txt file automatically.
Syntax, user-agent rules, sitemaps, and potential issues are all checked.
See Googlebot access status, issues found, and the raw file content.
Checks every line of your robots.txt for correct syntax and formatting
See if Googlebot is allowed or blocked from crawling your pages
Detects Sitemap directives so you can verify sitemap discovery
Warns about rules that may accidentally block important pages from being crawled
See the full robots.txt file with syntax highlighting in a code block
Identifies common mistakes like blocking everything or missing sitemap references
Your robots.txt file is the first thing search engines read when they visit your site. A misconfigured robots.txt can prevent Google from crawling and indexing your most important pages.
Common problems we see: developers leave a staging-environment "Disallow: /" in place, blocking crawlers from the entire site. Or they block CSS/JS files, preventing Google from rendering pages properly. These issues can go unnoticed for months while rankings drop.
Your robots.txt should also include a Sitemap directive pointing to your sitemap.xml. This helps search engines discover all your pages faster — especially new content and deep pages.
Once your robots.txt is clean, make sure your pages are actually getting indexed. Use IndexFlow to check indexing status and submit pages that Google hasn't crawled yet.
Robots.txt is a text file placed at the root of your website (e.g. example.com/robots.txt) that tells search engine crawlers which pages they can and cannot access. It uses directives like User-agent, Disallow, Allow, and Sitemap to control crawl behavior. Every major search engine respects robots.txt rules.
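A minimal robots.txt using those directives might look like this (the domain and paths are placeholders):

```
User-agent: *
Disallow: /admin/
Allow: /

Sitemap: https://example.com/sitemap.xml
```

Here every crawler (`User-agent: *`) may access the whole site except `/admin/`, and the Sitemap line tells crawlers where to find the sitemap.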
Your robots.txt file must be at the root of your domain: https://yourdomain.com/robots.txt. It cannot be in a subdirectory. Each subdomain needs its own robots.txt file. The file must be accessible via HTTP/HTTPS and return a 200 status code.
If no robots.txt file exists, search engines will crawl all accessible pages on your site by default. While this is fine for most small websites, having a robots.txt is recommended to: point crawlers to your sitemap, block admin/login pages, prevent crawling of duplicate content, and manage crawl budget efficiently.
Robots.txt blocks crawling, not indexing. If you use Disallow: / for Googlebot, Google cannot crawl your pages but may still index URLs it discovers through links. To prevent indexing, use the noindex meta tag or X-Robots-Tag header instead. A common mistake is using robots.txt to try to deindex pages — this actually prevents Google from seeing the noindex tag.
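You can see the crawl-blocking behavior with Python's standard-library robots.txt parser. This is a sketch using a hypothetical robots.txt that disallows everything for Googlebot; the example.com URL is a placeholder:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: blocks Googlebot from the whole site
rules = [
    "User-agent: Googlebot",
    "Disallow: /",
]

parser = RobotFileParser()
parser.parse(rules)

# Googlebot may not crawl this page -- but the URL could still be
# indexed if Google discovers it via links, since Google never gets
# to see any noindex tag on a page it cannot fetch.
print(parser.can_fetch("Googlebot", "https://example.com/page"))  # False
```

To actually keep a page out of the index, serve it with a `noindex` meta tag or an `X-Robots-Tag: noindex` response header, and make sure robots.txt still allows it to be crawled.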
The most common mistakes are: using Disallow: / which blocks the entire site, blocking CSS/JS files (prevents Google from rendering pages), forgetting the Sitemap directive, blocking important content directories like /blog, using incorrect syntax (spaces, capitalization), and blocking Googlebot but not other bots. Always test changes before deploying.
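One way to "test changes before deploying" is a small pre-deploy check that parses the candidate robots.txt and asserts that key pages remain crawlable. A sketch using the standard library (the file content, domain, and path list are hypothetical):

```python
from urllib.robotparser import RobotFileParser

# Candidate robots.txt to be deployed (hypothetical content)
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Allow: /
Sitemap: https://example.com/sitemap.xml
"""

# Pages that must never be blocked (hypothetical list)
IMPORTANT_PATHS = ["/", "/blog/post-1", "/products"]

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

for path in IMPORTANT_PATHS:
    url = "https://example.com" + path
    # Fail loudly if a rule accidentally blocks an important page
    assert parser.can_fetch("Googlebot", url), f"{url} is blocked!"

print("All important pages remain crawlable.")
```

Running a check like this in CI catches a stray "Disallow: /" before it reaches production.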