robots.txt generator — User-agent rules, Allow, Disallow, and Sitemap for SEO

This free online robots.txt generator helps you compose a standards-aligned robots.txt file for crawler control and technical SEO: multiple User-agent groups, Allow and Disallow path prefixes, optional Crawl-delay (where honored), and one or more Sitemap URLs. Everything runs in your browser: copy the output, download robots.txt, or upload an existing file to refine it. After deployment, validate the live file with the robots.txt checker and align discovery with the XML sitemap generator.


    Default: single * group with empty Disallow (allow all). Add Sitemap lines when your XML sitemap is live.
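That default can be sketched as a minimal allow-all file (the domain and sitemap URL below are placeholders):

```text
# Apply to all crawlers; an empty Disallow blocks nothing
User-agent: *
Disallow:

Sitemap: https://example.com/sitemap.xml
```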

    Why robots.txt matters for SEO and crawling

    Search engines and other bots read /robots.txt before fetching large portions of your site. A clear file reduces accidental crawl-budget waste on thin or duplicate paths, points crawlers at your XML sitemaps, and documents intent for teams. It complements, rather than replaces, meta robots, canonical tags, and redirects. For on-page copy limits, pair this workflow with the meta title and description checker and the readability score checker.

    Keywords teams search for (robots.txt disallow all, allow googlebot disallow others, sitemap in robots.txt, and block staging site crawlers) map to explicit User-agent sections and path rules. Use Disallow: / only when you truly want to block compliant bots from fetching paths; sensitive data still requires authentication.
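As an illustration, those intents map to groups like these (host and paths are placeholders; robots.txt is advisory, not access control):

```text
# "robots.txt disallow all": block every compliant crawler
User-agent: *
Disallow: /

# "allow googlebot disallow others": give Googlebot its own group;
# a crawler obeys only the group that best matches its name
User-agent: Googlebot
Disallow:
```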

    How to use this robots.txt generator (step by step)

    1. Start with Allow all or a preset (WordPress starter, Staging: Google only). Each group begins with a User-agent name (use * for all bots).
    2. Add Allow and Disallow rows as path prefixes (for example /admin/). When an Allow exception and a broader Disallow both apply, modern crawlers pick the most specific (longest) match regardless of order; list the exception first for older first-match parsers.
    3. Optional: set Crawl-delay per group if a bot you care about reads it—note that Googlebot ignores Crawl-delay in robots.txt.
    4. Add Sitemap URLs (absolute HTTPS). Use Append /sitemap.xml after entering your site origin, or paste multiple sitemap URLs for large indexes and news/video sitemaps.
    5. Copy or Download the preview and place the file at your domain root. Use Upload to load an existing robots.txt, edit in the preview, then re-export.
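Following those steps with the Staging: Google only preset might yield a file like this (the host is a placeholder, and the preset's exact rules may differ):

```text
# Staging: Google only
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /

Sitemap: https://staging.example.com/sitemap.xml
```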

    Guides: Allow vs Disallow, specificity, and sitemaps

    Modern crawlers follow longest match / most specific rule semantics for URL path prefixes. An empty Disallow value means no path is disallowed for that group—often used as the minimal “allow everything” pattern. Listing Sitemap directives helps discovery even when URLs are also submitted in Search Console; keep sitemap XML valid and under size limits. For campaign and migration QA, combine robots rules with the redirect type checker and the canonical tag checker.
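A small sketch of longest-match semantics (paths are placeholders):

```text
User-agent: *
Disallow: /admin/
Allow: /admin/help/
```

For /admin/help/faq both rules match, but Allow: /admin/help/ is longer (more specific) and wins, so the URL may be crawled; /admin/settings matches only the Disallow and stays blocked.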

    Internal linking: SEO tools that pair with robots.txt

    Use the live robots.txt checker after deploy, build structured data with the schema markup generator, tune social previews via the Open Graph tag generator, and track campaigns with the UTM link builder. Multilingual sites often coordinate hreflang with crawl policy—see the hreflang tag generator.

    Related SEO tools in this catalog

    Browse the full SEO tools section.

    Frequently asked questions

    What is robots.txt and where does it live?
    robots.txt is a plain-text file at the root of your site (for example https://example.com/robots.txt) that tells compliant crawlers which URL paths they may fetch. It is part of the Robots Exclusion Protocol (REP). It does not enforce security—private URLs must be protected with authentication or noindex signals.
    Does this robots.txt generator upload anything to your servers?
    No. Rules are assembled and previewed entirely in your browser. Copy, download, and upload use local APIs only; uploading reads the file in your tab so you can edit or merge it with the preview.
    What is the difference between Allow and Disallow?
    Both use URL path prefix matching. Disallow blocks matching paths; Allow can create exceptions when a broader Disallow exists. The most specific rule wins when multiple rules match. An empty Disallow value means “no paths disallowed” for that user-agent group, which effectively allows crawling unless another line blocks it.
    How many User-agent groups should I use?
    Use one group per bot or family you want to treat differently—for example Googlebot versus all other crawlers (*). Keep the file readable: duplicate rules across groups only when policies truly differ. A crawler obeys only the single group that best matches its user-agent name, and within that group the most specific matching rule wins rather than top-to-bottom order.
    Should I add Crawl-delay?
    Google ignores Crawl-delay in robots.txt. Bing may honor crawl-delay directives in some contexts. Use it only if you have verified a bot reads it and you need to reduce load; otherwise prefer server-side rate limiting or Search Console crawl settings for Google.
    Where should Sitemap lines go?
    You can list one or more Sitemap directives pointing to XML sitemap URLs. They are often placed at the end of the file but are valid anywhere. Ensure URLs are absolute (https://…) and return 200 with valid XML.
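For example, several sitemap indexes can sit alongside any group (URLs are placeholders):

```text
User-agent: *
Disallow:

# Sitemap lines are group-independent; list one per index
Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/news-sitemap.xml
```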
    Will robots.txt remove pages from Google’s index?
    Blocking a URL in robots.txt prevents crawling, not necessarily indexing—Google may still index and show the URL, typically without a description, if other pages link to it. To remove or prevent indexing, use noindex, authentication, or URL removal tools alongside robots rules.
    How do I test my file before launch?
    Use Google Search Console’s robots.txt report (the standalone Tester tool has been retired) or fetch the live URL after deploy. You can also paste rules into a checker workflow and compare with your XML sitemap and canonical strategy.
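For a quick local sanity check, Python's standard-library parser can evaluate rules against sample URLs (a sketch; the rules and URLs are placeholders). Note that urllib.robotparser applies the first matching rule rather than the longest match, so the Allow exception is listed before the broader Disallow here:

```python
from urllib.robotparser import RobotFileParser

# Rules to test, as they would appear in robots.txt
rules = [
    "User-agent: *",
    "Allow: /admin/help/",   # exception first: this parser is first-match
    "Disallow: /admin/",
]

rp = RobotFileParser()
rp.parse(rules)

# can_fetch(user_agent, url) -> bool
print(rp.can_fetch("*", "https://example.com/admin/help/faq"))  # True: Allow wins
print(rp.can_fetch("*", "https://example.com/admin/settings"))  # False: disallowed
print(rp.can_fetch("*", "https://example.com/blog/post"))       # True: no rule matches
```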
    Can I block AI crawlers?
    Some crawlers expose dedicated user-agents (for example GPTBot). Add a User-agent group for that name with Disallow rules if the bot honors robots.txt. Policies change—confirm the vendor’s current user-agent string and documentation.
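For instance, an opt-out group for OpenAI's GPTBot (a commonly documented AI crawler name; confirm the current string in the vendor's documentation):

```text
User-agent: GPTBot
Disallow: /
```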