
GEO Robots.txt Generator

Configure AI crawler access for optimal GEO visibility

📚 Educational Tool
This tool generates only the AI-crawler section of a robots.txt file. Append it to your existing file rather than replacing it, and always test in staging first.
📖 Key Concepts Explained
Allow — Full access. The crawler can visit all pages (except blocked paths) at any speed. Best for maximum visibility.
Rate-Limit — Restricted speed. The crawler may visit pages but must wait a set number of seconds between requests (Crawl-delay). Use for training crawlers, which tend to consume significant server resources.
Block — No access. The crawler cannot visit any pages. You become invisible to that AI system. Use for aggressive or unwanted crawlers.
Crawl-delay — Seconds the crawler must wait between page requests. Higher values = less server load but slower indexing. Typical: 2-10s for search crawlers, 10-30s for training crawlers.
User-agent: * — A wildcard rule that applies to ALL crawlers not specifically listed. Acts as a "catch-all" fallback.
RAG/Search crawlers — Fetch pages in real-time when users ask questions. Direct citation value. Prioritize allowing these.
Training crawlers — Collect data to train future AI models. High server load, indirect long-term value. Consider rate-limiting.
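The concepts above combine in a robots.txt fragment like this minimal sketch. The user-agent names shown (OAI-SearchBot, GPTBot, Bytespider) are illustrative of each category; verify current names before deployment:

```text
# Allow — full access for a search/RAG crawler
User-agent: OAI-SearchBot
Allow: /

# Rate-Limit — a training crawler may crawl, but must pace itself
User-agent: GPTBot
Allow: /
Crawl-delay: 15

# Block — no access for an unwanted crawler
User-agent: Bytespider
Disallow: /

# Catch-all fallback for any crawler not listed above
User-agent: *
Allow: /
```

Note that Crawl-delay is a de facto directive, not part of the formal Robots Exclusion Protocol; some crawlers (including Google's) ignore it entirely.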

1. Select Your Strategy

🚀
Maximize Visibility
Allow all crawlers for maximum AI citations
⚖️
Balanced
Allow search, rate-limit training crawlers
🛡️
Conservative
Search only, block training crawlers
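As a sketch, the Balanced strategy maps to rules like the following, assuming PerplexityBot as a search/RAG crawler and CCBot and ClaudeBot as training crawlers (categorizations and names should be verified before use):

```text
# Balanced strategy: allow search/RAG, rate-limit training

User-agent: PerplexityBot
Allow: /

User-agent: ClaudeBot
Allow: /
Crawl-delay: 20

User-agent: CCBot
Allow: /
Crawl-delay: 20
```

The Conservative strategy would replace the rate-limited groups with `Disallow: /` blocks.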

2. Configure AI Crawlers

RAG/Search — real-time queries; direct citation value
Training — model training data; future authority
Index — search index building
Search+AI — both search and AI features

3. Paths to Block

Pages that should NOT be crawled by AI systems (applied to all non-blocked crawlers)

Enter comma-separated paths. Include a trailing slash to limit each rule to a directory and its contents.
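For example, entering the hypothetical paths /admin/, /cart/ would produce Disallow lines under each non-blocked crawler group:

```text
User-agent: GPTBot
Allow: /
Disallow: /admin/
Disallow: /cart/
```

The trailing slash matters: `Disallow: /admin/` blocks only that directory and everything beneath it, while `Disallow: /admin` is a prefix match that would also block paths such as /admin.html or /administrator/.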

4. Additional Options

Sitemap — helps crawlers find important pages
Default rule — applies to crawlers not specifically listed (User-agent: *)
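These options correspond to a Sitemap directive and a wildcard group at the end of the file; example.com below is a placeholder for your own domain:

```text
# Helps crawlers discover important pages
Sitemap: https://example.com/sitemap.xml

# Default rule for any crawler not listed above
User-agent: *
Allow: /
```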
⚠️ Compliance Note
Some operators (notably Perplexity) have been reported to crawl with undisclosed user-agents that ignore robots.txt. If blocking via robots.txt proves ineffective, enforce it with IP-level firewall rules.

Generated robots.txt


For Educational Purposes Only

Based on the Three Streams GEO Methodology

Crawler data verified November 2025. Verify current user-agents before deployment.