How to Block Specific Bots from Crawling Your Site Using Robots.txt
The robots.txt file is a text file placed in your website’s root directory. It instructs web crawlers which pages or directories they can or cannot access.
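For a site at https://example.com, crawlers look for the file at https://example.com/robots.txt. As a minimal illustration, the following file lets every crawler access everything, because an empty Disallow value blocks nothing:
User-agent: *
Disallow: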
Why Block Specific Bots?
Not all web crawlers are beneficial. Some bots consume server resources, scrape content, or skew analytics. Blocking unwanted bots improves site performance, security, and SEO accuracy.
Identifying Bots to Block
Check your server logs or analytics tools to identify bots. Common unwanted bots include:
- AhrefsBot
- SemrushBot
- MJ12bot
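Crawlers identify themselves in the User-Agent header of each request, so they appear in your access logs. The entry below is purely illustrative (Apache/Nginx combined log format; the IP address, timestamp, and version are made up), but it shows the token to look for:
203.0.113.7 - - [12/Mar/2024:10:15:32 +0000] "GET /blog/post HTTP/1.1" 200 5123 "-" "Mozilla/5.0 (compatible; AhrefsBot/7.0; +http://ahrefs.com/robot/)"
The bot name inside the User-Agent string (here AhrefsBot) is the name you target in robots.txt.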
Syntax for Blocking Bots in robots.txt
Use the User-agent directive to target specific bots and Disallow to restrict access.
Block a Single Bot
User-agent: BadBot
Disallow: /
Block Multiple Bots
User-agent: Bot1
Disallow: /

User-agent: Bot2
Disallow: /
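Applying this pattern to the crawlers listed earlier gives one group per bot:
User-agent: AhrefsBot
Disallow: /

User-agent: SemrushBot
Disallow: /

User-agent: MJ12bot
Disallow: /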
Wildcards
User-agent: *
Disallow: /private/
The * wildcard applies the rules to all bots that are not matched by a more specific User-agent group.
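Wildcard and bot-specific groups can live in the same file. Under the robots.txt standard, a crawler follows only the most specific User-agent group that matches it, so in the sketch below BadBot is blocked from the entire site while every other bot is only kept out of /private/:
User-agent: *
Disallow: /private/

User-agent: BadBot
Disallow: /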
Common Mistakes to Avoid
- Typos: Ensure correct spelling of User-agent and bot names.
- Incorrect Paths: Double-check directories in Disallow.
- Conflicting Directives: Avoid mixing Allow and Disallow haphazardly.
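On the last point, Allow and Disallow can be combined deliberately. Crawlers that implement the current robots.txt standard resolve overlapping rules by the longest matching path, so the group below (with illustrative paths) blocks /downloads/ while still permitting /downloads/free/:
User-agent: *
Disallow: /downloads/
Allow: /downloads/free/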
Testing Your robots.txt
Use Google Search Console’s Robots Testing Tool to validate your rules. Check server logs to confirm bots are respecting the file.
Advanced Tips
- Crawl-Delay: Add Crawl-delay: 10 to slow down frequent crawlers (not all bots support this); a sketch follows this list.
- IP Blocking: Use .htaccess or firewall rules for aggressive bots; a sketch follows this list.
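A Crawl-delay group uses the same structure as the earlier examples. The sketch below asks SemrushBot to wait ten seconds between requests; support varies by crawler, and Googlebot in particular ignores the directive while Bing honors it:
User-agent: SemrushBot
Crawl-delay: 10

For bots that ignore robots.txt altogether, blocking at the server level is more reliable. The .htaccess sketch below assumes Apache 2.4 with mod_setenvif and mod_authz_core enabled, and uses MJ12bot purely as an example token to match in the User-Agent header; other servers and firewalls require different syntax:
# Flag any request whose User-Agent contains "MJ12bot" (case-insensitive)
BrowserMatchNoCase "MJ12bot" bad_bot
# Allow everyone except requests carrying the bad_bot flag
<RequireAll>
    Require all granted
    Require not env bad_bot
</RequireAll>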
Conclusion
robots.txt is a simple yet powerful tool for controlling bot access. Regularly audit and test your file to confirm that crawlers are complying with it. Note that malicious bots may ignore these rules, so combine robots.txt with server-level security measures for full protection.