robots.txt Template
A robots.txt file is a plain text file that tells web crawlers (such as Googlebot) which pages or sections of your website they may access. It is advisory: well-behaved crawlers respect it, but it is not an access-control mechanism.
The robots.txt file must be placed in the root directory of your website (e.g., www.example.com/robots.txt).
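The file is made up of groups, each starting with a User-agent line naming a bot (or * for all bots) followed by one or more rules. A minimal sketch, with a purely illustrative path:
User-agent: Googlebot
Disallow: /drafts/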
Allow All Crawlers to Access Everything
User-agent: *
Disallow:
Specify Sitemap (Optional)
Listing a sitemap URL helps crawlers discover all of the site's pages.
Sitemap: https://www.example.com/sitemap.xml
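A robots.txt file may also list more than one Sitemap line if the site has several sitemaps. The URLs below are hypothetical:
Sitemap: https://www.example.com/sitemap-posts.xml
Sitemap: https://www.example.com/sitemap-pages.xml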
Blocking a Specific File or Directory
Specific files and directories can be excluded from crawling. Note that robots.txt controls crawling, not indexing: a blocked URL can still appear in search results if other pages link to it.
Disallow: /admin/
Disallow: /private-page.html
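To make an exception inside a blocked directory, major crawlers such as Googlebot and bingbot also honor an Allow directive, though it was not part of the original standard. The paths here are illustrative:
User-agent: *
Disallow: /admin/
Allow: /admin/help/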
Blocking Only a Specific Bot
A single bot can be blocked from the entire site while all other bots remain allowed.
User-agent: YandexBot
Disallow: /
User-agent: *
Disallow:
Crawl Delay
Instruct bots to wait 10 seconds between requests to reduce server load. Crawl-delay is non-standard: some crawlers (such as bingbot) honor it, but Googlebot ignores it. The directive must appear within a User-agent group.
User-agent: *
Crawl-delay: 10
Example robots.txt
An example of a more complete robots.txt file combining the rules above. All rules for * are kept in a single group, since some parsers apply only one matching group per bot.
# Block Twitterbot from the entire site
User-agent: Twitterbot
Disallow: /

# Rules for all other bots, kept in one group
User-agent: *
# Block the /admin/ directory
Disallow: /admin/
# Block specific file types
Disallow: /*.pdf$
Disallow: /*.zip$
Disallow: /*.mp4$
# Block specific files
Disallow: /config.php
Disallow: /documents/company_profile.jpg
# Wait 10 seconds between requests (reduces server load; ignored by Googlebot)
Crawl-delay: 10

# Location of the sitemap for SEO
Sitemap: https://www.example.com/sitemap.xml
Common Bots
Common bots and their user-agent tokens, which can be used in User-agent lines as shown in the example after the lists.
Search Engine Bots
Google: Googlebot
Bing: bingbot
Yahoo: Slurp
Baidu: Baiduspider
Yandex: YandexBot
DuckDuckGo: DuckDuckBot
Sogou: Sogou web spider
Social Media Bots
Facebook: facebookexternalhit
Twitter: Twitterbot
LinkedIn: LinkedInBot
Monitoring and SEO Bots
Ahrefs: AhrefsBot
SEMrush: SEMrushBot
Moz: rogerbot
Archive and Data Collection Bots
Wayback Machine: ia_archiver
Common Crawl: CCBot
Other Bots
Apple: Applebot
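For example, to block the SEO crawlers listed above while leaving search engine bots untouched:
User-agent: AhrefsBot
Disallow: /

User-agent: SEMrushBot
Disallow: /

User-agent: rogerbot
Disallow: /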