robots dot txt

By bamr87 · December 4, 2023

Learn how to properly configure robots.txt files for Jekyll websites to control search engine crawlers and optimize SEO


A robots.txt file tells web robots (typically search engine crawlers) which pages on your website they may crawl and which they should avoid. Here’s an example of a robots.txt file that you might use for a Jekyll site:

User-agent: *
Disallow: /secret/
Disallow: /private/
Disallow: /tmp/
Sitemap: https://it-journey.dev/sitemap.xml

In this example:

  • User-agent: * means that the instructions apply to all web robots.
  • Disallow: /secret/ tells robots not to crawl pages under the /secret/ directory.
  • Disallow: /private/ and Disallow: /tmp/ do the same for these directories.
  • Sitemap: https://it-journey.dev/sitemap.xml provides the location of your site’s sitemap, which helps search engines find and index your content.
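You can sanity-check rules like these before deploying them using Python’s standard-library robots.txt parser. This is a small sketch that parses the example rules above and asks whether a crawler is allowed to fetch given URLs (the page paths are just illustrative):

```python
from urllib.robotparser import RobotFileParser

# The same rules as the example robots.txt above
rules = """\
User-agent: *
Disallow: /secret/
Disallow: /private/
Disallow: /tmp/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Anything under a disallowed directory is off-limits to all crawlers
print(parser.can_fetch("*", "https://it-journey.dev/secret/page.html"))       # False
# Everything else remains crawlable
print(parser.can_fetch("*", "https://it-journey.dev/posts/robots-txt-jekyll/"))  # True
```

This is handy for catching typos in Disallow paths, since a stray character can silently block (or expose) more than you intended.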

Remember to replace https://it-journey.dev/sitemap.xml with the actual URL of your sitemap, and adjust the Disallow entries to match the specific directories or pages you want to keep out of search results. Note that Disallow only blocks crawling, not indexing: a disallowed page can still appear in search results if other sites link to it, so use a noindex meta tag for pages that must stay out of the index entirely.
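In a Jekyll site, one common approach (a sketch, assuming `url` is set in your `_config.yml`) is to keep robots.txt in the project root with front matter, so Liquid fills in the sitemap URL at build time instead of hard-coding it:

```liquid
---
layout: null
permalink: /robots.txt
---
User-agent: *
Disallow: /secret/
Disallow: /private/
Disallow: /tmp/
Sitemap: {{ site.url }}/sitemap.xml
```

Because the file has front matter, Jekyll runs it through Liquid and outputs it at /robots.txt; if you later move domains, updating `url` in `_config.yml` keeps the sitemap line correct automatically.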