Robots.txt is a text file that contains special instructions for web crawlers. This file tells crawlers which areas of a website they are allowed to visit and which are disallowed.
Robots.txt helps search engines crawl a website's pages and spares the server from unwanted crawler requests.
Example of a robots.txt file (it must be served from the root of the site, e.g. https://www.example.com/robots.txt):
User-agent: *
Disallow: /
Allow: /register
Sitemap: https://www.example.com/sitemap.xml
In this example, the rules apply to every crawler (*): the entire site is disallowed except the /register page, and the sitemap location is advertised. Not all crawlers read or obey robots.txt instructions, but most major crawlers do.
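To see how a well-behaved crawler checks these rules programmatically, here is a minimal sketch using Python's standard urllib.robotparser module, with the example file embedded as a string so it runs without network access. One caveat: Python's parser applies the first rule that matches, so the Allow line is placed before the broad Disallow below; Google instead honors the most specific match, so the order shown in the example above also works for Googlebot.

from urllib.robotparser import RobotFileParser

# The example rules, embedded as a string (Allow listed first because
# Python's parser uses first-match semantics).
ROBOTS_TXT = """\
User-agent: *
Allow: /register
Disallow: /
Sitemap: https://www.example.com/sitemap.xml
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

print(rp.can_fetch("*", "https://www.example.com/register"))  # True
print(rp.can_fetch("*", "https://www.example.com/login"))     # False
print(rp.site_maps())  # Python 3.8+: ['https://www.example.com/sitemap.xml']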
User-agent: This names the crawler the following rules apply to (each crawler publishes its user-agent name in its documentation), or you can use * instead of a crawler's name to match all crawlers.
Disallow: This blocks crawling of the specified web pages, directories, or files.
Allow: This permits crawling of the specified web pages, directories, or files, even when a broader Disallow rule matches; the more specific Allow overrides it.
*: This wildcard matches any sequence of characters.
$: This matches the end of the URL (see the example after this list).
Sitemap: This holds the location (URL) of the sitemap. This directive is optional.
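As an illustration of the two wildcards, the following rules (with hypothetical paths) block every URL ending in .pdf and everything under /private/, while still allowing the single page /private/help:

User-agent: *
# Block any URL ending in .pdf
Disallow: /*.pdf$
# Block everything under /private/ ...
Disallow: /private/
# ... except this one page
Allow: /private/help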
For a better understanding, read Google's own robots.txt file: https://www.google.com/robots.txt
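The same Python module can also fetch a live file over HTTP. A small sketch against Google's file (the output depends on whatever the file contains when you run it):

from urllib.robotparser import RobotFileParser

# Fetch and parse the live file (requires network access).
rp = RobotFileParser("https://www.google.com/robots.txt")
rp.read()

# At the time of writing, Google disallows /search for all crawlers.
print(rp.can_fetch("*", "https://www.google.com/search"))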