29 Sep 2014

What is a robots.txt file for website crawling

Understand the limits of robots.txt
Before you build your robots.txt file, you should know the risks of relying only on this URL blocking method. At times, you might want to consider other mechanisms to ensure your URLs are not findable on the web.

· Use the right syntax for each crawler
Although respectable web crawlers follow the directives in a robots.txt file, some crawlers might interpret those directives differently. You should know the proper syntax for addressing different web crawlers, as some might not understand certain instructions.
· Ensure private information is safe
The commands in robots.txt files are not rules that any crawler must follow; it is better to think of them as guidelines. Googlebot and other respectable web crawlers obey the instructions in a robots.txt file, but other crawlers might not. Therefore, it is very important to know the consequences of sharing the information that you block in this way. To keep private information secure, we recommend using other blocking methods, such as password-protecting private files on your server.
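
Both points above can be illustrated with a small sketch using Python's standard urllib.robotparser module. The robots.txt content, the bot name "OtherBot", the example.com URLs and the helper can_fetch are all made up for illustration; they do not come from any real site. The sketch shows how a well-behaved crawler would read per-crawler rules, and why the file is only advisory: the check happens inside the crawler, so a crawler that never performs it is not stopped by anything.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /tmp/
"""


def can_fetch(user_agent: str, url: str) -> bool:
    """Return True if the rules in ROBOTS_TXT allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    checks = [
        ("Googlebot", "https://example.com/private/report.html"),
        ("Googlebot", "https://example.com/tmp/cache.html"),
        ("OtherBot", "https://example.com/private/report.html"),
        ("OtherBot", "https://example.com/tmp/cache.html"),
    ]
    for agent, url in checks:
        # A polite crawler asks before fetching; nothing on the server
        # enforces the answer, so private files still need real protection.
        print(agent, url, can_fetch(agent, url))
```

Note that in this sketch the /private/ path is blocked only for Googlebot, so "OtherBot" is still allowed to fetch it. A rule written for one crawler does not protect the URL from others, which is one more reason to password-protect anything that is genuinely private.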
