Robots.txt File vs Meta-Robots Tags
A student recently asked me, “Which is better, the robots.txt file or the Meta-Robots tag?”
This depends heavily on what you are trying to accomplish. In general, for the main purpose of telling search engines whether or not to crawl and/or index a certain page, Meta-Robots should be your primary method. Be especially wary of relying on robots.txt when you are trying to keep confidential information out of sight: robots.txt is public, so anyone can open the file and see which directories you are attempting to block. However, the question of “robots.txt or meta robots” is not an either/or. You can, and should, use both methods where applicable.
Both the robots.txt file and the meta-robots tag tell search engines which pages to crawl and index. There are 3 primary ways of relaying this information to search engines:
Robots.txt Disallow command.
This allows you to define specific URLs that you do not wish to be crawled. However, the URL of the page itself may still be indexed in search results. That URL won’t display a content snippet like your normal pages do, because the content of the page has not been crawled.
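To make the crawl-blocking behaviour concrete, here is a minimal sketch that checks URLs against a set of robots.txt rules using Python's standard `urllib.robotparser` module. The rules, the paths, and the `is_crawl_allowed` helper are hypothetical and only for illustration; they are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; the paths are illustrative only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/old-page.html
"""

# Parse the rules from memory instead of fetching them over the network.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawl_allowed(url: str, user_agent: str = "*") -> bool:
    """Return True if the rules above let `user_agent` crawl `url`."""
    return parser.can_fetch(user_agent, url)


print(is_crawl_allowed("https://www.example.com/private/report.html"))  # False
print(is_crawl_allowed("https://www.example.com/blog/post.html"))       # True
```

Keep in mind that this check only answers "may a well-behaved robot crawl this URL?". It says nothing about whether the URL can still appear in search results.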
Meta-Robots NoIndex command. (Recommended)
This tells search engines they are welcome to crawl the page, but may not include the URL or content of the page in search results. This is the recommended approach if you do not want anyone to find the page through search engines.
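As a rough illustration, the sketch below shows what a Meta-Robots NoIndex tag looks like in a page's head and how a crawler might detect it, using Python's standard `html.parser` module. The sample page, the `MetaRobotsParser` class, and the `has_noindex` helper are assumptions made up for this example. Note that a crawler can only read this tag if robots.txt does not block the page from being crawled.

```python
from html.parser import HTMLParser

# Hypothetical page; only the meta robots tag matters here.
PAGE = """<html><head>
<meta name="robots" content="noindex, follow">
<title>Thank you</title>
</head><body>Thanks for signing up.</body></html>"""


class MetaRobotsParser(HTMLParser):
    """Collects the directives listed in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        content = attr.get("content") or ""
        for directive in content.split(","):
            if directive.strip():
                self.directives.add(directive.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the page asks search engines not to index it."""
    parser = MetaRobotsParser()
    parser.feed(html)
    return "noindex" in parser.directives


print(has_noindex(PAGE))  # True
```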
No-Follow on links
This tells search engines not to follow a particular link when they come across it while crawling your site. Because it only applies to that one link, the target page can still be found and accessed in other ways, such as through links from other pages or sites. This is not a recommended method of blocking pages from viewers or robots.
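The per-link nature of No-Follow can be sketched as follows, again with Python's standard `html.parser` module. The HTML snippet, the `LinkRelParser` class, and the `split_links` helper are hypothetical names chosen for illustration. The example separates links a crawler would follow from those marked `rel="nofollow"`.

```python
from html.parser import HTMLParser

# Hypothetical snippet with one normal link and one no-follow link.
SNIPPET = """
<a href="/about.html">About us</a>
<a href="/members.html" rel="nofollow">Members area</a>
"""


class LinkRelParser(HTMLParser):
    """Records each link's href and whether it carries rel="nofollow"."""

    def __init__(self):
        super().__init__()
        self.links = []  # list of (href, is_nofollow) tuples

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if href is None:
            return
        # rel is a space-separated list of tokens, e.g. "nofollow noopener".
        rel_tokens = (attr.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" in rel_tokens))


def split_links(html: str):
    """Return (followed, nofollowed) lists of hrefs found in `html`."""
    parser = LinkRelParser()
    parser.feed(html)
    followed = [href for href, nofollow in parser.links if not nofollow]
    nofollowed = [href for href, nofollow in parser.links if nofollow]
    return followed, nofollowed


followed, nofollowed = split_links(SNIPPET)
print(followed)    # ['/about.html']
print(nofollowed)  # ['/members.html']
```

The members page is skipped only for this one link. Any other link to it without `rel="nofollow"`, on your site or elsewhere, still leads a crawler to the page.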