The robots.txt file is used to tell search engine crawlers (Google, Yahoo, Bing, etc.), BEFORE they crawl a web page, “Hey, look over here, don’t crawl me!” This may be done for a number of reasons, one of which is to avoid duplicate content or pages that really don’t benefit the end users.
Pages may be disallowed at any directory level. If you don’t want your website crawled at all, “Disallow: /” may be used.
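As a sketch of how these rules look in practice, here is a minimal robots.txt checked with Python’s standard `urllib.robotparser` module, which models how a compliant crawler interprets the file (the `/private/` path and example.com URLs are just placeholders):

```python
from urllib.robotparser import RobotFileParser

# Parse a small robots.txt directly from a list of lines.
rp = RobotFileParser()
rp.parse([
    "User-agent: *",        # rules apply to all crawlers
    "Disallow: /private/",  # block one directory
])

# A compliant crawler checks each URL before fetching it.
print(rp.can_fetch("*", "http://example.com/private/page.html"))  # blocked
print(rp.can_fetch("*", "http://example.com/index.html"))         # allowed
```

Swapping the `Disallow` line for “Disallow: /” would make `can_fetch` return False for every URL on the site.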
There are a few important factors to know:
1) The robots don’t have to follow these rules, as this is a non-standardized protocol.
2) If you are playing a trick on the engines, guess what? The robots.txt file is public, so anyone can see exactly what you are blocking. Not a good idea.
3) If someone links to a page, it may be too late for robots.txt to block the engines from discovering it. In this case you can use the noindex/nofollow meta tag.
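For point 3, the meta tag goes in the page’s own HTML head rather than in robots.txt. A minimal example (note the page must be crawlable for the engine to see this tag):

```html
<head>
  <!-- Ask engines not to index this page or follow its links -->
  <meta name="robots" content="noindex, nofollow">
</head>
```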
For more information regarding the robots.txt file, visit robotstxt.org.