How To: Use robots.txt File Efficiently for SEO

By Prasad K | 09/02/2012

 

Web robots, also known as WWW robots or Internet bots, are automated programs that search engines use to analyze, crawl, and index websites. Web crawling is the process of fetching copies of web pages so they can be indexed in a search engine. Before crawling a site, a well-behaved robot first checks the site’s robots.txt file; every site can publish its own robots.txt file to tell robots which pages they may visit.
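A robots.txt file always lives at the root of the domain, so a robot visiting the site used in the examples below would request:

    http://www.tobbynews.com/robots.txt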

 


User-agent: the robot to which the following rules apply

Disallow: the URL path you want to block
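Putting the two directives together, a minimal robots.txt looks like this (the path here is only an illustration; real rules from the examples below can be combined the same way):

    User-agent: *
    Disallow: /junk-directory/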

 

Examples

Note: / (slash) refers to the top-level directory that contains all the files of the website; it is also called the root directory.

  • To block the entire site
    Disallow: /

 

  • To block the entire site for all search engines
    User-agent: *
    Disallow: /

 

  • To block the entire site for one particular search engine only
    User-agent: Googlebot
    Disallow: /

 

  • To block a directory and its contents
    Disallow: /junk-directory/

 

  • To block a page
    Disallow: /private_file.html

 

  • To remove a specific image from Google Image Search
    User-agent: Googlebot-Image
    Disallow: /images/tobby.jpg

 

  • To remove all images from Google Image Search
    User-agent: Googlebot-Image
    Disallow: /

 

  • To block all files of a specific type (for example, .gif) so they will not be indexed in Google search
    User-agent: Googlebot-Image
    Disallow: /*.gif$

 

  • Google AdSense works by crawling the website (ads are displayed based on the site’s content). The rules below hide the pages from search results while still letting Mediapartners-Google, the AdSense crawler, crawl the site so that ads can be displayed.
    User-agent: *
    Disallow: /

    User-agent: Mediapartners-Google
    Allow: /

 

Sitemap

You can also tell crawlers where to find your XML sitemaps from within robots.txt:

Sitemap: http://www.tobbynews.com/sitemap1.xml

Sitemap: http://www.tobbynews.com/sitemap2.xml
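To verify that a robots.txt file behaves the way you expect, you can test it with Python’s built-in robots.txt parser; a minimal sketch, reusing the example domain and paths from above:

    from urllib.robotparser import RobotFileParser

    # Fetch and parse the site's robots.txt file.
    rp = RobotFileParser()
    rp.set_url("http://www.tobbynews.com/robots.txt")
    rp.read()

    # Ask whether a given user-agent may fetch a given URL.
    print(rp.can_fetch("Googlebot", "http://www.tobbynews.com/private_file.html"))
    print(rp.can_fetch("Mediapartners-Google", "http://www.tobbynews.com/"))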

