Hi
Looking at my logs, I see that MSNBOT always does a GET on robots.txt and then goes after the cgi-bin folder, specifically /cgi-bin/os000001.pl. I have not heard a really compelling reason why cgi-bin should not be disallowed for robots. Someone mentioned in a thread that a lot of the links pass through the cgi-bin folder, but so what?
As far as robots.txt goes, my understanding is that it is not required. That is what has been indicated in previous threads on this forum. But as far as MSN is concerned, it definitely seems to think that you should have a robots.txt. If so, I was wondering what to put into it:
User-agent: *
Disallow:
see http://community.actinic.com/showthr...ght=robots.txt
or
User-agent: *
Disallow: /cgi-bin/
Disallow: /acatalog/*.cat
Disallow: /acatalog/*.fil
Disallow: /*.gif$
Disallow: /*.jpg$
see
http://community.actinic.com/showthr...ght=robots.txt
My inclination just tells me to put in
User-agent: *
Disallow:
as having no robots.txt file at all does not seem to be a good idea for MSNBOT.
Also, there seems to be some disagreement on this forum about whether the cgi-bin folder should be disallowed for robots. I have seen someone say that it should be allowed, since a lot of links go through the cgi-bin folder and disallowing it may keep your pages from being indexed.
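To see what the two candidate files would actually mean for a request like the one in my logs, here is a rough sketch using Python's standard urllib.robotparser. It only shows how a standards-following parser reads the rules, not how MSNBOT itself behaves. The path /acatalog/index.html and the helper names are my own assumptions for illustration. I left the wildcard lines (*.cat, *.fil, $) out on purpose, because this parser only does plain prefix matching and does not interpret them.

```python
from urllib.robotparser import RobotFileParser

# Option 1: an empty Disallow, which permits everything.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""

# Option 2, reduced to the cgi-bin rule only (wildcard lines omitted).
BLOCK_CGI = """\
User-agent: *
Disallow: /cgi-bin/
"""


def load_rules(text):
    """Parse robots.txt text directly, without fetching anything."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(text, path, agent="msnbot"):
    """Return True if `agent` may fetch `path` under the given rules."""
    return load_rules(text).can_fetch(agent, path)


if __name__ == "__main__":
    # /acatalog/index.html is a hypothetical page used only for comparison.
    for name, rules in (("allow all", ALLOW_ALL), ("block cgi-bin", BLOCK_CGI)):
        for path in ("/cgi-bin/os000001.pl", "/acatalog/index.html"):
            print(f"{name:14} {path:24} allowed={is_allowed(rules, path)}")
```

If links to the catalogue pages really do pass through /cgi-bin/ scripts, the second file would stop a compliant robot from following them, which is the concern raised in the other threads.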
Here is what one SEO analyst says about MSNBOT and the robots.txt file:
"MSN’s search engine robot is called MSNbot. The MSNbot has quite a voracious appetite for spidering websites. Some webmasters love it and try to feed it as much as possible. Other webmasters don't see any reason to use up bandwidth for a search engine that doesn't bring them traffic. Either way, MSNbot will not spider your website unless you have the robots.txt. Once it finds your robots.txt, it will wander the site, almost timidly at first. Then MSNbot builds up courage and indexes files rapidly. So much so, that use of the crawl-delay directive is recommended with this robot."
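Since the quote recommends the crawl-delay directive, here is a small sketch of what such a file might look like and how a parser reads it back. The delay of 10 seconds is an arbitrary value I picked for illustration, not a recommendation from the quote, and crawl_delay_for is a hypothetical helper name. RobotFileParser.crawl_delay needs Python 3.6 or later.

```python
from urllib.robotparser import RobotFileParser

# Assumed example file: slow down msnbot only, allow everything for everyone.
# The value 10 is an illustrative assumption.
ROBOTS_WITH_DELAY = """\
User-agent: msnbot
Crawl-delay: 10
Disallow:

User-agent: *
Disallow:
"""


def crawl_delay_for(text, agent):
    """Return the Crawl-delay that applies to `agent`, or None if unset."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser.crawl_delay(agent)


if __name__ == "__main__":
    for agent in ("msnbot", "googlebot"):
        print(agent, crawl_delay_for(ROBOTS_WITH_DELAY, agent))
```

Crawl-delay is an extension rather than part of the original robots.txt convention, so whether a given robot honours it is up to that robot.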