robots.txt problem with msnbot
http://forums.digitalpoint.com/showthread.php?t=279230
Quoting from Digital Point (no one has answered me there yet):
Code:
here's my problem:
http://www.itpromo.net/robots.txt
-----snip------
User-agent: *
[...]
Disallow: /*pdf$
Disallow: /*xls$
Disallow: /*html$
Disallow: /*zip$
Disallow: /*RON
Disallow: /*EUR
Disallow: /*USD
Disallow: /*NONE
Disallow: /*ASC
Disallow: /*DESC
-----snip------
this should block all URLs containing the words after the *, as well as those ending with them (the $ anchor)
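As a sanity check on that reading of the rules, here is a small sketch (not from the original post) that translates the * and $ semantics into regular expressions and tests a few of the Disallow patterns above against sample paths:

```python
import re

def rule_to_regex(disallow):
    # Convert a robots.txt Disallow pattern to a regex:
    # '*' matches any character sequence; a trailing '$' anchors the end.
    anchored = disallow.endswith("$")
    pattern = disallow.rstrip("$")
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

# A subset of the rules from the robots.txt above.
rules = ["/*pdf$", "/*xls$", "/*NONE", "/*DESC"]
compiled = [rule_to_regex(r) for r in rules]

def blocked(path):
    # Disallow patterns match from the start of the URL path.
    return any(r.match(path) for r in compiled)

print(blocked("/memory/a_data/1/xls"))             # True
print(blocked("/memory/a_data/1/NONE/DESC/NONE"))  # True
print(blocked("/memory/a_data/1/index"))           # False
```

Under this interpretation both URLs from the log below should have been disallowed.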
Googlebot and Slurp recognize this, but Teoma and MSNbot don't:
-----log snip-----
"msnbot/1.0 (+http://search.msn.com/msnbot.htm)" www.itpromo.net GET /memory/a_data/1/NONE/DESC/NONE HTTP/1.0 41345 200 0 [26/Mar/2007:14:03:14 +0300]
"msnbot/1.0 (+http://search.msn.com/msnbot.htm)" www.itpromo.net GET /memory/a_data/1/xls HTTP/1.0 13207 200 0 [26/Mar/2007:14:03:35 +0300]
-----log snip-----
what are my options to block all the bots from reaching these pages? they generate a lot of traffic and i want these sections to be ignored; i have also added rel="nofollow" to all the internal links pointing to these kinds of URLs
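Since msnbot appears to ignore the wildcard rules, one fallback (my own suggestion, not something from the post) is to enforce the same URL patterns server-side and refuse crawler requests outright. A minimal sketch, assuming a hypothetical request handler and a hand-picked list of crawler User-Agent substrings:

```python
import re

# The same sections robots.txt tries to exclude, as one regex
# matched from the start of the path.
BLOCKED_PATHS = re.compile(r"/.*(pdf|xls|html|zip)$|/.*(RON|EUR|USD|NONE|ASC|DESC)")

# Illustrative crawler User-Agent substrings (not an exhaustive list).
BOT_AGENTS = ("msnbot", "Teoma", "Googlebot", "Slurp")

def handle_request(user_agent, path):
    # Return an HTTP status code: 403 for crawlers hitting a blocked
    # section, 200 otherwise (a real handler would serve the page).
    is_bot = any(bot in user_agent for bot in BOT_AGENTS)
    if is_bot and BLOCKED_PATHS.match(path):
        return 403
    return 200

print(handle_request("msnbot/1.0 (+http://search.msn.com/msnbot.htm)",
                     "/memory/a_data/1/xls"))      # 403
print(handle_request("Mozilla/5.0", "/memory/a_data/1/xls"))  # 200
```

The same idea can be expressed as mod_rewrite rules in the web server config; the point is that a hard 403 does not depend on the bot honoring robots.txt.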
i've also written up the problem in detail on my blog: http://www.ghita.ro/article/23/web_robots_and_dynamic_content_issues.html (scroll down to Problems).
thanks!
Either I got the robots.txt wrong, or msnbot still doesn't get the standards.