#robots.txt file 10/07/03

#Bot Trap, Disallow All Bots
User-agent: *
Disallow: /spyder-pit/

#Disallow asterias 03/06/04. Streaming media bot, hit me 736 times in one shot
User-agent: asterias
Disallow: /

#Disallow ia_archiver completely, bad bot
User-agent: ia_archiver
Disallow: /

#Disallow scooter, TEMP disallow to conserve bandwidth, 10/07/03
User-agent: Scooter
Disallow: /

#Disallow Inktomi_Slurp, uses too much bandwidth (allow 9/21, 0845) disallow, 10/07/03
User-agent: Slurp
Disallow: /

#Disallow turnitinbot
User-agent: turnitinbot
Disallow: /

#Disallow WISENutbot (Looksmart), does not respect robots file
User-agent: WISENutbot
Disallow: /

#Disallow grub-client, does not respect robots file
User-agent: grub-client
Disallow: /

#Disallow NPBot
User-agent: NPBot
Disallow: /

#Yahoo! Mobile Web Crawler will not index anything from the site
User-agent: YahooSeeker/M1A1-R2D2
Disallow: /

#Disallow almaden, well behaved, respects robots file/Violated txt
User-agent: http://www.almaden.ibm.com/cs/crawler
Disallow: /

#Disallow turnitin
User-agent: TurnitinBot
Disallow: /

#Disallow Jeeves, uses too much bandwidth, allow 10/07/03
#User-agent: Jeeves
#Disallow: /

#Disallow AskJeeves
User-agent: AskJeeves
Disallow: /

#Disallow Zyborg, does not respect robots file (IP ban)
User-agent: ZyBorg/1.0
Disallow: /

#Disallow Zyborg, does not respect robots file (IP ban)
User-agent: ZyBorg
Disallow: /

#Disallow gigabot
User-agent: gigabot
Disallow: /

#szukacz robot, well behaved
#User-agent: szukacz
#Disallow: /

#Road runner, seems to be well mannered
#User-agent: Road Runner: ImageScape Robot (lim@cs.leidenuniv.nl)
#Disallow: /

#Disallow Fast Web Crawler, follows robots.txt, disallow to conserve bandwidth, 10/07/03
User-agent: fast
Disallow: /

#Photo/Image index bot, seems to respect txt
#User-agent: psbot
#Disallow: /

#Disallow Bai Du Spider.
#China
User-agent: baiduspider
Disallow: /

#Disallow gazz (overactive)
User-agent: gazz
Disallow: /

#Disallow become bot (shopping bot)
User-agent: BecomeBot
Disallow: /

#Disallow ALL access to Files
User-agent: *
Disallow: /cgi-bin/
#Disallow: /Phyl&Larry'sPlace
Disallow: /VetsDay

#Disallow IRL Bot
User-agent: IRLbot
Disallow: /

#Disallow ConveraCrawler/0.7
User-agent: ConveraCrawler/0.7
Disallow: /

#Disallow ParaSite bot, web page not active
User-agent: ParaSite
Disallow: /

#Disallow pompos bot
User-agent: pompos
Disallow: /

#Disallow e-SocietyRobot
User-agent: e-SocietyRobot
Disallow: /

#Disallow UW bot
User-agent: Nutch
Disallow: /

User-agent: Shim-Crawler
Disallow: /

User-agent: lwp-trivial/1.41
Disallow: /

#Disallow psBot
User-agent: psBot
Disallow: /

User-agent: SBIder
Disallow: /

#russian bot
User-agent: yandex bot
Disallow: /

User-agent: Gigabot
Disallow: /

User-agent: Voyager
Disallow: /

User-agent: BaiDuSpider
Disallow: /

User-agent: BackRub
Disallow: /

User-agent: Grub.org
Disallow: /

User-agent: botRightHere
Disallow: /

User-agent: larbin
Disallow: /

User-agent: psbot
Disallow: /

User-agent: Walhello appie
Disallow: /

User-agent: Python-urllib
Disallow: /

User-agent: CherryPicker
Disallow: /

User-agent: EmailCollector
Disallow: /

User-agent: WebBandit
Disallow: /

User-agent: EmailWolf
Disallow: /

User-agent: CopyRightCheck
Disallow: /

User-agent: Crescent
Disallow: /

#Disallow Voyager, eats up too much bandwidth
User-agent: Voyager/1.0
Disallow: /

#Disallow No reason for bot. rss (news) feeds search
User-agent: HiddenMarket
Disallow: /

#Does not obey robots text, banned at firewall 12-8-06
User-agent: Sproose
Disallow: /

#nutch, ucla
User-agent: complex_network_group
Disallow: /

User-agent: Exabot
Disallow: /

User-agent: disco/Nutch-0.9
Disallow: /

User-agent: Twiceler-0.9
Disallow: /

User-agent: GurujiBot
Disallow: /

User-agent: envolk
Disallow: /