Do you think this robots.txt is of value?

wmdvanzyl
Forum Members
Posts: 214
Joined: Fri May 06, 2011 12:48 pm

Do you think this robots.txt is of value?

Post by wmdvanzyl »

I found this robots.txt along with a free template and I was wondering whether you think it has value?

Is it important to block these crawlers? What do you have in your robots.txt?

Code:

# ROBOTS.TXT STOPS BAD BOTS FROM CRAWLING YOUR WEB PAGES
# PLEASE UPLOAD THIS FILE TO THE ROOT FOLDER OF YOUR SITE, NEXT TO index.html

User-agent: aipbot
Disallow: /

User-agent: ia_archiver
Disallow: /

User-agent: AhrefsBot
Disallow: /

User-agent: Alexibot 
Disallow: /

User-agent: Aqua_Products 
Disallow: /

User-agent: asterias 
Disallow: /

User-agent: b2w/0.1 
Disallow: /

User-agent: BackDoorBot/1.0 
Disallow: /

User-agent: becomebot
Disallow: /

User-agent: BlowFish/1.0 
Disallow: /

User-agent: Bookmark search tool 
Disallow: /

User-agent: BotALot 
Disallow: /

User-agent: BotRightHere 
Disallow: /

User-agent: BuiltBotTough 
Disallow: /

User-agent: Bullseye/1.0 
Disallow: /

User-agent: BunnySlippers 
Disallow: /

User-agent: CheeseBot 
Disallow: /

User-agent: CherryPicker 
Disallow: /

User-agent: CherryPickerElite/1.0 
Disallow: /

User-agent: CherryPickerSE/1.0 
Disallow: /

User-agent: Copernic 
Disallow: /

User-agent: CopyRightCheck 
Disallow: /

User-agent: cosmos 
Disallow: /

User-agent: Crescent 
Disallow: /

User-agent: Crescent Internet ToolPak HTTP OLE Control v.1.0 
Disallow: /

User-agent: DittoSpyder 
Disallow: /

User-agent: dotbot
Disallow: /

User-agent: EmailCollector 
Disallow: /

User-agent: EmailSiphon 
Disallow: /

User-agent: EmailWolf 
Disallow: /

User-agent: EroCrawler 
Disallow: /

User-agent: exabot
Disallow: /

User-agent: ExtractorPro 
Disallow: /

User-agent: FairAd Client 
Disallow: /

User-agent: Fasterfox
Disallow: /

User-agent: Flaming AttackBot 
Disallow: /

User-agent: Foobot 
Disallow: /

User-agent: gigabot
Disallow: /

User-agent: Gaisbot 
Disallow: /

User-agent: GetRight/4.2 
Disallow: /

User-agent: Harvest/1.5 
Disallow: /

User-agent: hloader 
Disallow: /

User-agent: httplib 
Disallow: /

User-agent: HTTrack 3.0 
Disallow: /

User-agent: humanlinks 
Disallow: /

User-agent: IconSurf
Disallow: /
Disallow: /favicon.ico

User-agent: InfoNaviRobot 
Disallow: /

User-agent: Iron33/1.0.2 
Disallow: /

User-agent: JennyBot 
Disallow: /

User-agent: Kenjin Spider 
Disallow: /

User-agent: Keyword Density/0.9 
Disallow: /

User-agent: larbin 
Disallow: /

User-agent: LexiBot 
Disallow: /

User-agent: libWeb/clsHTTP 
Disallow: /

User-agent: LinkextractorPro 
Disallow: /

User-agent: LinkScan/8.1a Unix 
Disallow: /

User-agent: LinkWalker 
Disallow: /

User-agent: LNSpiderguy 
Disallow: /

User-agent: lwp-trivial 
Disallow: /

User-agent: lwp-trivial/1.34 
Disallow: /

User-agent: Mata Hari 
Disallow: /

User-agent: Microsoft URL Control 
Disallow: /

User-agent: Microsoft URL Control - 5.01.4511 
Disallow: /

User-agent: Microsoft URL Control - 6.00.8169 
Disallow: /

User-agent: MJ12bot
Disallow: /

User-agent: MIIxpc 
Disallow: /

User-agent: MIIxpc/4.2 
Disallow: /

User-agent: Mister PiX 
Disallow: /

User-agent: moget 
Disallow: /

User-agent: moget/2.1 
Disallow: /

User-agent: Mozilla/4.0 (compatible; BullsEye; Windows 95) 
Disallow: /

User-agent: MSIECrawler 
Disallow: /

User-agent: NetAnts 
Disallow: /

User-agent: NICErsPRO 
Disallow: /

User-agent: Offline Explorer 
Disallow: /

User-agent: Openbot 
Disallow: /

User-agent: Openfind 
Disallow: /

User-agent: Openfind data gatherer 
Disallow: /

User-agent: Oracle Ultra Search 
Disallow: /

User-agent: PerMan 
Disallow: /

User-agent: ProPowerBot/2.14 
Disallow: /

User-agent: ProWebWalker 
Disallow: /

User-agent: psbot 
Disallow: /

User-agent: Python-urllib 
Disallow: /

User-agent: QueryN Metasearch 
Disallow: /

User-agent: Radiation Retriever 1.1 
Disallow: /

User-agent: RepoMonkey 
Disallow: /

User-agent: RepoMonkey Bait & Tackle/v1.01 
Disallow: /

User-agent: RMA 
Disallow: /

User-agent: rogerbot
Disallow: /

User-agent: searchpreview 
Disallow: /

User-agent: SiteSnagger 
Disallow: /

User-agent: SpankBot 
Disallow: /

User-agent: spanner 
Disallow: /

User-agent: SurveyBot
Disallow: /

User-agent: suzuran 
Disallow: /

User-agent: Szukacz/1.4 
Disallow: /

User-agent: Teleport 
Disallow: /

User-agent: TeleportPro 
Disallow: /

User-agent: Telesoft 
Disallow: /

User-agent: The Intraformant 
Disallow: /

User-agent: TheNomad 
Disallow: /

User-agent: TightTwatBot 
Disallow: /

User-agent: toCrawl/UrlDispatcher 
Disallow: /

User-agent: True_Robot 
Disallow: /

User-agent: True_Robot/1.0 
Disallow: /

User-agent: turingos 
Disallow: /

User-agent: TurnitinBot 
Disallow: /

User-agent: TurnitinBot/1.5 
Disallow: /

User-agent: URL Control 
Disallow: /

User-agent: URL_Spider_Pro 
Disallow: /

User-agent: URLy Warning 
Disallow: /

User-agent: VCI 
Disallow: /

User-agent: VCI WebViewer VCI WebViewer Win32 
Disallow: /

User-agent: Web Image Collector 
Disallow: /

User-agent: WebAuto 
Disallow: /

User-agent: WebBandit 
Disallow: /

User-agent: WebBandit/3.50 
Disallow: /

User-agent: WebCapture 2.0 
Disallow: /

User-agent: WebCopier 
Disallow: /

User-agent: WebCopier v.2.2 
Disallow: /

User-agent: WebCopier v3.2a 
Disallow: /

User-agent: WebEnhancer 
Disallow: /

User-agent: WebSauger 
Disallow: /

User-agent: Website Quester 
Disallow: /

User-agent: Webster Pro 
Disallow: /

User-agent: WebStripper 
Disallow: /

User-agent: WebZip 
Disallow: /

User-agent: WebZip/4.0 
Disallow: /

User-agent: WebZIP/4.21 
Disallow: /

User-agent: WebZIP/5.0 
Disallow: /

User-agent: Wget 
Disallow: /

User-agent: wget 
Disallow: /

User-agent: Wget/1.5.3 
Disallow: /

User-agent: Wget/1.6 
Disallow: /

User-agent: WWW-Collector-E 
Disallow: /

User-agent: Xenu's 
Disallow: /

User-agent: Xenu's Link Sleuth 1.1c 
Disallow: /

User-agent: Zeus 
Disallow: /

User-agent: Zeus 32297 Webster Pro V2.9 Win32 
Disallow: /

User-agent: Zeus Link Scout 
Disallow: /

User-agent: *
Disallow: /js
Disallow: /*.js
Rolf
Power Poster
Posts: 7825
Joined: Wed Apr 23, 2008 7:53 am

Re: Do you think this robots.txt is of value?

Post by Rolf »

Just wondering, if it is a *bad* bot why would it listen to robots.txt...
Jo Morg
Dev Team Member
Posts: 1974
Joined: Mon Jan 29, 2007 4:47 pm

Re: Do you think this robots.txt is of value?

Post by Jo Morg »

Rolf wrote: Just wondering, if it is a *bad* bot why would it listen to robots.txt...
Indeed.
Unless you are sure that those bots do respect the robots.txt directives, the list may very well be worthless.
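
To illustrate the point, here is a minimal sketch of how a well-behaved crawler might consult robots.txt before fetching a page, using Python's standard `urllib.robotparser`. The `RULES` excerpt is taken from the file quoted above; the `may_fetch` helper, the `SomeOtherBot` name, and the `example.com` URLs are made up for this example. The check runs inside the crawler's own code, so a bot that never makes this call is not stopped by the file.

```python
from urllib.robotparser import RobotFileParser

# A short excerpt of the rules quoted earlier in the thread.
RULES = """\
User-agent: AhrefsBot
Disallow: /

User-agent: *
Disallow: /js
"""


def may_fetch(user_agent: str, url: str) -> bool:
    """Return True if RULES permit user_agent to fetch url.

    Hypothetical helper for illustration; a real crawler would
    download /robots.txt from the target site instead.
    """
    parser = RobotFileParser()
    parser.parse(RULES.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com is a placeholder host, not a site from this thread.
print(may_fetch("AhrefsBot", "https://example.com/index.php"))
print(may_fetch("SomeOtherBot", "https://example.com/index.php"))
print(may_fetch("SomeOtherBot", "https://example.com/js/menu.js"))
```

Nothing on the server side enforces the answer. Actually blocking a client would need something like a web server or firewall rule, which is outside what robots.txt does.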
wmdvanzyl
Forum Members
Posts: 214
Joined: Fri May 06, 2011 12:48 pm

Re: Do you think this robots.txt is of value?

Post by wmdvanzyl »

Thanks for the feedback. I don't attach much value to such a list myself and I am glad to hear that others share my view.