Blocking Bots by User Agent
To keep search and spam bots off the site, we match the User-Agent request header with the SetEnvIfNoCase directive and deny any request it flags. The list below contains every bot User-Agent string we have managed to track down so far.
To block the annoying bots, place an .htaccess configuration file in the site root containing the following directives:
<IfModule mod_setenvif.c>
# SetEnvIfNoCase User-Agent "^ia_archiver" ban
# SetEnvIfNoCase User-Agent "^WebAlta" ban
# SetEnvIfNoCase User-Agent "^.*grub-client" ban
# SetEnvIfNoCase User-Agent "^.*inktomi\.com" ban
SetEnvIfNoCase User-Agent "^8484 Boston Project" ban
SetEnvIfNoCase User-Agent "^Accelerator" ban
SetEnvIfNoCase User-Agent "^Ants" ban
SetEnvIfNoCase User-Agent "^Ask Jeeves" ban
SetEnvIfNoCase User-Agent "^Atomic_Email_Hunter" ban
SetEnvIfNoCase User-Agent "^atSpider" ban
SetEnvIfNoCase User-Agent "^attach" ban
SetEnvIfNoCase User-Agent "^autoemailspider" ban
SetEnvIfNoCase User-Agent "^BackWeb" ban
SetEnvIfNoCase User-Agent "^Baiduspider" ban
SetEnvIfNoCase User-Agent "^Bandit" ban
SetEnvIfNoCase User-Agent "^BlackWidow" ban
SetEnvIfNoCase User-Agent "^Bot\ mailto:craftbot@yahoo.com" ban
SetEnvIfNoCase User-Agent "^Buddy" ban
SetEnvIfNoCase User-Agent "^bwh3_user_agent" ban
SetEnvIfNoCase User-Agent "^China Local Browse" ban
SetEnvIfNoCase User-Agent "^ChinaClaw" ban
SetEnvIfNoCase User-Agent "^Collector" ban
SetEnvIfNoCase User-Agent "^ContactBot" ban
SetEnvIfNoCase User-Agent "^ContentSmartz" ban
SetEnvIfNoCase User-Agent "^Copier" ban
SetEnvIfNoCase User-Agent "^Custo" ban
SetEnvIfNoCase User-Agent "^DataCha0s" ban
SetEnvIfNoCase User-Agent "^DBrowse" ban
SetEnvIfNoCase User-Agent "^Demo Bot" ban
SetEnvIfNoCase User-Agent "^DISCo" ban
SetEnvIfNoCase User-Agent "^Download Master" ban
SetEnvIfNoCase User-Agent "^Download\ Demon" ban
SetEnvIfNoCase User-Agent "^Downloader" ban
SetEnvIfNoCase User-Agent "^Drip" ban
SetEnvIfNoCase User-Agent "^DSurf15" ban
SetEnvIfNoCase User-Agent "^EBrowse" ban
SetEnvIfNoCase User-Agent "^eCatch" ban
SetEnvIfNoCase User-Agent "^Educate Search VxB" ban
SetEnvIfNoCase User-Agent "^EirGrabber" ban
SetEnvIfNoCase User-Agent "^EmailSiphon" ban
SetEnvIfNoCase User-Agent "^EmailSpider" ban
SetEnvIfNoCase User-Agent "^EmailWolf" ban
SetEnvIfNoCase User-Agent "^ESurf15" ban
SetEnvIfNoCase User-Agent "^Express\ WebPictures" ban
SetEnvIfNoCase User-Agent "^ExtractorPro" ban
SetEnvIfNoCase User-Agent "^EyeNetIE" ban
SetEnvIfNoCase User-Agent "^FileHound" ban
SetEnvIfNoCase User-Agent "^FlashGet" ban
SetEnvIfNoCase User-Agent "^Flexum" ban
SetEnvIfNoCase User-Agent "^Franklin Locator" ban
SetEnvIfNoCase User-Agent "^FSurf15" ban
SetEnvIfNoCase User-Agent "^Full Web Bot" ban
SetEnvIfNoCase User-Agent "^GetRight" ban
SetEnvIfNoCase User-Agent "^Gets" ban
SetEnvIfNoCase User-Agent "^GetWeb!" ban
SetEnvIfNoCase User-Agent "^Gigabot" ban
SetEnvIfNoCase User-Agent "^Go!Zilla" ban
SetEnvIfNoCase User-Agent "^Go-Ahead-Got-It" ban
SetEnvIfNoCase User-Agent "^gotit" ban
SetEnvIfNoCase User-Agent "^GoZilla" ban
SetEnvIfNoCase User-Agent "^Grab.*Site" ban
SetEnvIfNoCase User-Agent "^Grabber" ban
SetEnvIfNoCase User-Agent "^GrabNet" ban
SetEnvIfNoCase User-Agent "^Grafula" ban
SetEnvIfNoCase User-Agent "^gsa-crawler" ban
SetEnvIfNoCase User-Agent "^Guestbook Auto Submitter" ban
SetEnvIfNoCase User-Agent "^Gulliver" ban
SetEnvIfNoCase User-Agent "^HMView" ban
SetEnvIfNoCase User-Agent "^HTTrack" ban
SetEnvIfNoCase User-Agent "^IBrowse" ban
SetEnvIfNoCase User-Agent "^Image\ Stripper" ban
SetEnvIfNoCase User-Agent "^Image\ Sucker" ban
SetEnvIfNoCase User-Agent "^Industry Program" ban
SetEnvIfNoCase User-Agent "^Indy\ Library" ban
SetEnvIfNoCase User-Agent "^InterGET" ban
SetEnvIfNoCase User-Agent "^Internet.*Ninja" ban
SetEnvIfNoCase User-Agent "^Internet\ Ninja" ban
SetEnvIfNoCase User-Agent "^Iria" ban
SetEnvIfNoCase User-Agent "^ISC Systems iRc Search" ban
SetEnvIfNoCase User-Agent "^IUPUI Research Bot" ban
SetEnvIfNoCase User-Agent "^JetCar" ban
SetEnvIfNoCase User-Agent "^JOC" ban
SetEnvIfNoCase User-Agent "^JOC\ Web\ Spider" ban
SetEnvIfNoCase User-Agent "^JustView" ban
SetEnvIfNoCase User-Agent "^larbin" ban
SetEnvIfNoCase User-Agent "^LARBIN-EXPERIMENTAL" ban
SetEnvIfNoCase User-Agent "^leech" ban
SetEnvIfNoCase User-Agent "^LeechFTP" ban
SetEnvIfNoCase User-Agent "^LetsCrawl.com" ban
SetEnvIfNoCase User-Agent "^lftp" ban
SetEnvIfNoCase User-Agent "^libwww-perl" ban
SetEnvIfNoCase User-Agent "^likse" ban
SetEnvIfNoCase User-Agent "^Lincoln State Web Browser" ban
SetEnvIfNoCase User-Agent "^liveinternet" ban
SetEnvIfNoCase User-Agent "^LMQueueBot" ban
SetEnvIfNoCase User-Agent "^LWP::Simple" ban
SetEnvIfNoCase User-Agent "^Mac Finder" ban
SetEnvIfNoCase User-Agent "^Magnet" ban
SetEnvIfNoCase User-Agent "^Mag-Net" ban
SetEnvIfNoCase User-Agent "^Mass\ Downloader" ban
SetEnvIfNoCase User-Agent "^Memo" ban
SetEnvIfNoCase User-Agent "^MFC Foundation Class Library" ban
SetEnvIfNoCase User-Agent "^Microsoft URL Control" ban
SetEnvIfNoCase User-Agent "^MIDown.*tool" ban
SetEnvIfNoCase User-Agent "^MIDown\ tool" ban
SetEnvIfNoCase User-Agent "^Mirror" ban
SetEnvIfNoCase User-Agent "^Missauga Locate" ban
SetEnvIfNoCase User-Agent "^Missigua Locator" ban
SetEnvIfNoCase User-Agent "^Missouri College Browse" ban
SetEnvIfNoCase User-Agent "^Mister.*PiX" ban
SetEnvIfNoCase User-Agent "^Mister\ PiX" ban
SetEnvIfNoCase User-Agent "^Mizzu Labs" ban
SetEnvIfNoCase User-Agent "^MJ12bot" ban
SetEnvIfNoCase User-Agent "^Mo College" ban
SetEnvIfNoCase User-Agent "^Mozilla/2\.0 \(compatible; Ask Jeeves/Teoma\)" ban
SetEnvIfNoCase User-Agent "^MVAClient" ban
SetEnvIfNoCase User-Agent "^NameOfAgent \(CMS Spider\)" ban
SetEnvIfNoCase User-Agent "^NASA Search" ban
SetEnvIfNoCase User-Agent "^Navroad" ban
SetEnvIfNoCase User-Agent "^NearSite" ban
SetEnvIfNoCase User-Agent "^Net.*Reaper" ban
SetEnvIfNoCase User-Agent "^Net.*Vampire" ban
SetEnvIfNoCase User-Agent "^Net\ Vampire" ban
SetEnvIfNoCase User-Agent "^NetAnts" ban
SetEnvIfNoCase User-Agent "^NetSpider" ban
SetEnvIfNoCase User-Agent "^NetZIP" ban
SetEnvIfNoCase User-Agent "^Ninja" ban
SetEnvIfNoCase User-Agent "^Nsauditor" ban
SetEnvIfNoCase User-Agent "^Octopus" ban
SetEnvIfNoCase User-Agent "^Offline" ban
SetEnvIfNoCase User-Agent "^Offline.*Explorer" ban
SetEnvIfNoCase User-Agent "^Offline\ Explorer" ban
SetEnvIfNoCase User-Agent "^Offline\ Navigator" ban
SetEnvIfNoCase User-Agent "^Page.*Saver" ban
SetEnvIfNoCase User-Agent "^PageGrabber" ban
SetEnvIfNoCase User-Agent "^Papa.*Foto" ban
SetEnvIfNoCase User-Agent "^Papa\ Foto" ban
SetEnvIfNoCase User-Agent "^pavuk" ban
SetEnvIfNoCase User-Agent "^PBrowse" ban
SetEnvIfNoCase User-Agent "^pcBrowser" ban
SetEnvIfNoCase User-Agent "^PEval" ban
SetEnvIfNoCase User-Agent "^Pita" ban
SetEnvIfNoCase User-Agent "^Pockey" ban
SetEnvIfNoCase User-Agent "^Poirot" ban
SetEnvIfNoCase User-Agent "^Port Huron Labs" ban
SetEnvIfNoCase User-Agent "^Production Bot" ban
SetEnvIfNoCase User-Agent "^Program Shareware" ban
SetEnvIfNoCase User-Agent "^psbot" ban
SetEnvIfNoCase User-Agent "^PSurf15" ban
SetEnvIfNoCase User-Agent "^psycheclone" ban
SetEnvIfNoCase User-Agent "^Pump" ban
SetEnvIfNoCase User-Agent "^RealDownload" ban
SetEnvIfNoCase User-Agent "^Reaper" ban
SetEnvIfNoCase User-Agent "^Recorder" ban
SetEnvIfNoCase User-Agent "^ReGet" ban
SetEnvIfNoCase User-Agent "^RSurf15" ban
SetEnvIfNoCase User-Agent "^Scooter" ban
SetEnvIfNoCase User-Agent "^searchbot admin@google.com" ban
SetEnvIfNoCase User-Agent "^SEO search Crawler" ban
SetEnvIfNoCase User-Agent "^SEOsearch" ban
SetEnvIfNoCase User-Agent "^ShablastBot" ban
SetEnvIfNoCase User-Agent "^Siphon" ban
SetEnvIfNoCase User-Agent "^SiteSnagger" ban
SetEnvIfNoCase User-Agent "^SmartDownload" ban
SetEnvIfNoCase User-Agent "^Snagger" ban
SetEnvIfNoCase User-Agent "^Snake" ban
SetEnvIfNoCase User-Agent "^snap.com beta crawler" ban
SetEnvIfNoCase User-Agent "^Snapbot" ban
SetEnvIfNoCase User-Agent "^sogou develop spider" ban
SetEnvIfNoCase User-Agent "^Sogou Orion spider" ban
SetEnvIfNoCase User-Agent "^sogou spider" ban
SetEnvIfNoCase User-Agent "^Sogou web spider" ban
SetEnvIfNoCase User-Agent "^sohu agent" ban
SetEnvIfNoCase User-Agent "^SpaceBison" ban
SetEnvIfNoCase User-Agent "^SSurf15" ban
SetEnvIfNoCase User-Agent "^Stripper" ban
SetEnvIfNoCase User-Agent "^Sucker" ban
SetEnvIfNoCase User-Agent "^SuperBot" ban
SetEnvIfNoCase User-Agent "^SuperHTTP" ban
SetEnvIfNoCase User-Agent "^Surfbot" ban
SetEnvIfNoCase User-Agent "^tAkeOut" ban
SetEnvIfNoCase User-Agent "^Teleport.*Pro" ban
SetEnvIfNoCase User-Agent "^Teleport\ Pro" ban
SetEnvIfNoCase User-Agent "^Triton" ban
SetEnvIfNoCase User-Agent "^TSurf15" ban
SetEnvIfNoCase User-Agent "^Twiceler" ban
SetEnvIfNoCase User-Agent "^Under the Rainbow" ban
SetEnvIfNoCase User-Agent "^Vacuum" ban
SetEnvIfNoCase User-Agent "^VadixBot" ban
SetEnvIfNoCase User-Agent "^VoidEYE" ban
SetEnvIfNoCase User-Agent "^voyager" ban
SetEnvIfNoCase User-Agent "^W3 SiteSearch Crawler" ban
SetEnvIfNoCase User-Agent "^W3C_*Validator" ban
SetEnvIfNoCase User-Agent "^W3C-checklink" ban
SetEnvIfNoCase User-Agent "^Weazel" ban
SetEnvIfNoCase User-Agent "^Web.*Image.*Collector" ban
SetEnvIfNoCase User-Agent "^Web.*Spy" ban
SetEnvIfNoCase User-Agent "^Web.*Sucker" ban
SetEnvIfNoCase User-Agent "^Web\ Image\ Collector" ban
SetEnvIfNoCase User-Agent "^Web\ Sucker" ban
SetEnvIfNoCase User-Agent "^WebAuto" ban
SetEnvIfNoCase User-Agent "^WebCapture" ban
SetEnvIfNoCase User-Agent "^WebCopier" ban
SetEnvIfNoCase User-Agent "^WebFetch" ban
SetEnvIfNoCase User-Agent "^WebGo\ IS" ban
SetEnvIfNoCase User-Agent "^WebLeacher" ban
SetEnvIfNoCase User-Agent "^WebMirror" ban
SetEnvIfNoCase User-Agent "^WebReaper" ban
SetEnvIfNoCase User-Agent "^WebRecorder" ban
SetEnvIfNoCase User-Agent "^WebSauger" ban
SetEnvIfNoCase User-Agent "^WebSite.*Extractor" ban
SetEnvIfNoCase User-Agent "^Website.*Quester" ban
SetEnvIfNoCase User-Agent "^Website\ eXtractor" ban
SetEnvIfNoCase User-Agent "^Website\ Quester" ban
SetEnvIfNoCase User-Agent "^WebSpy" ban
SetEnvIfNoCase User-Agent "^Webster" ban
SetEnvIfNoCase User-Agent "^WebStripper" ban
SetEnvIfNoCase User-Agent "^WebVulnCrawl.unknown" ban
SetEnvIfNoCase User-Agent "^WebWhacker" ban
SetEnvIfNoCase User-Agent "^WebZIP" ban
SetEnvIfNoCase User-Agent "^Wells Search" ban
SetEnvIfNoCase User-Agent "^WEP Search" ban
SetEnvIfNoCase User-Agent "^Wget" ban
SetEnvIfNoCase User-Agent "^Whacker" ban
SetEnvIfNoCase User-Agent "^Widow" ban
SetEnvIfNoCase User-Agent "^www\.asona\.org" ban
SetEnvIfNoCase User-Agent "^WWWOFFLE" ban
SetEnvIfNoCase User-Agent "^Xaldon\ WebSpider" ban
SetEnvIfNoCase User-Agent "^Yanga" ban
SetEnvIfNoCase User-Agent "^Zeus" ban
</IfModule>
<Limit GET POST>
# Refuse GET, POST (and HEAD) requests from any client flagged with the ban variable.
Order Allow,Deny
Allow from all
Deny from env=ban
</Limit>
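The Order, Allow and Deny directives are Apache 2.2 access control; on Apache 2.4 they work only while mod_access_compat is loaded. A minimal sketch of the same rule in 2.4 syntax follows. It assumes mod_authz_core is available and that the host lets .htaccess files set authorization directives (AllowOverride AuthConfig or All). It would replace the Limit section rather than sit next to it.

```apache
<IfModule mod_authz_core.c>
    # Allow everyone except clients for which the ban variable was set.
    <RequireAll>
        Require all granted
        Require not env ban
    </RequireAll>
</IfModule>
```

Unlike the Limit GET POST section, this variant is not tied to particular request methods, so a flagged client is refused whichever method it uses.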
The commented-out rule SetEnvIfNoCase User-Agent "^.*inktomi\.com" ban may match Yahoo's search crawler, so decide for yourself whether it should be blocked. To block it, uncomment that line.
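Every pattern in the list starts with ^, so it matches only when the bot name opens the User-Agent string. Some crawlers instead send a browser-style string with their name inside the parentheses, for example one that begins Mozilla/5.0 (compatible; MJ12bot/, and an anchored rule will not catch that. A minimal sketch of an unanchored variant, assuming the chosen names are distinctive enough not to appear in the User-Agent of a real browser:

```apache
<IfModule mod_setenvif.c>
# No leading ^: the name may appear anywhere in the User-Agent string.
SetEnvIfNoCase User-Agent "MJ12bot" ban
SetEnvIfNoCase User-Agent "Baiduspider" ban
</IfModule>
```

Short, generic names such as Memo or Pump are better left anchored, since without the ^ they could match unrelated clients.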