
Blocking Specific IPs and Filtering Junk SEO Spiders with PHP

Editor: 国内TP粉 Revised by: 听风就是我 Source: ThinkPHP 2023-09-11

This article uses PHP code as an example to explain how to block junk spiders.

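Some SEO spiders are a real nuisance: they bring no traffic at all and still waste the site's resources. The functions below look up the visitor's region by IP address, check the User-Agent against a list of known spiders, and end the request for visitors from the chosen region who are not recognized as crawlers.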

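function get_ip_data() {
    // Look up the visitor's region through the Taobao IP lookup service.
    $response = @file_get_contents("http://ip.taobao.com/service/getIpInfo.php?ip=" . urlencode(get_client_ip()));
    if ($response === false) {
        return false; // the lookup request failed
    }
    $ip = json_decode($response);
    // A non-zero code means the service reported an error.
    if (!is_object($ip) || !empty($ip->code) || !isset($ip->data)) {
        return false;
    }
    $data = (array) $ip->data;
    // End the request for visitors from Hubei province who are not crawlers.
    if (isset($data['region']) && $data['region'] == '湖北省' && !isCrawler()) {
        exit('http://www.wkwkk.com');
    }
}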
         
function isCrawler() {
    $spiderSite = array(
        "TencentTraveler",
        "Baiduspider+",
        "BaiduGame",
        "Googlebot",
        "msnbot",
        "Sosospider+",
        "Sogou web spider",
        "ia_archiver",
        "Yahoo! Slurp",
        "YoudaoBot",
        "Yahoo Slurp",
        "MSNBot",
        "Java (Often spam bot)",
        "BaiDuSpider",
        "Voila",
        "Yandex bot",
        "BSpider",
        "twiceler",
        "Sogou Spider",
        "Speedy Spider",
        "Google AdSense",
        "Heritrix",
        "Python-urllib",
        "Alexa (IA Archiver)",
        "Ask",
        "Exabot",
        "Custo",
        "OutfoxBot/YodaoBot",
        "yacy",
        "SurveyBot",
        "legs",
        "lwp-trivial",
        "Nutch",
        "StackRambler",
        "The web archive (IA Archiver)",
        "Perl tool",
        "MJ12bot",
        "Netcraft",
        "MSIECrawler",
        "WGet tools",
        "larbin",
        "Fish search",
    );
    // The User-Agent header contains a spider name but is never exactly equal
    // to it, so use a case-insensitive substring match instead of in_array().
    $userAgent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
    foreach ($spiderSite as $spider) {
        if (stripos($userAgent, $spider) !== false) {
            return true;
        }
    }
    return false;
}
         
// Get the client IP
function get_client_ip()
{
    // X-Forwarded-For and Client-IP are supplied by the client or a proxy and
    // can be forged; rely on them only behind a proxy you control.
    foreach (array('HTTP_X_FORWARDED_FOR', 'HTTP_CLIENT_IP') as $key) {
        $value = isset($_SERVER[$key]) ? $_SERVER[$key] : getenv($key);
        if ($value) {
            // X-Forwarded-For may hold a comma-separated chain of addresses;
            // the first entry is the original client.
            $parts = explode(',', $value);
            $candidate = trim($parts[0]);
            if (filter_var($candidate, FILTER_VALIDATE_IP)) {
                return $candidate;
            }
        }
    }
    // Fall back to the address of the direct connection.
    return isset($_SERVER['REMOTE_ADDR']) ? $_SERVER['REMOTE_ADDR'] : getenv('REMOTE_ADDR');
}
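
The simpler case named in the title, refusing a fixed set of addresses, does not need a lookup service. The following is a minimal sketch under stated assumptions, not the original author's code: the helper name `is_blocked_ip` and the addresses in `$blocklist` (taken from the IPv4 documentation ranges) are hypothetical, and only IPv4 is handled.

```php
<?php
// Hypothetical helper: returns true when $ip equals a blocklist entry or
// falls inside an IPv4 CIDR range such as "198.51.100.0/24".
function is_blocked_ip($ip, array $blocklist) {
    $ipLong = ip2long($ip);
    if ($ipLong === false) {
        return false; // not a valid IPv4 address
    }
    foreach ($blocklist as $entry) {
        if (strpos($entry, '/') === false) {
            // Plain address: exact match only.
            if ($entry === $ip) {
                return true;
            }
            continue;
        }
        list($subnet, $bits) = explode('/', $entry, 2);
        $subnetLong = ip2long($subnet);
        $bits = (int) $bits;
        if ($subnetLong === false || $bits < 0 || $bits > 32) {
            continue; // skip malformed entries
        }
        // Keep the top $bits bits of both addresses and compare them.
        $mask = -1 << (32 - $bits);
        if (($ipLong & $mask) === ($subnetLong & $mask)) {
            return true;
        }
    }
    return false;
}

// Example addresses only; replace them with the addresses you want to refuse.
$blocklist = array('203.0.113.5', '198.51.100.0/24');
$ip = isset($_SERVER['REMOTE_ADDR']) ? $_SERVER['REMOTE_ADDR'] : '';
if (is_blocked_ip($ip, $blocklist)) {
    http_response_code(403);
    exit('Forbidden');
}
```

The sketch checks `REMOTE_ADDR` rather than forwarded headers, because a blocked client could otherwise dodge the list by sending a forged X-Forwarded-For value.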

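As a bonus, here is a list of keywords for tiresome junk web spiders. Webmasters who know how can add them to their .htaccess or nginx configuration rules: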

MSNbot|Webdup|AcoonBot|SemrushBot|CrawlDaddy|DotBot|Applebot|AhrefsBot|Ezooms|EdisterBot|EC2LinkFinder|jikespider|Purebot|MJ12bot|DingTalkBot|DuckDuckBot|WangIDSpider|WBSearchBot|Wotbox|xbfMozilla|Yottaa|YandexBot|Barkrowler|SeznamBot|Jorgee|CCBot|SWEBot|PetalBot|spbot|TurnitinBot-Agent|mail.RU|curl|perl|Python|Wget|Xenu|ZmEu|EasouSpider|YYSpider|python-requests|oBot|MauiBot
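
Where editing the server configuration is not an option, the same kind of keyword list can be applied in PHP. This is a rough sketch rather than the author's method: the function name `is_junk_spider` is hypothetical and the pattern is a shortened sample of the list above. Broad keywords such as `curl` or `Python` will also match legitimate tools, so trim the list to suit your site.

```php
<?php
// Hypothetical helper: returns true when the User-Agent contains one of the
// junk-spider keywords (case-insensitive). The pattern is a shortened sample.
function is_junk_spider($userAgent) {
    $pattern = '/SemrushBot|DotBot|AhrefsBot|MJ12bot|PetalBot|python-requests|Wget/i';
    return preg_match($pattern, $userAgent) === 1;
}

// Refuse the request early, before any expensive work is done.
$userAgent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
if (is_junk_spider($userAgent)) {
    http_response_code(403);
    exit;
}
```

Because the alternatives are joined with `|`, extending the sample is a matter of appending more keywords from the list, escaping any regex metacharacters such as the dot in `mail.RU`.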
