Wednesday, September 28, 2011

Googlebot, Search Engine Bot, Crawling Bots and Robots

Since starting dylx-infotech I have been keeping an eye on my latest visitors and on the web server HTTP logs. Googlebot surprised me: the blog was barely two days old when Googlebot came knocking.
I still have not been able to find out how Googlebot picked up traces of my blog and reached my site. The same goes for other crawling bots; every day I see some new robot hitting my blog. My favorite, though, is Googlebot. A day after crawling my blog, it had already started showing it in search results.
I assume Googlebot also keeps a lookout for new entries in DNS servers. That seems to me the only way it could have reached my blog, as I don't have any backlinks from other sites. Here is a screenshot of the spider and robot visitors to my blog.
[Screenshot: Googlebot, robot, and spider visitors]
WordPress has very good plug-ins for stats and also for tracking Googlebot visits.
I really want to understand how all these robots and other crawling bots get site links. As per my understanding, Googlebot and other bots follow links, so unless you get links to your site from other sites (called backlinks), no bot should be able to reach your site. Let's see when I will be able to solve this mystery.
Google Webmaster has a really good post, All about Googlebot; also check out Googleblog for the latest news about Googlebot.
Webmasters need to keep a close eye on their HTTP server logs, since apart from crawling bots you will see other kinds of bots: spam bots, junk bots, research bots, search engine bots, news bots, and so on. Every day new bots are added to the internet world. Webmasters will be happy with search engine bots, but perhaps not with the others.
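One way to keep an eye on the logs is to tally hits per crawler from the User-Agent field. Here is a minimal sketch of that idea, assuming the web server writes the common "combined" log format. The sample lines, the `KNOWN_BOTS` list, and the `count_bot_hits` helper are made up for illustration and are not taken from my real logs.

```python
import re
from collections import Counter

# Hypothetical markers to look for in the User-Agent string (lowercase).
KNOWN_BOTS = ["googlebot", "bingbot", "baiduspider", "yandex"]

# In the combined log format the User-Agent is the last quoted field.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')

def count_bot_hits(lines):
    """Return a Counter mapping a bot marker (or 'other') to its number of hits."""
    hits = Counter()
    for line in lines:
        match = UA_PATTERN.search(line)
        if not match:
            continue  # skip lines that do not end with a quoted field
        agent = match.group(1).lower()
        name = next((bot for bot in KNOWN_BOTS if bot in agent), "other")
        hits[name] += 1
    return hits

# Made-up sample log lines, for illustration only.
SAMPLE_LOG = [
    '192.0.2.10 - - [28/Sep/2011:10:00:00 +0000] "GET / HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [28/Sep/2011:10:01:00 +0000] "GET /about HTTP/1.1" 200 2048 "-" '
    '"Mozilla/5.0 (Windows NT 6.1) Firefox/7.0"',
    '192.0.2.10 - - [28/Sep/2011:10:02:00 +0000] "GET /robots.txt HTTP/1.1" 200 64 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

for name, count in count_bot_hits(SAMPLE_LOG).most_common():
    print(name, count)
```

Keep in mind that a User-Agent string can be faked, so a tally like this is only a first look at who is visiting.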
Bots eat up a lot of hosting bandwidth, which is bad news for webmasters who have a small bandwidth allowance or whose site traffic is already huge. The bots add extra load on the network, and once the limit is hit the whole site can end up showing an error page (typically something like 503 Service Unavailable or 509 Bandwidth Limit Exceeded rather than 404 Page Not Found). If we try to estimate the quality traffic of the internet, my rough guess is that it is only 40-60%; the remaining percentage is accounted for by all these bots. Our main concern is spam bots and all the other useless bots.
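To get a rough number for your own site, you can add up the response sizes in the access log and see what share went to bots. This is only a sketch under assumptions: it expects the "combined" log format, and the `BOT_MARKERS` substrings, the `bot_bandwidth_share` helper, and the sample lines are hypothetical.

```python
import re

# Combined log format ends with: "request" status bytes "referer" "user-agent"
LINE_PATTERN = re.compile(r'" (\d{3}) (\d+|-) "[^"]*" "([^"]*)"\s*$')

# Hypothetical substrings that mark a User-Agent as a bot.
BOT_MARKERS = ("bot", "spider", "crawl")

def bot_bandwidth_share(lines):
    """Return the fraction of response bytes served to bots (0.0 to 1.0)."""
    bot_bytes = 0
    total_bytes = 0
    for line in lines:
        match = LINE_PATTERN.search(line)
        if not match:
            continue  # ignore lines that are not in the expected format
        # The size field is "-" when no body was sent.
        size = 0 if match.group(2) == "-" else int(match.group(2))
        agent = match.group(3).lower()
        total_bytes += size
        if any(marker in agent for marker in BOT_MARKERS):
            bot_bytes += size
    return bot_bytes / total_bytes if total_bytes else 0.0

# Made-up sample log lines, for illustration only.
SAMPLE_LOG = [
    '192.0.2.10 - - [28/Sep/2011:10:00:00 +0000] "GET / HTTP/1.1" 200 3000 "-" '
    '"ExampleSpider/1.0"',
    '203.0.113.7 - - [28/Sep/2011:10:01:00 +0000] "GET / HTTP/1.1" 200 1000 "-" '
    '"Mozilla/5.0 (Windows NT 6.1) Firefox/7.0"',
]

print(f"Bot share of bandwidth: {bot_bandwidth_share(SAMPLE_LOG):.0%}")
```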
For webmasters, the ultimate goal is to protect their site from spam bots. The following are ways to block a bot from accessing or crawling your site:
1) Robots.txt
2) .htaccess
3) IP deny through the hosting control panel
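As a small taste of the first option, here is a sketch that checks what a robots.txt file would allow, using Python's standard `urllib.robotparser` module. The rules, the bot name `BadSpamBot`, and the `example.com` URLs are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one made-up bot entirely,
# and keep every other bot out of /private/ only.
ROBOTS_TXT = """\
User-agent: BadSpamBot
Disallow: /

User-agent: *
Disallow: /private/
"""

def build_parser(robots_txt):
    """Parse robots.txt text and return a ready-to-query RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser

def is_allowed(parser, user_agent, url):
    """Return True if the rules let this user agent fetch the URL."""
    return parser.can_fetch(user_agent, url)

parser = build_parser(ROBOTS_TXT)
for agent, url in [
    ("BadSpamBot", "http://example.com/"),
    ("Googlebot", "http://example.com/"),
    ("Googlebot", "http://example.com/private/page.html"),
]:
    print(agent, url, is_allowed(parser, agent, url))
```

Note that robots.txt is only a request: well-behaved crawlers honor it, but a spam bot can simply ignore it, which is why the other two options in the list exist.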
Will continue in next post…
