Since starting dylx-infotech, I have been keeping an eye on my latest visitors and on the web server's HTTP logs. Googlebot surprised me: the blog had been up for barely two days or so, and Googlebot had already knocked on its door.
I still have not been able to find out how Googlebot picked up traces of my blog and reached my site. The same goes for other crawling bots. Every day I see some new robot hitting my blog. My favorite, though, is Googlebot. A day after crawling my blog, it had already started showing it in search results.
I assume Googlebot also keeps a lookout for new entries in DNS servers. That seems to be the only way it could have reached my blog, as I don't have any backlinks from other sites. Here is a screenshot of the spiders and robots visiting my blog.
WordPress has very good plug-ins for stats and also for tracking Googlebot visits.
I really want to understand how all these robots and other crawling bots get site links. As I understand it, Googlebot and other bots follow links, so until you get a link to your site from another site (a backlink), no bot should be able to reach it. Let's see when I will be able to get an answer to this mystery.
Google Webmaster has a really good post, All about Googlebot. Also check out Googleblog for all the latest news about Googlebot.
Webmasters need to keep a close eye on HTTP server logs, because apart from search crawlers you will see other kinds of bots: spam bots, junk bots, research bots, search engine bots, news bots and so on. Every day new bots are added to the internet world. Webmasters will be happy with search engine bots but perhaps not with the others.
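To make the log watching concrete, here is a minimal Python sketch that tallies visitors by user agent. It assumes the server writes the common "combined" log format, where the user agent is the last quoted field. The sample log lines, the `KNOWN_BOTS` table and the function names are made up for illustration, so adjust them to your own logs. Keep in mind that a user agent string can be faked, so treat the counts as a rough picture only.

```python
import re
from collections import Counter

# Assumption: combined log format, where the user agent is the
# last double-quoted field on each line.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')

# Hypothetical lookup table: substring to look for -> label to report.
KNOWN_BOTS = {"googlebot": "Googlebot", "bingbot": "Bingbot"}


def classify(user_agent):
    """Return a rough label for a user agent string."""
    ua = user_agent.lower()
    for needle, label in KNOWN_BOTS.items():
        if needle in ua:
            return label
    # Generic hints that many crawlers put in their names.
    if "bot" in ua or "spider" in ua or "crawl" in ua:
        return "other bot"
    return "browser/unknown"


def count_visitors(lines):
    """Count log lines per visitor label."""
    counts = Counter()
    for line in lines:
        match = UA_PATTERN.search(line)
        if match:
            counts[classify(match.group(1))] += 1
    return counts


# Made-up sample lines, only to show the shape of the input.
SAMPLE_LOG = [
    '66.249.66.1 - - [10/Oct/2010:13:55:36 +0000] "GET / HTTP/1.1" 200 2326 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [10/Oct/2010:13:56:01 +0000] "GET /feed HTTP/1.1" 200 512 "-" '
    '"ExampleSpider/0.1"',
    '198.51.100.4 - - [10/Oct/2010:13:57:12 +0000] "GET /about HTTP/1.1" 200 1024 '
    '"http://example.com/" "Mozilla/5.0 (Windows NT 6.1) Firefox/3.6"',
]

if __name__ == "__main__":
    for label, hits in count_visitors(SAMPLE_LOG).most_common():
        print(label, hits)
```

To run it against a real log, pass an open file instead of `SAMPLE_LOG`, for example `count_visitors(open("access.log"))`, where the file name is whatever your host uses.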
Bots eat up a lot of hosting bandwidth. That is bad news for webmasters who have limited bandwidth or whose sites already get heavy traffic: bots put extra load on the network, and the whole site can end up serving error pages instead of content. If we try to estimate the quality traffic on the internet, it comes to only about 40-60%; the remaining percentage is accounted for by all these bots. Our main concern is spam bots and all the other useless bots.
For webmasters, the ultimate goal is to protect their sites from spam bots. The following are ways to block a bot from accessing or crawling your site:
1) Robots.txt
2) .htaccess
3) IP deny through the control panel
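As a quick taste of the first option, here is a small Python sketch that uses the standard library's `urllib.robotparser` to check what a robots.txt file would allow. The rules and the bot name `BadBot` are hypothetical, and `http://example.com/` is only a placeholder URL. Remember that robots.txt is just a polite request: well-behaved crawlers follow it, while spam bots often ignore it, which is why the other two options exist.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one bot by name, allow everyone else.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow:
"""

# Parse the rules from the text above instead of fetching them over HTTP.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, url="http://example.com/"):
    """Return True if these rules let `agent` fetch `url`."""
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "BadBot"):
        print(agent, "allowed" if allowed(agent) else "blocked")
```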
To be continued in the next post…