
I have done a bit of scraping with Ruby Mechanize; when we hit limits, we circumvented them with proxies and Tor.

Google, as a search engine, crawls almost all sites, but offers very little that is usable to other bots.

http://www.google.com/robots.txt

Disallow: 247, Allow: 41
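For anyone who wants to check numbers like these, here is a rough Python sketch of how such a tally could be made. It only counts lines whose field name is Allow or Disallow in robots.txt text you have already downloaded. The function name `count_directives` and the `sample` text are made up for illustration and are not taken from Google's actual file.

```python
from collections import Counter


def count_directives(robots_txt: str) -> Counter:
    """Count Allow and Disallow lines in the text of a robots.txt file."""
    counts = Counter()
    for line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        # Field names are matched case-insensitively.
        field = line.split(":", 1)[0].strip().lower()
        if field in ("allow", "disallow"):
            counts[field] += 1
    return counts


# Hypothetical sample, not the real contents of any site's robots.txt.
sample = """\
User-agent: *
Disallow: /search
Allow: /search/about
Disallow: /sdch
# Disallow: /commented-out
"""

print(count_directives(sample))
```

Feeding it the text of a real robots.txt should give a comparable tally, though the numbers will change whenever the file is updated.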


