Yan Pritzker photographer, entrepreneur, software engineer, musician, skier




webcrawler bot detection

Posted 13 March 2008 @ 7pm | Tagged rails, ruby


  # User-Agent fragments that identify known crawlers, feed fetchers,
  # and command-line HTTP clients.
  def self.bot_agent_list
    [ "panscient", "larbin", "dummy", "Teoma", "alexa",
      "froogle", "inktomi", "looksmart", "URL_Spider_SQL",
      "Firefly", "NationalDirectory", "Ask Jeeves", "TECNOSEEK",
      "InfoSeek", "WebFindBot", "crawler", "girafobot", "Scooter",
      "Baidu", "bot", "Google", "SiteUptime", "Slurp",
      "WordPress", "ZIBB", "ZyBorg", "msnbot", "check_http",
      "libwww-perl", "lwp-trivial", "wget", "curl", "SimplePie",
      "Python", "Feed", "HTTPClient", "Tumblr", "Spider", "sanszbot"]
  end

Full source at http://pastie.org/191922
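
The linked source isn't reproduced here, so here is a rough sketch of how a list like this might be put to use. It assumes the check is a case-insensitive substring match against the request's User-Agent header. The module name BotDetector, the constants, and the bot? method are made up for illustration and are not from the pastie; the list is a shortened subset of the one above.

```ruby
# Hypothetical helper; BotDetector and bot? are illustrative names,
# not taken from the original source.
module BotDetector
  # Shortened subset of the list from the post, to keep the sketch small.
  BOT_AGENTS = ["Google", "msnbot", "Slurp", "Baidu", "crawler",
                "Spider", "wget", "curl", "libwww-perl", "bot"].freeze

  # Assumption: one case-insensitive regex built from the escaped
  # fragments, so each check is a single match.
  BOT_PATTERN = Regexp.new(
    BOT_AGENTS.map { |agent| Regexp.escape(agent) }.join("|"),
    Regexp::IGNORECASE
  )

  # Returns true when the User-Agent contains any known fragment.
  # A missing or blank User-Agent is treated as a bot here, which is
  # a policy choice of this sketch rather than anything from the post.
  def self.bot?(user_agent)
    return true if user_agent.nil? || user_agent.strip.empty?

    BOT_PATTERN.match?(user_agent)
  end
end

googlebot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
firefox   = "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"

puts BotDetector.bot?(googlebot)
puts BotDetector.bot?(firefox)
```

In a Rails controller the string would typically come from request.user_agent. Short fragments such as "bot", "Feed", or "Python" will match any agent string that happens to contain them, so a list like this trades some false positives for simplicity.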


2 Comments

Posted by
igor
16 March 2008 @ 5am

robots don’t smell.


Posted by
yan
17 March 2008 @ 3pm

but they can sure cause a stink

(cymbal crash)

