Python IRC Bot Using Twisted (Rollbot)
A simple IRC bot that will show the page titles of links posted
I wrote my first IRC bot recently. It turned out to be a simpler task than I thought it would be.
Here's the back story: At work, our programming team uses an IRC channel for all our internal communication throughout the day. What more would you expect from a bunch of programmers?
Recently I've been taking delight in Rebecca Rolling people (worse than being Rick Roll'd) every Friday. Someone commented that it would be nice to have a bot to detect such things. I had been looking for an excuse to write an IRC bot, so I took the challenge.
Most of the script follows Eric Florenzano's blog post, so I won't cover all that again. Just go check it out yourself.
The only part I modified was the privmsg function:
    def privmsg(self, user, channel, msg):
        # use regex to find posted URLs
        matches = re.findall(r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+', msg)
        if matches:
            for url in matches:
                u = urllib.urlopen(url)
                # get the mime type
                urltype = u.headers.gettype()
                #print urltype
                try:
                    # use BeautifulSoup to get the page title
                    soup = BeautifulSoup.BeautifulSoup(u)
                    title = re.sub(r"\s+", ' ', soup.title.string).strip()
                    self.msg(self.factory.channel, "Title: %s" % str(title))
                except (AttributeError, HTMLParseError):
                    # if we have an error getting the title, show the mime type
                    self.msg(self.factory.channel,
                             "NO TITLE FOUND (%s)" % urltype)
The comments in the code above should explain what is going on. If you want the full source for Rollbot, check it out on GitHub.
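If you want to poke at the two interesting steps (finding URLs in a message and pulling out a cleaned-up page title) without connecting to IRC, here is a rough sketch. It is not the code from Rollbot: it assumes Python 3, swaps the standard library's html.parser in for BeautifulSoup, and the names find_urls, extract_title, and TitleParser are made up for illustration. It reuses the same URL regex as privmsg.

```python
import re
from html.parser import HTMLParser

# Same URL pattern used in privmsg above.
URL_RE = re.compile(
    r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+'
)


class TitleParser(HTMLParser):
    """Hypothetical helper: collects the text of the first <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        if self.in_title:
            self.parts.append(data)


def find_urls(msg):
    """Return every URL-looking substring in a chat message."""
    return URL_RE.findall(msg)


def extract_title(html):
    """Return the page title with whitespace collapsed, or None if absent."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    title = re.sub(r"\s+", " ", "".join(parser.parts)).strip()
    return title or None


if __name__ == "__main__":
    print(find_urls("look at http://example.com/page?id=1 and https://foo.org now"))
    print(extract_title("<html><head><title>\n  Hello   World\n</title></head></html>"))
```

Returning None for a missing title plays the same role as the AttributeError branch in privmsg: the caller can fall back to reporting the mime type instead.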