Twitter, How To Detect Twitter Bots using Itsabot.com

2009 January 6

So yesterday I got a tweet from a developer, @PaulKinlan. At the end of last year Paul released Twollo, a Twitter app that automates finding new friends on Twitter. Now Paul has started the New Year with an offering called itsabot.

itsabot is an exciting development that can find any Twitter bots following you. Now more than ever, tweeps are looking to avoid the dreaded Twitter spam and account hijacking we have recently suffered, and itsabot is a great step towards combating the bot problem on Twitter.

Itsabot can be used to clean up any bots already following you but, most interestingly, it can also be used by any Twitter app to stop bots spreading through Tweeps who use auto-follow functions.
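To make the auto-follow idea concrete, here is a minimal sketch of how an app might screen candidates before following them. It does not call the real itsabot service: `safe_to_follow`, `lookup_is_bot`, `KNOWN_BOTS` and the account names are all made up for illustration, and a real app would replace `lookup_is_bot` with a call to whatever bot-detection service it uses.

```python
# Hypothetical sketch: screen auto-follow candidates through a bot check.
# None of these names come from itsabot; they are placeholders.

KNOWN_BOTS = {"example_spam_bot"}  # stand-in for a real bot-detection lookup


def lookup_is_bot(screen_name):
    """Placeholder check; a real app would query a detection service here."""
    return screen_name in KNOWN_BOTS


def safe_to_follow(candidates, is_bot):
    """Return only the candidates that the supplied check does not flag."""
    return [name for name in candidates if not is_bot(name)]


if __name__ == "__main__":
    print(safe_to_follow(["alice_example", "example_spam_bot"], lookup_is_bot))
```

Passing the check in as a function keeps the follow logic independent of whichever detection service sits behind it.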

Anyway, that's enough of my excited ramblings. Here is the why-and-how story, straight from the developer's mouth.

I started developing www.itsabot.com because I needed some way of detecting whether a given user on www.twollo.com was a bot. Many users had been complaining that it was auto-following bots, which in turn devalued twollo’s service. I couldn’t find a service with a definitive list of bots, so I decided to create one.

There are too many users on Twitter to classify every one by hand. However, a lot of bots, spammers and auto-posters display similar traits that can be used to help automate the process of identifying them. I initially identified known “good” twitterers and known “bots” that auto-post to Twitter, took some parameters out of the data and trained a neural network to determine whether a user is a bot. The first version I created used a simple threshold on a simple ratio, but it made lots of errors, and a neural network produced better results. It must be noted that the neural network gets it wrong sometimes and is unlikely ever to get it right every time, so I have a system whereby, if a mistake is reported to me, I can permanently override the results of the NN.
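The approach described above can be sketched roughly as follows. This is not Paul's code: the post does not say which ratio or which parameters itsabot uses, so the two features here (a following/followers ratio capped at 1.0 and the fraction of tweets containing a link), the training rows and every name are assumptions invented for illustration. A single logistic neuron stands in for the neural network, as the simplest possible substitute.

```python
import math

# Hypothetical hand-labelled rows: (ratio feature, link fraction) -> 1 bot, 0 human.
# The numbers are invented for illustration, not real Twitter data.
TRAINING = [
    ((0.90, 0.95), 1), ((0.80, 0.90), 1), ((1.00, 0.80), 1),
    ((0.10, 0.20), 0), ((0.20, 0.10), 0), ((0.30, 0.30), 0),
]

# Manual corrections, keyed by screen name; these always win over the model.
OVERRIDES = {"example_human": False}


def threshold_is_bot(features, cutoff=0.5):
    """Baseline in the spirit of the first version: one ratio, one cutoff."""
    return features[0] > cutoff


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_neuron(rows, epochs=2000, rate=0.5):
    """Fit one logistic neuron with plain batch gradient descent."""
    weights = [0.0] * len(rows[0][0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for features, label in rows:
            z = sum(w * x for w, x in zip(weights, features)) + bias
            error = sigmoid(z) - label
            for i, x in enumerate(features):
                grad_w[i] += error * x
            grad_b += error
        weights = [w - rate * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= rate * grad_b / len(rows)
    return weights, bias


def bot_probability(model, features):
    weights, bias = model
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def is_bot(model, screen_name, features):
    """Check the override table first, then fall back to the trained model."""
    if screen_name in OVERRIDES:
        return OVERRIDES[screen_name]
    return bot_probability(model, features) > 0.5


MODEL = train_neuron(TRAINING)
```

The override table is consulted before the model, so a correction made by hand stays in force however the model is retrained.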

Once I had the NN developed, I quickly knocked up the site and the automatic Twitter identification system and integrated the NN into the results.

I didn’t really have many problems getting the system into its current state. I had a clear idea of what I wanted from the start, and I knew it was highly unlikely that whatever algorithm I created would work 100% of the time, so I am not disappointed when it gets it wrong, especially as it is very easy to override the result.

I needed the system to be completely separate from twollo, so in the very first versions of the code I integrated JSON and CSV output so that I could consume it myself. The obvious extension to this would be to allow other developers to access the service. Twitter has created a basic core service, and it is my opinion that they don’t have the time or inclination to make all the extra services that would enhance the user experience. Whilst I can create a service that detects bots, so can a lot of people. The way I am attempting to give my service value, and to slow down the development of similar tools, is by providing an API with open access to my data.
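As an illustration of serving the same results in both formats, here is a minimal sketch using only the Python standard library. The post does not describe itsabot's actual response format, so the field names `screen_name` and `is_bot` and the sample rows are assumptions made up for this example.

```python
import csv
import io
import json

# Hypothetical field names; the real itsabot output format is not given in the post.
FIELDS = ["screen_name", "is_bot"]


def to_json(results):
    """Serialise a list of result dicts as a JSON array."""
    return json.dumps(results, sort_keys=True)


def to_csv(results):
    """Serialise the same result dicts as CSV with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(results)
    return buffer.getvalue()


# Invented sample rows for illustration only.
RESULTS = [
    {"screen_name": "example_user", "is_bot": False},
    {"screen_name": "example_spam_bot", "is_bot": True},
]

if __name__ == "__main__":
    print(to_json(RESULTS))
    print(to_csv(RESULTS))
```

Keeping the results as plain dictionaries until the last step means one lookup can feed either format, whichever the consuming app prefers.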

Itsabot.com is hosted entirely on Google App Engine (apart from a polling service that I host myself) and is written entirely in Python. I wrote it in roughly two days, but it took longer to train the NN to a point where I was happy with the results.

Now what do you think?

Any other ideas that would help to keep Twitter clean of spammers and useless bots?

Let us know below, or join us on Twitter @twtr_gator

3 Comments
2009 January 6

One thought I have had since playing around with itsabot is that it would benefit from a ‘report spam to twitter’ button/link on the itsabot site for each individual bot found.

2009 January 6

Or maybe a ‘tweet this suspected bot’ link that posts a name-and-shame style tweet. This would make other users aware of named bots, and would get itsabot into the tweet stream and retweeted in a helpful way.

2009 May 23

I just want to say something: “great job”

Update your Twitter randomly according to your interest, or from an RSS feed, or from your own tweet message list, or any combination of the above three: http://feedmytwitter.com
