Mining Twitter Data using Python: Getting Started
Data mining is a hot topic these days, and Twitter is used heavily as a data source in various data mining applications. In this post I will show you how to start mining Twitter data with Python using the Tweepy module.
(I will not include examples using the scientific modules here (for mining, analysing, etc.). This is a basic guide to getting the Twitter API set up.)
Environment Setup
1. Install Python (macOS comes with Python installed)
2. Get a Twitter API key
Go to https://dev.twitter.com/ and sign in to Twitter (create an account if you don't already have one).
Click the profile icon (top left) -> My Applications -> Create New App.
Provide the necessary details, and an application will be created.
Go to the application and click on the API Keys tab.
This will show you the keys you need to authenticate your application using OAuth.
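If you would rather not paste the keys straight into a script, one option is to read them from environment variables. This is only a minimal sketch: the variable names and the load_twitter_keys helper are my own assumptions, not something Twitter or Tweepy defines.

```python
import os

# Hypothetical environment variable names; pick whatever suits your setup.
REQUIRED = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_SECRET",
)


def load_twitter_keys(environ=os.environ):
    """Return a dict with the four OAuth values, or raise if any is missing."""
    missing = [name for name in REQUIRED if not environ.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED}
```

You would then call load_twitter_keys() once at the top of your script and pass the four values to the authentication code.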
3. Install Tweepy
Tweepy is a Python library that supports the Twitter API.
Install on Mac:
pip install tweepy
Ubuntu:
sudo apt-get install python-tweepy
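To confirm that the install worked, you can check from Python whether the module can be found. This is a small sketch assuming Python 3; is_installed is just a helper name I made up.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be imported."""
    return importlib.util.find_spec(module_name) is not None


print("tweepy installed:", is_installed("tweepy"))
```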
Here's the GitHub project: https://github.com/tweepy/tweepy
Now you are ready to read some tweets!
The code to get the Twitter stream (insert your keys into this file):
# imports (written against Tweepy 3.x; StreamListener was removed in Tweepy 4.0)
from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener

# setting up the keys
consumer_key = ''
consumer_secret = ''
access_token = ''
access_secret = ''

class TweetListener(StreamListener):
    # A listener handles tweets as they are received from the stream.
    # This is a basic listener that just prints received tweets to standard output.

    def on_data(self, data):
        print(data)
        return True

    def on_error(self, status):
        print(status)

# printing all the tweets to the standard output
auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)

stream = Stream(auth, TweetListener())
stream.filter(track=['nba'])
This prints the Twitter stream, filtered on the text "nba", to standard output. Each message arrives as a raw JSON string.
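Since each message is a JSON string, you will usually want to parse it instead of printing it whole. Here is a rough sketch of pulling out the author and the text with the standard json module. The summarize_tweet helper and the sample message are made up for illustration; the "text" and "user"/"screen_name" fields are the ones a tweet carries in the stream's JSON.

```python
import json


def summarize_tweet(raw):
    """Return (screen_name, text) for one raw message, or None if it is not a tweet."""
    message = json.loads(raw)
    # The stream also carries non-tweet messages (delete or limit notices, for example).
    if "text" not in message or "user" not in message:
        return None
    return message["user"]["screen_name"], message["text"]


# Made-up sample message, trimmed down to the two fields used here.
sample = '{"text": "Great game tonight #nba", "user": {"screen_name": "example_fan"}}'
print(summarize_tweet(sample))
```

You could call a helper like this from on_data in the listener rather than printing the raw data.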
Getting user info:
import tweepy

# Assumes the key variables from the streaming example above are already defined.
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)
api = tweepy.API(auth)

user = api.get_user(screen_name='sachithwithana')
print(user.screen_name)
This is a basic example to get you set up. Now you are ready to explore the Twitter API.
I would recommend the scikit-learn library for machine learning with Python:
http://scikit-learn.org/stable/
Here's the Tweepy documentation:
http://pythonhosted.org/tweepy/html/
Sachith,
I just started harvesting a Twitter stream. Thank you! I am still learning about computation on graphs, and also considering what kind of statistical models might be cool to implement. Will let you know if/when something comes of it.
Again, thank you.
Chris
Thanks mate!
Yeah, try them out and please let me know if you can :)
You can use scikit-learn if you are going to do any machine learning stuff :)
Nice guidance, thank you very much for saving lots of time
Hi, I was wondering if Tweepy could be installed on Chrome OS through the Python app?
Thank you for this guide, I will use it on my Ubuntu box.
Hi,
I have downloaded Twitter data and saved it as JSON in a .txt file. Just wondering if there is any online help to understand how to clean it up, convert it to a database and use it in R for data mining. I am new to Python.
Magesh