Mining Twitter Data using Python: Getting Started
Data mining is a hot topic these days, and Twitter is heavily used as a data source in various data mining applications. In this post I will show you how to start mining Twitter data with Python using the Tweepy module.
(I will not include scientific module examples here, such as mining or analysing the data. This is a basic guide to getting the Twitter API set up.)
Environment Setup
1. Install Python (macOS comes with Python installed)
2. Get a Twitter API key
Go to https://dev.twitter.com/ and sign in to Twitter (create an account if you don't already have one).
Click the profile icon (top left) -> My Applications -> Create New App
Fill in the required details and Twitter will create the application.
Go to the application -> click on the API Keys tab
This shows you the keys needed to authenticate your application using OAuth.
3. Install Tweepy
Tweepy is a Python library which supports the Twitter API.
Install on Mac:
pip install tweepy
Ubuntu:
sudo apt-get install python-tweepy
Here's the GitHub project: https://github.com/tweepy/tweepy
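Rather than pasting the keys from step 2 directly into a script, one option is to keep them in environment variables. The following is only a minimal sketch and not part of the setup described here: the variable names (TWITTER_CONSUMER_KEY and so on) and the load_twitter_keys helper are hypothetical names chosen for illustration.

```python
import os

# Hypothetical environment variable names; pick whatever suits your setup.
KEY_NAMES = {
    "consumer_key": "TWITTER_CONSUMER_KEY",
    "consumer_secret": "TWITTER_CONSUMER_SECRET",
    "access_token": "TWITTER_ACCESS_TOKEN",
    "access_secret": "TWITTER_ACCESS_SECRET",
}


def load_twitter_keys(environ=None):
    """Return the four OAuth values as a dict, or raise if any are missing."""
    # Allow a plain dict to be passed in, which makes the helper easy to test.
    environ = os.environ if environ is None else environ
    missing = [var for var in KEY_NAMES.values() if not environ.get(var)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(sorted(missing)))
    return {name: environ[var] for name, var in KEY_NAMES.items()}
```

With the variables exported in your shell, keys = load_twitter_keys() gives you keys['consumer_key'] and the other three values to pass to OAuthHandler and set_access_token.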
Now you are ready to read some tweets!
Here is the code to read the Twitter stream (insert your keys into this file):
# Imports
# Note: StreamListener is part of Tweepy 3.x; Tweepy 4.0 merged it into Stream.
from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener

# Set up the keys (paste the values from the API Keys tab)
consumer_key = ''
consumer_secret = ''
access_token = ''
access_secret = ''


class TweetListener(StreamListener):
    # A listener handles tweets as they are received from the stream.
    # This basic listener just prints received tweets to standard output.

    def on_data(self, data):
        # data is the raw JSON string for one stream message
        print(data)
        return True  # returning True keeps the stream connection open

    def on_error(self, status):
        # status is the HTTP status code returned by Twitter
        print(status)


# Print all matching tweets to standard output
auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)
stream = Stream(auth, TweetListener())
stream.filter(track=['nba'])
This prints every tweet in the stream that matches the keyword "nba".
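Each value passed to on_data is a raw JSON string, so the printed output is hard to read. Here is a minimal sketch of pulling out a couple of fields with the standard json module. The summarize_tweet helper is a hypothetical name, the sample payload is made up for illustration, and the "text" and "user"/"screen_name" keys are assumed from the usual tweet JSON layout. Stream messages that are not tweets (delete notices, for example) lack those keys, so the helper returns None for them.

```python
import json


def summarize_tweet(raw):
    """Return '@screen_name: text' for a tweet, or None for other messages."""
    try:
        message = json.loads(raw)
    except ValueError:
        # Not valid JSON (for example a blank keep-alive line)
        return None
    if not isinstance(message, dict):
        return None
    text = message.get("text")
    user = message.get("user")
    if text is None or not isinstance(user, dict) or "screen_name" not in user:
        # Not a tweet payload (for example a delete notice)
        return None
    return "@{}: {}".format(user["screen_name"], text)


# Made-up payload, trimmed to the two fields used here
sample = '{"text": "Great game tonight #nba", "user": {"screen_name": "example_fan"}}'
print(summarize_tweet(sample))  # prints "@example_fan: Great game tonight #nba"
```

Inside TweetListener.on_data you could call summarize_tweet(data) and print the result whenever it is not None.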
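Getting user info:
import tweepy
from tweepy import OAuthHandler

# Reuses the consumer_key, consumer_secret, access_token and access_secret values set earlier
auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)
api = tweepy.API(auth)

# Look up a user by screen name and print it
user = api.get_user(screen_name='sachithwithana')
print(user.screen_name)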
This is a basic example to get you set up. Now you are ready to explore the Twitter API.
I would recommend the scikit-learn library for machine learning with Python:
http://scikit-learn.org/stable/
Here's the Tweepy Documentation:
http://pythonhosted.org/tweepy/html/
Sachith,
I just started harvesting a Twitter stream. Thank you! I am still learning about computation on graphs, and also considering what kind of statistical models might be cool to implement. Will let you know if/when something comes of it.
Again, thank you.
Chris
Thanks mate!
Yeah, try them out and please let me know if you can :)
You can use the scikit-learn if you are going to do any Machine Learning stuff :)
Nice guidance, thank you very much for saving lots of time
Hi, I was wondering if Tweepy could be installed on Chrome OS through the Python app?
Thank you for this guide, I will use it on my Ubuntu box.
Hi,
I have downloaded Twitter data and saved it as JSON in a .txt file. Just wondering if there is any online help to understand how to clean it up, convert it to a database and use it in R for data mining. I am new to Python.
Magesh