George Mason University | University Libraries

Social Media Data and Tools

A guide discussing how to acquire, extract, and use social media data and tools

Twitter Data

Email datahelp@gmu.edu if you need Twitter data. Begin by reading Twitter's Developer Agreement and Policy.

There are two primary ways to access Twitter data:

1. Work with existing Twitter datasets. 

  • Most Twitter datasets you download from a collection will be a spreadsheet that contains only the tweet IDs. To connect these IDs to the original tweets, you previously had to "hydrate" them using the Hydrator desktop application. *Twitter's changes to its API reduce the amount of read-only access, which means that Hydrator is no longer a useful application. Twitter has rescinded the application keys.* As a result, datasets from Tweet ID Data Sets and TweetSets can no longer be hydrated.
    • Tutorial: Programming Historian, Beginner's Guide to Twitter Data
      • *Twitter's changes to its API mean that elements of the lesson will only work for those paying for access to the Twitter API.*
  • Tweet ID Data Sets curated by Documenting the Now
    • The DocNow Catalog is a collectively curated listing of Twitter datasets.
  • TweetSets from George Washington University Libraries
    • TweetSets are Twitter datasets for research and archiving. TweetSets allows researchers to create data queries from existing Twitter datasets created by the TweetSets team.  
    • Tutorial: George Washington University Libraries, Step-By-Step Guide to TweetSets
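
To make "hydration" more concrete: a tweet-ID dataset is just a column of numeric IDs, and hydrating means looking each ID up through the Twitter API to recover the full tweet. The sketch below covers only the preparation step, reading the IDs and grouping them into batches for lookup. The function names are hypothetical, and the batch size of 100 is an assumption based on the per-request limit of the API's tweet lookup endpoint, so check the current API documentation before relying on it. The lookup itself still requires paid API access.

```python
import csv
import io
from typing import Iterable, Iterator, List


def read_tweet_ids(lines: Iterable[str]) -> List[str]:
    """Collect tweet IDs from the first column of a CSV-style file.

    IDs are kept as strings because spreadsheet tools can round
    very large integers. Header rows and blank lines are skipped.
    """
    ids = []
    for row in csv.reader(lines):
        if not row:
            continue  # blank line
        value = row[0].strip()
        if value.isdigit():  # skips headers such as "id"
            ids.append(value)
    return ids


def batch_ids(ids: List[str], size: int = 100) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` IDs (size is an assumption)."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


# Example with an in-memory file; a real dataset would be opened with open().
sample = io.StringIO("id\n1212092628029698048\n\n1212092628029698049\n")
tweet_ids = read_tweet_ids(sample)
batches = list(batch_ids(tweet_ids))
```

Each batch could then be passed to a lookup call by anyone with paid API access.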

2. Apply for a Twitter developer account to get an API key and extract data directly. This is time-consuming: you need to fill out an application and wait for Twitter to respond, and approval is not guaranteed. Basic access to the Twitter API costs $100/month and allows you to retrieve up to 10K tweets/month. Pay for access to the Twitter API at your own risk! Paying the fee does not guarantee that you will be able to retrieve the data you need.

  • Resources for extracting Twitter data after paying for access to their API
    • Tweepy Python library
      • An easy-to-use Python library for accessing the Twitter API. Researchers can install Tweepy from the command line using pip (pip install tweepy) or directly from the GitHub repository.
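
As a rough illustration of what extraction with Tweepy might look like, the sketch below searches recent tweets while staying under the 10K tweets/month cap mentioned above. It is a minimal sketch under several assumptions: the environment variable name TWITTER_BEARER_TOKEN and the helper names are hypothetical, you are expected to track how many tweets you have already retrieved this month yourself, and the endpoints available to you depend on your access tier.

```python
import os

MONTHLY_CAP = 10_000  # basic-tier limit cited in this guide; may change


def remaining_budget(already_retrieved: int, cap: int = MONTHLY_CAP) -> int:
    """Return how many tweets can still be retrieved this month."""
    return max(cap - already_retrieved, 0)


def fetch_recent(query: str, already_retrieved: int = 0, wanted: int = 100):
    """Search recent tweets without exceeding the monthly cap.

    Requires a paid developer account and a bearer token stored in the
    (hypothetical) TWITTER_BEARER_TOKEN environment variable.
    """
    limit = min(wanted, remaining_budget(already_retrieved))
    if limit == 0:
        return []  # budget exhausted: make no API calls

    import tweepy  # imported here so the budget helper works without Tweepy

    client = tweepy.Client(
        bearer_token=os.environ["TWITTER_BEARER_TOKEN"],
        wait_on_rate_limit=True,  # sleep instead of failing on rate limits
    )
    pages = tweepy.Paginator(
        client.search_recent_tweets,
        query=query,
        tweet_fields=["created_at", "lang"],
        max_results=100,  # page size; the endpoint accepts 10 to 100
    )
    # flatten() yields individual tweets and stops after `limit` of them
    return [
        {"id": t.id, "created_at": t.created_at, "text": t.text}
        for t in pages.flatten(limit=limit)
    ]
```

For example, fetch_recent("library lang:en", already_retrieved=9_950) would request at most 50 tweets, because only 50 remain in the monthly budget.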