
Social Media Data and Tools

A guide discussing how to acquire, extract, and use social media data and tools

This page is being updated to reflect the changes Twitter has made to its API.

Twitter Data

Email datahelp@gmu.edu if you need Twitter data. Begin by reading Twitter's Developer Agreement and Policy.

There are two primary ways to access Twitter data:

1. Work with existing Twitter datasets. 

  • Most Twitter datasets you download from a collection will be a spreadsheet that contains only the tweet IDs. To connect these IDs to the original tweets, you need to "hydrate" them, traditionally with the Hydrator desktop application.
    • *Twitter's changes to its API reduce the amount of read-only access, which means that Hydrator is no longer a useful application. Twitter has rescinded the application's keys.*
    • Tutorial: Programming Historian, "Beginner's Guide to Twitter Data"
      • *Twitter's changes to its API mean that parts of the lesson will only work for those paying for access to the Twitter API.*
  • Tweet ID Data Sets curated by Documenting the Now
    • The DocNow Catalog is a collectively curated listing of Twitter datasets.
  • TweetSets from George Washington University Libraries
    • TweetSets are Twitter datasets for research and archiving. Researchers can build their own queries against existing Twitter datasets created by the TweetSets team.
    • Tutorial: George Washington University Libraries, "Step-By-Step Guide to TweetSets"
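Because the datasets above usually contain only tweet IDs, a common first step before any hydration is to read the IDs out of the downloaded spreadsheet and group them into batches. The following is a minimal sketch of that preparation step only, using the Python standard library. The function names are hypothetical, and the batch size of 100 is an assumption about how many IDs a single lookup request accepts; check the current API documentation before relying on it. Retrieving the tweets themselves still requires paid API access.

```python
import csv
from typing import Iterable, Iterator, List

# Assumed per-request limit for ID lookups; confirm against current API docs.
MAX_IDS_PER_LOOKUP = 100


def read_tweet_ids(lines: Iterable[str]) -> List[str]:
    """Collect tweet IDs from the first column of CSV rows.

    Rows whose first cell is not purely numeric (headers, blank lines)
    are skipped. IDs stay as strings so that spreadsheet-style rounding
    of large integers cannot corrupt them.
    """
    ids = []
    for row in csv.reader(lines):
        if row and row[0].strip().isdigit():
            ids.append(row[0].strip())
    return ids


def batch_ids(ids: List[str], size: int = MAX_IDS_PER_LOOKUP) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` IDs."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


# Usage with a hypothetical file name:
# with open("tweet_ids.csv", newline="") as handle:
#     for batch in batch_ids(read_tweet_ids(handle)):
#         print(len(batch), "IDs ready for lookup")
```

Batching matters because each request counts against the monthly allowance, so it helps to know in advance how many requests a dataset will need.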

2. Apply for a Twitter developer account to get an API key and extract data directly. This is time-consuming because you need to fill out an application and wait for Twitter to respond, and approval is not guaranteed. You can pay $100/month for basic access to the Twitter API, which allows you to retrieve up to 10,000 tweets per month.

  • Resources for extracting Twitter data after paying for access to their API
    • Tweepy Python library
      • An easy-to-use Python library for accessing the Twitter API. Researchers can install Tweepy from the command line using pip or directly from the GitHub repository.
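As a rough illustration of the Tweepy workflow described above, the sketch below installs nothing itself (run `pip install tweepy` first) and assumes you already have paid API access with a bearer token. The environment variable name `TWITTER_BEARER_TOKEN` and the helper function names are hypothetical, and the example assumes your access tier permits the recent search endpoint. The budget helper simply applies the 10,000 tweets/month figure quoted in this guide.

```python
import os
from typing import Any, Dict, Iterable, List, Optional

# Monthly retrieval cap for basic access, as quoted in this guide.
MONTHLY_TWEET_CAP = 10_000


def remaining_budget(already_retrieved: int, cap: int = MONTHLY_TWEET_CAP) -> int:
    """Return how many tweets can still be retrieved this month (never negative)."""
    return max(cap - already_retrieved, 0)


def tweets_to_rows(tweets: Optional[Iterable[Any]]) -> List[Dict[str, str]]:
    """Convert tweet objects (anything with .id and .text) into plain dicts.

    Accepts None, because a search with no matches may return no data.
    """
    return [{"id": str(t.id), "text": t.text} for t in tweets or []]


def fetch_recent(query: str, limit: int = 10) -> List[Dict[str, str]]:
    """Search recent tweets with Tweepy and return them as rows.

    Requires network access and a valid bearer token in the
    (hypothetical) TWITTER_BEARER_TOKEN environment variable.
    """
    import tweepy  # imported here so the helpers above work without it

    client = tweepy.Client(bearer_token=os.environ["TWITTER_BEARER_TOKEN"])
    # The recent search endpoint accepts between 10 and 100 results per request.
    max_results = max(10, min(limit, 100))
    response = client.search_recent_tweets(query=query, max_results=max_results)
    return tweets_to_rows(response.data)


# Usage (needs paid access and a token):
# rows = fetch_recent("library data services", limit=20)
# print(len(rows), "tweets;", remaining_budget(len(rows)), "left this month")
```

Keeping the conversion and budget helpers separate from the network call makes it possible to check them without spending any of the monthly allowance.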