George Mason University | University Libraries

Social Media Data

How to acquire, extract, and use social media data

Social media data is often difficult to access, and its availability depends on the time period and platform(s) you need to research.

Before conducting research using data from social media sources, you must read the platform’s terms of service, which is a pseudo-legal agreement that determines how data can be acquired, used, stored, and shared. Many platforms have restrictions on the automated gathering of posts and do not allow web scraping.

You also have to consider the ethics of such research. Legally, researchers do not need to get informed consent because users of social media platforms nominally agree to let third parties use their data when they agree to the platform’s terms of service. However, researchers should consider what the users’ expectations of privacy were when they posted content and whether the content includes identifying or sensitive information.

In some cases it might be acceptable to manually retrieve the data you need, but not when you will be publishing your research. Keep in mind that manually retrieved data would not constitute a scientific or legitimate sample.
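
For contrast, a scientific sample is usually drawn at random from a defined set of posts rather than picked by hand. The following is a minimal Python sketch of that idea, assuming you already hold a list of post identifiers obtained in a way the platform permits. The function name `draw_random_sample` and the placeholder IDs are hypothetical and for illustration only.

```python
import random


def draw_random_sample(post_ids, sample_size, seed=42):
    """Return a reproducible simple random sample of post IDs."""
    if sample_size > len(post_ids):
        raise ValueError("sample_size cannot exceed the number of posts")
    # A fixed seed lets you (and reviewers) reproduce the same sample.
    rng = random.Random(seed)
    # sample() draws without replacement, so no post is selected twice.
    return rng.sample(post_ids, sample_size)


# Hypothetical sampling frame: placeholder IDs, not real posts.
frame = [f"post_{i:04d}" for i in range(1000)]
sample = draw_random_sample(frame, 50)
print(len(sample))
```

Recording the seed and the full list the sample was drawn from makes the selection documentable, which hand-picking posts from a feed does not.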

Remember that social media users are only a small subset of the larger population, and that what is trending on social media is not necessarily representative of what is happening in contemporary society at large. Social media platforms are spaces where people with extreme views can share and propagate those views. Many platforms no longer provide fact checking or place any kind of restriction or warning on sharing misinformation or using hate speech. Bots make up a large share of the "users" on some of these platforms.

If you do not need to create a dataset of social media data, consider using ProQuest Congressional to view social media posts from members of Congress and federal agencies. You cannot download this data in bulk and you cannot scrape the ProQuest interface.

  1. Select news & social media from the top navigation bar
  2. Under government social media and website posts, you can select Twitter (Jul 2014-Dec 2018) and/or Facebook (Mar 2014-Jun 2015)
  3. You can search by originating source: member and/or federal agencies and other non-member sources

Related InfoGuides

See these guides for more help with your digital scholarship project.

  • Learn Python for Data. Resources to learn and use the open source programming environment Python for data science
  • Learn R. Resources to learn and use the open source statistical software R
  • Qualitative Research and Tools. Find tutorials about conducting qualitative research, including resources on methodologies and software
  • Text Analysis Tools. A companion to our Text and Data Mining Sources infoguide, this guide will take you through how to use several text analysis tools
  • Text and Data Mining Sources. Access text and data mining sources and text analysis tools

Data & Digital Scholarship Services staff have created many other guides about digital scholarship.