University Libraries

Text & Data Mining Sources

Access text and data mining sources and text analysis tools.

Accessing APIs

See the Web Scraping page on the Working with Data infoguide for information and resources on what APIs are and how they work. 

If you need assistance working with these APIs, please email datahelp@gmu.edu

Some of this information is taken from MIT Libraries' APIs for Scholarly Resources

CORE

CORE is the world's largest collection of open access research papers. CORE provides an API offering access to metadata and full texts of research papers.

Digital Public Library of America (DPLA)

The Digital Public Library of America (DPLA) brings together the riches of America's libraries, archives, and museums, and makes them freely available to the public. DPLA’s API provides programmatic search and access to every item in the DPLA catalog. 

Elsevier Research Products APIs

Elsevier publishes scientific, technical, and medical information for researchers and healthcare professionals. It offers APIs for many of its research products, including ScienceDirect, Scopus, Engineering Village, Embase, SciVal, PharmaPendium, and Geofacets, as well as a SUSHI service for usage statistics.

Internet Archive

The Internet Archive is a non-profit library of millions of free books, movies, software, music, websites, and more. The Internet Archive allows for bulk download and API access. 

Library of Congress 

The Library of Congress serves as the research arm of Congress. Multiple APIs are available to download bibliographic data and search Library of Congress digital collections, including images, public radio and television, and historic newspapers. 

  • Getting started: Varies by API used. Most APIs do not require an API key. 
  • Result format: Varies by API used
  • Limitations: Not specified. Varies by API used.
  • For more information: LC for Robots
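Because most Library of Congress APIs need no key, a request can be as simple as a URL. The sketch below builds a search URL for the loc.gov JSON interface. It assumes the `https://www.loc.gov/search/` endpoint and its `q` and `fo=json` query parameters; the helper name `loc_search_url` is hypothetical, and other Library of Congress APIs may use different endpoints and formats, so check LC for Robots before relying on it.

```python
from urllib.parse import urlencode

# Assumed endpoint for the loc.gov search interface (not stated in this guide).
LOC_SEARCH = "https://www.loc.gov/search/"


def loc_search_url(query):
    """Build a loc.gov search URL that asks for JSON instead of HTML.

    `fo=json` is the assumed format switch; no API key is included because
    most Library of Congress APIs do not require one.
    """
    params = {"q": query, "fo": "json"}
    return LOC_SEARCH + "?" + urlencode(params)


print(loc_search_url("civil war maps"))
```

The resulting URL can be opened in a browser or fetched with any HTTP client to inspect the response before writing a larger script.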

New York Times

The New York Times is a leading global news source. The New York Times Developer Portal provides several APIs for mining New York Times publication data.

  • Getting started: You need to create an account, register an app, and access the API keys. 
  • Result format: JSON. Some APIs return other formats.
  • Limitations: 4,000 requests per day and 10 requests per minute
  • For more information: New York Times Developer FAQs
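A script that loops over many queries can stay within the limits listed above by pausing itself before each request. The following is a minimal sketch of a client-side limiter, not part of any New York Times library: the class name `RateLimiter` and its parameters are hypothetical, and it only tracks requests made by the current process.

```python
import time
from collections import deque


class RateLimiter:
    """Block until a request fits within every (max_calls, period_seconds) limit."""

    def __init__(self, limits, clock=time.monotonic, sleep=time.sleep):
        self.limits = list(limits)
        self.clock = clock      # injectable so the limiter can be tested without waiting
        self.sleep = sleep
        self.calls = deque()    # timestamps of past requests, oldest first

    def wait(self):
        """Sleep as long as needed, then record one request."""
        longest = max(period for _, period in self.limits)
        while True:
            now = self.clock()
            # Forget requests older than the longest window.
            while self.calls and now - self.calls[0] >= longest:
                self.calls.popleft()
            delay = 0.0
            for max_calls, period in self.limits:
                recent = [t for t in self.calls if now - t < period]
                if len(recent) >= max_calls:
                    # Wait until enough old requests leave this window.
                    delay = max(delay, period - (now - recent[-max_calls]))
            if delay <= 0:
                self.calls.append(now)
                return
            self.sleep(delay)


# Limits from the list above: 10 requests per minute and 4,000 per day.
nyt_limiter = RateLimiter([(10, 60), (4000, 24 * 60 * 60)])
```

Calling `nyt_limiter.wait()` immediately before each API request keeps a single script under both limits. Requests made from other scripts or machines with the same API key are not counted, so leave some headroom if the key is shared.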

PLOS

PLOS is a nonprofit open-access publisher of science, technology, and medicine research. The PLOS Search API gives developers access to rich data that can be flexibly integrated into applications for the web, desktop, or mobile devices.

  • Getting started: Create a PLOS Journals account and then an API key will be included as a part of your user profile. 
  • Result format: XML
  • Limitations: 7,200 requests per day, 300 per hour, and 10 per minute. Allow 5 seconds for your search to return results.
  • For more information: PLOS API FAQ
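The three PLOS limits can be reduced to one steady pace. The daily and hourly limits both work out to one request every 12 seconds, which is stricter than the per-minute limit of one every 6 seconds. The sketch below does that arithmetic and spaces out a list of queries accordingly; the names `min_interval` and `paced` are hypothetical helpers, not part of any PLOS tool.

```python
import time

# Limits as listed above: (max requests, window in seconds).
PLOS_LIMITS = [(7200, 24 * 60 * 60), (300, 60 * 60), (10, 60)]


def min_interval(limits):
    """Smallest constant gap, in seconds, that satisfies every limit."""
    return max(window / max_requests for max_requests, window in limits)


def paced(queries, interval, sleep=time.sleep):
    """Yield queries one at a time, pausing `interval` seconds between them."""
    for index, query in enumerate(queries):
        if index > 0:
            sleep(interval)
        yield query


interval = min_interval(PLOS_LIMITS)
print(interval)  # 12.0
```

Looping with `for query in paced(my_queries, interval):` and sending one request per iteration keeps a long-running job at the allowed pace. Adding a small margin to the interval, for example 13 seconds, avoids landing exactly on a limit boundary.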

Springer Nature

Springer Nature is a leading global scientific publisher of books and journals. Springer Nature has created multiple APIs that let developers access its freely available content for noncommercial use.