
Working with Data

What you need to know for Data Management and Data Wrangling

Overview

If you come across a website displaying data you would like to use in your own analysis:

  1. Does the website allow you to download the data in a format like XML or CSV?
    • If not, can you find the data you need elsewhere (a Google search, government agencies, international organizations, the library, etc.)?
  2. Does the website offer an API? This may not be immediately obvious and might require a bit of research.
  3. If neither of those options is available, you might consider web scraping.
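
If the first option works out and the site offers a CSV download, reading it takes only a few lines. The sketch below is a minimal illustration using Python's standard library; the column names and values are invented, and the text is held in a string so the example runs without a real file.

```python
import csv
import io

# Stand-in for a downloaded CSV file. The columns and values are
# invented for illustration only.
csv_text = "country,year,value\nCA,2020,10\nUS,2020,20\n"

# DictReader uses the first line as the header and yields one dict per row.
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Every field comes back as a string, so convert numbers explicitly.
values = [int(row["value"]) for row in rows]

print(rows[0]["country"], values)
```

With a real download, replace `io.StringIO(csv_text)` with a file opened as `open("data.csv", newline="")`, where `data.csv` is a placeholder for whatever file you saved.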

Both APIs and web scraping have two parts:

  1. Make the request: specify a URL (yes, a normal URL)
    • For web scraping, it is the same URL you would use in a web browser (the server “returns” an HTML file) – Easier
    • For APIs, the URL points to the API's endpoint and carries key-value parameters specifying what you want – Harder
  2. Process the response: save the file you get and extract the data
    • For web scraping, you receive an HTML file containing the page content, which must be parsed to extract the data – Harder
    • For APIs, you receive a file in a structured format (often XML or JSON), which gives clean, easy-to-access data – Easier
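
The two parts above can be sketched in Python using only the standard library. Everything specific here is an assumption made for illustration: the `example.org` URLs, the parameter names (`country`, `year`), and the sample responses, which are hard-coded with invented values so the sketch runs without a network connection. A real project would fetch the responses over HTTP and would more likely use a dedicated HTML parsing library.

```python
import json
from html.parser import HTMLParser
from urllib.parse import urlencode

# Part 1: make the request. Both URLs are hypothetical placeholders.
scrape_url = "https://example.org/data.html"  # same URL as in a browser
api_url = "https://example.org/api/data?" + urlencode(
    {"country": "CA", "year": 2020}  # keys and values say what you want
)

# Part 2: process the response. These sample responses are invented.
api_response = '{"data": [{"country": "CA", "year": 2020, "value": 10}]}'
html_response = """
<html><body>
<table>
<tr><th>country</th><th>year</th><th>value</th></tr>
<tr><td>CA</td><td>2020</td><td>10</td></tr>
</table>
</body></html>
"""

# API: the JSON parses directly into Python lists and dicts.
api_rows = json.loads(api_response)["data"]


class TableParser(HTMLParser):
    """Collect the text of every table cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th"):
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
        elif tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        # Keep text only when it sits inside a cell of an open row.
        if self._in_cell and self._row is not None:
            self._row.append(data.strip())


# Web scraping: walk the HTML and rebuild the table by hand.
parser = TableParser()
parser.feed(html_response)
header, *body = parser.rows
scraped_rows = [dict(zip(header, row)) for row in body]

print(api_url)
print(api_rows)
print(scraped_rows)
```

The contrast matches the list above: the scraping URL needs no construction but the HTML takes a custom parser to unpack, while the API URL needs parameters but the JSON response loads in one line. The scraped values also arrive as plain strings, whereas the JSON keeps numbers as numbers.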

The Basics

Go through these tutorials before we meet up.

Learn More

APIs

Web Scraping