University Libraries

Learn R

Resources to learn and use the open-source statistical software R (R Project)

Choose one of the following packages for data manipulation:

  • dplyr / tidyr (tidyverse) - most popular, our recommended choice
  • data.table – powerful and flexible referencing, with shorthand notation for programmers: mydata[rows, columns, by]
    • Example (changes the state value "Virginia" to "VA"):  mydata[state == "Virginia", state := "VA"]
    • Preferred by data scientists and those who work with big data (> 1 million records) (comparison with dplyr)
    • Learn to work with large dataset in R by Analytics Vidhya
  • sqldf – lets you use SQL on R data frames. Use this only if you don't intend to learn R. If you just want to use data that is stored in a database, you can use dplyr with the dbplyr and DBI packages.
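
The data.table shorthand mentioned above can be tried with a short sketch like the following. The example data and its column names (state, year, sales) are made up for illustration, and the data.table package is assumed to be installed.

```r
library(data.table)

# Hypothetical example data; the column names are assumptions for illustration.
mydata <- data.table(
  state = c("Virginia", "Virginia", "Ohio", "Ohio"),
  year  = c(2020, 2021, 2020, 2021),
  sales = c(10, 20, 30, 40)
)

# Update by reference: rows where state is "Virginia" get state set to "VA".
mydata[state == "Virginia", state := "VA"]

# General form mydata[rows, columns, by]:
# keep rows from 2020 onward, compute total sales, grouped by state.
totals <- mydata[year >= 2020, .(total_sales = sum(sales)), by = state]
print(totals)
```

Note that := changes mydata in place, so no assignment back to mydata is needed.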

The Tidyverse

The Tidyverse: dplyr, tidyr and many more

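  • dplyr is used to manipulate a dataset (filter rows, select or add columns, summarise by group) and to combine datasets with joins
  • tidyr is used to restructure a dataset, for example between wide and long formats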
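
As a rough illustration of how the two packages fit together, here is a minimal sketch. The data and the names sales_wide, sales_long, state_totals and regions are hypothetical, and reasonably recent versions of dplyr and tidyr (1.0 or later) are assumed.

```r
library(dplyr)
library(tidyr)

# Hypothetical example data in "wide" form: one column per year.
sales_wide <- tibble(
  state  = c("VA", "OH"),
  `2020` = c(10, 30),
  `2021` = c(20, 40)
)

# tidyr: restructure from wide to long (one row per state and year).
sales_long <- sales_wide %>%
  pivot_longer(cols = c(`2020`, `2021`), names_to = "year", values_to = "sales")

# dplyr: manipulate the dataset (filter rows, then summarise by group).
state_totals <- sales_long %>%
  filter(sales > 10) %>%
  group_by(state) %>%
  summarise(total_sales = sum(sales), .groups = "drop")

# dplyr: add columns from a second (hypothetical) lookup table with a join.
regions <- tibble(state = c("VA", "OH"), region = c("South", "Midwest"))
state_totals <- left_join(state_totals, regions, by = "state")
print(state_totals)
```

Here pivot_longer() does the restructuring, while filter(), group_by(), summarise() and left_join() come from dplyr.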