University Libraries

Software for Digital Scholarship

Information about DiSC-supported software for the collection, processing, analysis or display of numeric, text, or geospatial data

Versatile Software

These tools offer a wide range of data functions, including data cleaning and summary statistics; many also provide statistical analyses and visualizations.

Spreadsheet Software

  • Spreadsheets (Microsoft Excel, Google Sheets, OpenOffice Calc)
    • These general-purpose spreadsheet programs are flexible and familiar. Although they can handle many other tasks, they are rarely the best tool for them, nor do they encourage best practices. Use Pivot Tables and/or PowerQuery (Excel) to help maintain data integrity and replicability.
  • PowerBI (Microsoft)
    • Free for everything except sharing dashboards, this offshoot of Microsoft Excel is made for tidy tabular data. It combines the data cleaning ability of PowerQuery with visualizations better suited to academia and business than Pivot Charts.
  • OpenRefine
    • Formerly known as Google Refine, this powerful free and open-source data cleaning software runs in a local browser window (i.e., data stays on your own computer). With faceting, clustering, and an underlying expression language that makes cleaning steps replicable, it beats PowerQuery for exploring and fixing particularly messy data. Notably absent is the ability to merge (add variables), but appending (adding rows) is seamless.
  • Tableau
    • Free for academic projects and teaching, this otherwise expensive business software focuses on providing beautiful and interactive visualizations and dashboards. With the newer Prep software, it can handle additional data cleaning tasks. 
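The OpenRefine entry above distinguishes merging (adding variables) from appending (adding rows). As a rough illustration of that distinction only, here is a minimal sketch in plain Python. The table contents, field names, and helper functions (`append_rows`, `merge_lookup`) are hypothetical and are not part of any of the tools listed.

```python
# Hypothetical example tables; the field names are invented for illustration.
survey_2022 = [
    {"id": 1, "age": 34},
    {"id": 2, "age": 51},
]
survey_2023 = [
    {"id": 3, "age": 27},
]
# Hypothetical lookup table keyed by id, holding one extra variable.
regions = {1: "North", 2: "South", 3: "East"}


def append_rows(*tables):
    """Appending stacks tables that share columns: more rows, same variables."""
    combined = []
    for table in tables:
        combined.extend(dict(row) for row in table)  # copy each row
    return combined


def merge_lookup(rows, lookup, key, new_field):
    """Merging adds a variable by matching each row's key against another table."""
    return [{**row, new_field: lookup.get(row[key])} for row in rows]


appended = append_rows(survey_2022, survey_2023)
merged = merge_lookup(appended, regions, key="id", new_field="region")

print(len(appended))  # number of rows after appending
print(merged[0])      # first row now carries the extra "region" variable
```

Appending leaves the set of columns unchanged, while merging leaves the number of rows unchanged. Which one you need determines which of the tools above is the better fit.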

Programming Languages

R vs Python (Medium)

  • R / RStudio
    • This statistical language can be easier for non-programmers to learn than Python and is the best choice if you are almost always working with tabular data (Python tends to be better for non-tabular data, but either can do the basics). 
  • Python
    • This full-fledged programming language can handle almost any task with any type of data and is far easier to learn than most other general-purpose languages. However, it can be finicky and has a steeper learning curve than the other tools listed here.
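To give a sense of what "the basics" look like in code, the following is a minimal sketch of data cleaning followed by summary statistics, using only Python's standard library. The sample values, the list of missing-value markers, and the `clean_numbers` helper are assumptions made up for this illustration; real projects would more likely use a dedicated data library.

```python
import statistics

# Hypothetical messy column, as it might arrive from a spreadsheet export.
raw_heights = [" 172", "180 ", "n/a", "", "165.5", "1.7e2"]


def clean_numbers(values, missing=("", "n/a", "na")):
    """Strip whitespace, drop missing-value markers, and convert to float."""
    cleaned = []
    for value in values:
        text = value.strip().lower()
        if text in missing:
            continue  # skip entries that carry no measurement
        cleaned.append(float(text))
    return cleaned


heights = clean_numbers(raw_heights)

# Basic summary statistics on the cleaned values.
summary = {
    "n": len(heights),
    "mean": statistics.mean(heights),
    "median": statistics.median(heights),
    "stdev": statistics.stdev(heights),
}
print(summary)
```

Keeping the cleaning step in a function, rather than editing cells by hand, is what makes the work replicable: rerunning the script on a fresh export repeats exactly the same steps.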

Working with Data InfoGuide

This guide teaches basic skills in data cleaning and data wrangling, including how to use a computer.