George Mason University InfoGuides

Software: Learn Python for Data

Resources for learning and using Python, an open-source programming language, for data science.

The following books (and their associated materials) are Open Educational Resources (OER), available for free to be used, shared, and even remixed.

Python for Everybody

See all OER materials, including video, audio, lecture slides and handouts, sample code, and the GitHub repository.

Think Python

Learn about programming concepts like iteration and classes, using Python. Used by introductory classes at Mason. Originally called "How to Think Like a Computer Scientist".

Learn Python the Right Way

Uses the online IDE Replit, so no local Python installation is needed, and adds projects for practice. Available in Markdown on GitHub.

OER Classes + Lectures

Open Courses

These tend to be computer science classes that happen to use Python, so they go into more depth than many data analysts need. They are still wonderful classes.

Books on Data Science

Data Science

These books focus on data management, and sometimes analysis. 

Has sections on NumPy, pandas, and Matplotlib. Covers data management and exploration (not statistical modeling or testing). The Jupyter notebooks that make up the entire book are available at the author's GitHub repository.

Introduction to exploratory data analysis, as it tends to be done in the social and health sciences. Covers NumPy, pandas, SciPy, Matplotlib, and some statsmodels for regression and time series.

Code is provided in both R and Python and is available on GitHub. Assumes familiarity with Python and statistics. Covers exploratory data analysis, sampling distributions, significance testing, regression, classification, and both supervised and unsupervised learning. Uses scikit-learn, supplemented by statsmodels.