Learn Python for Data

Resources for learning and using Python, an open-source programming language, for data science.

General Statistics Charts

Unlike R, where pretty much everybody uses and loves ggplot2 (and its add-on packages), Python has several competing options for graphics.

Static Graphing

  • matplotlib - The classic Python graphing library, with fine-grained control over every aspect of a chart. As the oldest of the libraries listed here, it is the most widely used and has tons of documentation and tutorials. However, newer options tend to be easier to use and produce nicer-looking graphics.
  • pandas - Strictly for pandas data structures such as DataFrames, the built-in plotting methods allow super-concise creation of standard graphics, using matplotlib in the background. The other libraries listed here are also compatible with pandas.
  • seaborn - Built on matplotlib to make exploratory data analysis simpler and prettier, seaborn may still require using matplotlib directly for some tasks. That compatibility has also meant that ease of use is not ideal. But, after many years, a new interface is coming which may further increase its popularity.
  • plotnine - ggplot2 users switching from R will feel right at home with this near-identical library, whereas native Python users may dislike its non-Pythonic syntax. Because it is also built upon matplotlib, the graphs look similar and thus are not as pretty as those of the alternatives.
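
To make the difference between matplotlib's fine-grained control and pandas' concise plotting concrete, here is a minimal sketch that draws the same line chart both ways. The data and column names (`year`, `visits`) are invented for illustration, and the `Agg` backend is chosen only so the script runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; an assumption so this runs headless

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy data, invented purely for illustration.
df = pd.DataFrame(
    {"year": [2019, 2020, 2021, 2022], "visits": [120, 95, 130, 160]}
)

# matplotlib: create the figure and axes explicitly, then set each element.
fig, ax = plt.subplots(figsize=(5, 3))
ax.plot(df["year"], df["visits"], marker="o")
ax.set_xlabel("year")
ax.set_ylabel("visits")
ax.set_title("Drawn with matplotlib")

# pandas: a single method call that builds a matplotlib Axes behind the scenes.
ax2 = df.plot(x="year", y="visits", marker="o", title="Drawn with pandas")

# Either chart could then be written to disk, e.g. fig.savefig("chart.png").
```

Because `df.plot` returns a regular matplotlib Axes, the pandas shortcut can be a starting point, with matplotlib calls added afterwards for any fine-tuning.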

 

Interactive Graphing

These packages were designed to make interactive plots that embed in web apps by exporting as HTML and JavaScript. In most cases, they can also be used to create static graphs.

  • bokeh - A good all-around choice. [Bokeh Gallery]
  • plotly - Available for multiple languages, including Python and R, Plotly also offers various interfaces (including an online chart builder) and grammars (the easy Plotly Express vs. the full Graph Objects). Thus, it could be a good choice for those who need maximum flexibility and customization. It is particularly good at 3D graphs.
  • Altair - Although limited to small datasets (< 5k rows), Altair still has a devoted following because of its easy syntax and pretty output. [Altair Gallery]

Comparisons and Examples

Special Graphs

Geospatial & Mapping

See also the Geospatial & GIS Infoguide

Network Graphs

  • pgmpy - Bayesian Networks and other Probabilistic Graphical Models.
  • igraph - Network graphs
  • networkx - Network graphs, particularly good for complex networks. 
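
For a sense of what working with a network library looks like, here is a minimal networkx sketch that builds a small graph, computes two common measures, and draws it with matplotlib. The node names and edges are invented for illustration, and the `Agg` backend is assumed only so the script runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; an assumption so this runs headless

import matplotlib.pyplot as plt
import networkx as nx

# Hypothetical toy network, invented purely for illustration.
G = nx.Graph()
G.add_edges_from(
    [("Ana", "Ben"), ("Ana", "Cy"), ("Ben", "Cy"), ("Cy", "Dee")]
)

# Degree centrality: share of the other nodes each node is directly linked to.
centrality = nx.degree_centrality(G)

# Shortest path between two nodes, returned as a list of node names.
path = nx.shortest_path(G, source="Ana", target="Dee")

# Compute a reproducible layout, then draw with matplotlib.
pos = nx.spring_layout(G, seed=42)
fig, ax = plt.subplots(figsize=(4, 4))
nx.draw(G, pos=pos, ax=ax, with_labels=True)
```

In this toy graph, `Cy` is connected to every other node, so it has the highest degree centrality and sits on the shortest path from `Ana` to `Dee`.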

Dashboards

  • Streamlit
    • Robust and easy-to-use dashboard and data visualization library.
    • Does not require much HTML or CSS knowledge.
    • Pythonic and intuitive syntax.
  • Dash
    • Dashboard library built on top of Plotly.
    • Some HTML and CSS knowledge is recommended.
    • Syntax can be more difficult than Streamlit.