
Software for Digital Scholarship

Information about DiSC-supported software for the collection, processing, analysis or display of numeric, text, or geospatial data

Free Text Mining Tools

The following web-based tools, software programs, and programming languages are used for text analysis. 

  • AntConc. A free corpus analysis toolkit for concordancing and text analysis.
  • Concordle. Creates word clouds based on the user's corpus.
  • Lexos. Allows users to upload a corpus, clean the documents, and then visualize or analyze them. Uploaded files are limited by size and type.
  • MALLET. Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text.
  • Python. A programming language widely used for text mining and analysis. The Programming Historian has a series of lessons on using Python for manipulating and analyzing text data.
  • R and RStudio. Open-source statistical analysis software that relies on community-driven packages to mine data. R is script-heavy, so a programming background is highly recommended, but it offers the most flexibility for mining text and for visualizing the results. See the book Text Mining with R and The Programming Historian's "Basic Text Processing in R."
  • Text Analysis Portal for Research (TAPoR). A portal for discovering research tools for studying texts, with detailed search and curated lists.
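
For readers new to these tools, the following is a minimal Python sketch of two operations mentioned above: counting word frequencies (the data behind a word cloud) and producing keyword-in-context concordance lines. It uses only the standard library. The function names and the sample sentence are made up for illustration, and the code is not meant to reflect how AntConc, Concordle, or any other listed tool is implemented.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(text, top_n=5):
    """Return the top_n most common tokens with their counts."""
    return Counter(tokenize(text)).most_common(top_n)


def concordance(text, keyword, width=2):
    """Return keyword-in-context lines with `width` tokens on each side of a hit."""
    tokens = tokenize(text)
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{token}] {right}".strip())
    return lines


# Hypothetical sample text, used only to demonstrate the two functions.
sample = "The cat sat on the mat. The dog chased the cat."
print(word_frequencies(sample, top_n=2))
for line in concordance(sample, "cat"):
    print(line)
```

A real project would usually read the corpus from files and remove common stop words such as "the" before counting. The Programming Historian lessons mentioned above cover those steps in more depth.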