Text mining and analysis is used to identify major trends across a large number of documents. It is performed by using software or a programming language (e.g., Python) to analyze a corpus of text for patterns such as word usage or vocabulary changes over time. One example is counting the number of times a specific word appears across all of Shakespeare's plays. Instead of reading through every play and counting by hand, a computer can perform the same task in a fraction of the time.
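The word-counting example above can be sketched in a few lines of Python. This is a minimal illustration, not a full workflow: the `corpus` dictionary is a hypothetical stand-in holding one short quotation per play (in practice each value would be the full text, loaded from files), and the helper names `tokenize` and `count_word` are made up for this sketch.

```python
import re
from collections import Counter

# Hypothetical stand-in corpus: each value would normally be a play's full text.
corpus = {
    "hamlet": "To be, or not to be, that is the question.",
    "as_you_like_it": "All the world's a stage, and all the men and women merely players.",
}

def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())

def count_word(corpus, word):
    """Return (per-document counts, total count) for `word`, ignoring case."""
    target = word.lower()
    per_doc = {name: Counter(tokenize(text))[target] for name, text in corpus.items()}
    return per_doc, sum(per_doc.values())

per_doc, total = count_word(corpus, "be")
print(per_doc, total)
```

Keeping the per-document counts alongside the total makes it possible to compare usage between texts, which is often the starting point for a trend analysis.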
Plan. The first thing to do before starting any text mining project is to plan what your final product will be. The final product is typically framed by a research question that lends itself to a particular tool or visualization.
Clean. Cleaning and parsing your texts before uploading them to any tool will help to streamline the process.
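What cleaning involves depends on the project and the tool, but a common first pass can be sketched in Python as follows. The specific steps here (dropping HTML-style tags, lowercasing, removing punctuation, collapsing whitespace) are assumptions chosen for illustration, and `clean_text` is a hypothetical helper name; some analyses will want to keep case or punctuation.

```python
import re
import unicodedata

def clean_text(raw):
    """Apply a simple, assumed cleaning pass to one document."""
    # Normalize Unicode compatibility forms (e.g., ligatures) to plain characters.
    text = unicodedata.normalize("NFKC", raw)
    # Treat curly apostrophes as straight ones so contractions stay intact.
    text = text.replace("\u2019", "'")
    # Drop anything that looks like an HTML or XML tag.
    text = re.sub(r"<[^>]+>", " ", text)
    text = text.lower()
    # Replace every character other than letters, digits, apostrophes, and whitespace.
    text = re.sub(r"[^a-z0-9'\s]", " ", text)
    # Collapse runs of whitespace into single spaces.
    return re.sub(r"\s+", " ", text).strip()

print(clean_text("<p>To be,  or NOT to be!</p>\n"))
```

Running the same function over every document before analysis keeps counts consistent, since "Be", "be," and "be" are then treated as the same word.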
Limited Access Data Sets lists
Citing Data provides detailed information and examples on how you should cite data in your research.
Data Visualization offers guidelines and best practices for visualizing data with a wide range of tools.
Find Data for Analysis: Data for Practice & Projects links to resources that include text corpora among their list of data sources.
Research Data Management Basics covers best practices for managing your data and keeping organized.
Software for Digital Scholarship: see R, NVivo, and QDAMiner.