Research Guides: Text and Data Mining: Home

Home

What is Text and Data Mining?

Text and data mining (TDM) is the automated analysis and extraction of information from large numbers of documents or data sets, and it is particularly valuable for unstructured data. Information in this guide intersects with concepts from basic programming, machine learning, and statistical computing, and is often discussed in the context of data science. Scholars across disciplines, including the humanities, social sciences, and physical sciences, employ mining techniques. Please use this guide to find information on licensed content and other datasets, essential tools, training, and helpful resources.
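
To make the idea of extracting information from unstructured text concrete, here is a minimal sketch of one of the simplest mining steps, counting term frequencies across a small set of documents. The sample sentences, the stopword list, and the function names are all illustrative assumptions, not part of any licensed collection or recommended tool; real projects usually rely on dedicated libraries and much larger corpora.

```python
import re
from collections import Counter

# Hypothetical sample documents; a real project would load a licensed or open corpus.
documents = [
    "Text mining extracts information from unstructured text.",
    "Data mining finds patterns in large data sets.",
    "Mining text and data supports research across disciplines.",
]

# A tiny, illustrative stopword list (assumption); real lists are much longer.
STOPWORDS = {"and", "from", "in", "across", "the", "of"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_frequencies(docs):
    """Count how often each non-stopword term appears across all documents."""
    counts = Counter()
    for doc in docs:
        counts.update(t for t in tokenize(doc) if t not in STOPWORDS)
    return counts


frequencies = term_frequencies(documents)
print(frequencies.most_common(3))
```

Counting terms is only a starting point, but the same pattern of tokenizing, filtering, and aggregating underlies many of the techniques covered in this guide.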


Gender Composition of Scholarly Publications
This study conducted by researchers for the Eigenfactor at the University of Washington looks at the differences in authorship by gender within disciplines. The study was conducted using the JSTOR corpus.
Robots Reading Vogue
This joint project by Lindsay King and Peter Leonard at Yale University shows data mining using the Vogue Archives from ProQuest. * Our license agreement with ProQuest does not allow for TDM. Please contact your librarian for more information.
Data Mining Reveals the Six Basic Emotional Arcs of Storytelling
Scientists at the Computational Story Laboratory have analyzed novels to identify the building blocks of all stories.
Six Degrees of Francis Bacon
Researchers at Carnegie Mellon have used data mining to find relationships between prominent people of early modern England and present them in a visual representation.

Tennessee GIS Data and Resources
by Eric Arnold. Last updated Nov 21, 2025.

Text and Data Mining

Need help?

Need help getting started on text and data mining?

Contact our data services team at dataservices@utk.edu.
