Research Guides: Text and Data Mining: Data Sources

About Data Sources

Current UT students, staff, and faculty can access licensed data and datasets through UT Libraries. They can also text and data mine licensed content from selected content providers. Free data sources and APIs are also available on the web. Please explore these sources to see what fits your needs. Contact a Librarian for additional help!

Data Sources

China Census Datasets
China Data Online

Users of screen readers and keyboard navigation may require assistance with this resource. Please contact eproblems@utk.edu for help.
Cross-National Time-Series Data Archive
SageData

Screen reader users should avoid using an Android or iOS device on this website.
EventStatus Corpus

You will need to register for a Guest User account with LDC. Make sure to select our Organization: University of Tennessee, Knoxville. Library administration will approve the account. Then, you can access the data directly through LDC.
ICMA Survey Datasets : County form of government
Researcher's dataset (ICRG T3B)
International Terrorism : Attributes of Terrorist Events
Roper Center for Public Opinion Research

If Roper Center content is inaccessible to you, please contact eproblems@utk.edu to request an accessible alternative format.
Users of keyboard navigation or of swipe/gesture navigation on Android/iOS may require assistance with this resource. Please contact eproblems@utk.edu.
SAGE Research Methods. Datasets

If the content of a SAGE Research Methods article is inaccessible to you, please contact eproblems@utk.edu to request an accessible alternative format.
Users of swipe/gesture navigation should avoid using an iPad/iOS device on this website.
Social Explorer
Sociometrics
Foreign Trade Data

The official source for U.S. export and import statistics.
2011 data also available as zip files.
U.S. Historical Datasets

Data must be requested through UT Libraries. We will facilitate access through Data Axle within 1 week of your request.
Users of keyboard navigation or screen reader software may require assistance with this resource. Please contact eproblems@utk.edu for help.
Wharton Research Data Services
Screen reader users or users of keyboard navigation may require assistance and should avoid using Firefox on this website. Please contact eproblems@utk.edu for assistance.

University of Tennessee, Knoxville Libraries Licensed Content

The resources listed on this page may be text and data mined for academic scholarship or educational purposes. The list is organized by vendor/platform based on our UT license agreements with the vendor or publisher.

If you do not see a resource listed here, please contact us and we can investigate further. We will need time to review the license agreement and terms of use, so please plan accordingly. Carrying out automated text and data mining on a database in violation of its terms of use also violates the University Libraries Electronic Resources Use Policy.

TDM Permitted Content

Adam Matthew

Permission provided for non-commercial educational and scholarly TDM from our License Agreement with Adam Matthew. Adam Matthew requires that a permission form be completed and submitted before mining begins.

TDM Statement

Example Permission Form

Contact: info@amdigital.co.uk

Cambridge University Press

Permission provided for non-commercial educational and scholarly TDM from Cambridge University Press's Terms of Use.

Rights and Permissions

Contact: directcs@cambridge.org

Clarivate Analytics

Web of Science's production team can create a custom dataset based on specified variables for a fee (contact a librarian for additional help). Within Web of Science, you can use the Analyze tool to analyze a subset of data in the interface.

Clarivate "API Expanded"

Elsevier

Permission provided for non-commercial educational and scholarly TDM from our License Agreement with Elsevier and Elsevier's TDM Policy. Elsevier has a Developers Portal where you register for an API key. After registration, you must request elevated privileges to receive full access to their data. The rate limit is 20,000 records per week.

TDM Policy

Developers Portal

TDM Registration Forms
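
If you script your own retrieval, it can help to track how many records you have pulled so that you stay under the 20,000 records per week limit. The following is a minimal sketch only. The class name `WeeklyQuota` and its methods are hypothetical, it does not call any Elsevier API, and it assumes the limit is counted over a rolling seven-day window, which the guide does not state. Check Elsevier's TDM Policy for how the limit is actually measured.

```python
from collections import deque
from datetime import datetime, timedelta

WEEKLY_LIMIT = 20_000          # records per week, as stated in this guide
WINDOW = timedelta(days=7)     # assumption: rolling 7-day window


class WeeklyQuota:
    """Local bookkeeping of records retrieved in the last 7 days (hypothetical helper)."""

    def __init__(self, limit=WEEKLY_LIMIT, window=WINDOW):
        self.limit = limit
        self.window = window
        self._log = deque()    # (timestamp, record_count) pairs, oldest first

    def _prune(self, now):
        # Drop entries that have aged out of the window.
        while self._log and now - self._log[0][0] >= self.window:
            self._log.popleft()

    def remaining(self, now=None):
        now = now or datetime.now()
        self._prune(now)
        return self.limit - sum(count for _, count in self._log)

    def record(self, count, now=None):
        # Call this before each request; it refuses requests that would exceed the limit.
        now = now or datetime.now()
        if count > self.remaining(now):
            raise RuntimeError("Request would exceed the weekly record limit")
        self._log.append((now, count))
```

A script would call `quota.record(n)` before requesting `n` records and pause or stop when it raises.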

Emerald

Permission provided for non-commercial educational and scholarly TDM from our License Agreement with Emerald Publishing. Emerald asks that you notify them before conducting any TDM activities on www.emerald.com/insight. Advance notice lets Emerald manage server capacity so that you can complete your work without technical obstacles while access is maintained for all Emerald users.

TDM License

Contact: permissions@emeraldinsight.com

Gale

Permission provided for non-commercial educational and scholarly TDM from our License Agreement with Gale.

JSTOR

TDM of JSTOR content is permitted, with some restrictions, under our License Agreement. Data for Research (https://www.jstor.org/dfr/) is JSTOR's TDM service. Datasets must be requested through JSTOR and are processed by JSTOR. Datasets are free and may include data for up to 25,000 documents. See their site for more information on creating datasets, specifications, requests, and sample datasets.

Dataset Services

Sample Datasets

Dataset Request Form

JSTOR Data for Research

Technical Specifications
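
If your corpus is larger than the 25,000-document cap, one way to plan requests is to split your document list into batches that each fit within the cap. The sketch below is illustrative only. The function name `batch_documents` is hypothetical, it does not interact with JSTOR, and it assumes that submitting several separate dataset requests is acceptable, which you should confirm with JSTOR or a librarian.

```python
from itertools import islice

MAX_DOCS_PER_DATASET = 25_000  # per-dataset cap stated in this guide


def batch_documents(doc_ids, size=MAX_DOCS_PER_DATASET):
    """Yield successive lists of at most `size` document identifiers."""
    iterator = iter(doc_ids)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch
```

For example, `[len(b) for b in batch_documents(range(60_000))]` gives `[25000, 25000, 10000]`, so a 60,000-document list would need three requests.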

Oxford University Press

Permission provided for non-commercial educational and scholarly TDM from our License Agreement with Oxford University Press.

Rights and Permissions

Contact: Data.Mining@oup.com

Project MUSE

Permission provided for non-commercial educational and scholarly TDM from our License Agreement with Project MUSE. Project MUSE requests that you contact them before beginning mining.

FAQs

ProQuest

Text and data mining of our subscribed ProQuest content is available through ProQuest TDM Studio. Simply create an account and begin building datasets.

ProQuest TDM Studio

SAGE

Permission provided for non-commercial educational and scholarly TDM from our License Agreement with SAGE. See TDM Info for request limits and other API information.

TDM Info

TDM License

Springer Nature

Permission provided for non-commercial educational and scholarly TDM from our License Agreement with Springer Nature. However, there are restrictions on storage of data. Springer Nature has a TDM Attachment to their License Agreement. Contact your librarian for more information.

Taylor and Francis

In our experience, Taylor and Francis will permit TDM of their products when informed of the research involved and its time frame. Contact your librarian for help.

Terms and Conditions

University of Chicago Press

Permission provided for non-commercial educational and scholarly TDM from University of Chicago Press's Terms and Conditions. The Press specifically asks users to contact it for approval.

Terms and Conditions

Wiley

Permission provided for non-commercial educational and scholarly TDM from our License Agreement with Wiley.

TDM Policy

Contact: TDM@wiley.com

TDM Not Permitted

EBSCO

TDM of EBSCO content is not permissible at this time.

NewsBank

TDM of NewsBank content is strictly prohibited under our License Agreement. Mining NewsBank content carries an additional cost and may require additional licensing. Please contact your librarian for more information.

HathiTrust makes the texts of public domain works in its corpus available for research purposes. The works fall into two categories: non-Google digitized volumes, which are freely available, and Google-digitized volumes, which are available through an agreement with Google.

American Physical Society offers APS Data Sets for Research. The corpus of Physical Review Letters, Physical Review, and Reviews of Modern Physics comprises over 450,000 articles and dates back to 1893. Researchers may request access to this data by filling out a simple web form and must accept the terms and conditions governing the use of the data sets. Requests are reviewed quickly and, if approved, the data are made available for download. Contact data-requests@aps.org with any questions.

Kaggle provides free access to datasets and data science training courses for a variety of languages and technologies. It offers a no-setup, customizable Jupyter Notebooks environment with free GPUs and a large repository of community-published data and code. You can also register for an account to save your work.

Text and Data Mining

Need help?

Need help getting started on text and data mining?

Contact our data services team at dataservices@utk.edu.