
Text and Data Mining: Data Sources

Data Sources

University of Tennessee, Knoxville Licensed Content

Please see the table below for information about text and data mining (TDM) of licensed electronic content.

Regardless of the licensed content you choose to text and data mine, we recommend contacting your librarian so that they can inform the vendor's representative. This helps prevent misuse of licensed content and ensures that the proper permissions are in place. Misuse of licensed content can result in the vendor denying access for the entire University of Tennessee. See our Electronic Resources Use Policy for more details.

If you do not see a resource listed here, please contact us and we can investigate further.

 

Licensor/ Subject
 Permissible/ Fees?
Specifics
Helpful Links, Contacts, & API

Adam Matthew

History; Literature; Popular Culture; Primary Sources

Yes/ Free

Permission provided for non-commercial educational and scholarly TDM from our License Agreement with Adam Matthew. Adam Matthew requires a permission form be filled out and submitted before mining begins.

TDM Statement

Example Permission Form

Contact: info@amdigital.co.uk

Cambridge University Press

Multidisciplinary

Yes/ Free

Permission provided for non-commercial educational and scholarly TDM from Cambridge University Press's Terms of Use.

Terms of Use

Rights and Permissions

Contact: directcs@cambridge.org

EBSCO

Multidisciplinary

No/ NA

TDM of EBSCO content is not permissible at this time.

 

Elsevier (ScienceDirect)

Physical Science & Engineering; Life Sciences; Health Sciences; Social Sciences and Humanities

Yes/ Free with registration

Permission provided for non-commercial educational and scholarly TDM from our License Agreement with Elsevier and Elsevier's TDM Policy. Elsevier has a Developers Portal where you register for an API key. After registering, you must request elevated privileges to receive full access to their data. There is a rate limit of 20,000 records per week.

TDM Policy

Developers Portal

TDM Registration Form

TDM Basics

Emerald

Humanities; Business; Social Sciences; Information Sciences; Engineering

Yes/ Free

Permission provided for non-commercial educational and scholarly TDM from our License Agreement with Emerald Publishing. Emerald asks that you notify them before conducting any TDM activities on www.emerald.com/insight so that they can manage server capacity. Advance notice lets you complete your activity without technical obstacles and maintains access for all Emerald users.

TDM License

Contact: permissions@emeraldinsight.com

Gale Primary Sources

Multidisciplinary

Yes/ Free with account creation

Permission provided for non-commercial educational and scholarly TDM from our License Agreement with Gale. 

UTK has access to the Gale Digital Scholar Lab, which lets users create custom content sets containing as many as 10,000 documents.

Gale is still accepting feedback from users for this product. Please contact your librarian with relevant feedback for updates.

Digital Scholar Lab

JSTOR

Humanities; Social Sciences; Sciences

Yes/ Free

TDM of JSTOR content is permissible according to our License Agreement with some restrictions. Data for Research (https://www.jstor.org/dfr/) is JSTOR's TDM service. Datasets must be requested through JSTOR and are processed by JSTOR. Datasets are free and may include data for up to 25,000 documents. See their site for more info on creating datasets, specifications, requests, and sample datasets.

Dataset Services

Sample Datasets

Dataset Request Form

JSTOR Data for Research

Technical Specifications

NewsBank

National and International Newspapers

No

TDM of NewsBank content is strictly prohibited under our License Agreement. Permission to mine NewsBank content would carry an additional cost and may require additional licensing. Please contact your librarian for more information.

Oxford University Press

Multidisciplinary

Yes/ Free

Permission provided for non-commercial educational and scholarly TDM from our License Agreement with Oxford University Press.

Rights and Permissions

Contact: Data.Mining@oup.com

ProQuest

Multidisciplinary

No/ NA

TDM of ProQuest content is strictly prohibited in our License Agreement.

 

Project Muse

Multidisciplinary

Yes/ Free

Permission provided for non-commercial educational and scholarly TDM from our License Agreement with Project Muse. Project Muse requests that you contact them before beginning mining.

FAQs

SAGE

Business; Humanities; Social Sciences; Science; Technology; Medicine

Yes/ Free

Permission provided for non-commercial educational and scholarly TDM from our License Agreement with SAGE. See TDM Info for request limits and other API information.

Terms of Use

TDM Info

TDM License

Springer Nature

Multidisciplinary

Yes/ Free

Permission provided for non-commercial educational and scholarly TDM from our License Agreement with Springer Nature. However, there are restrictions on storage of data. Springer Nature has a TDM Attachment to their License Agreement. Contact your librarian for more information.

 

Taylor & Francis

Multidisciplinary

Yes/ Free

Experience with Taylor & Francis shows that they are willing to accept TDM of their products when informed of the research involved and time frame. Contact your librarian for help.

Terms and Conditions

University of Chicago Press

Art and Art History; Economics; Education; History; Humanities; Law and Politics; Medieval and Renaissance Studies; Science; Social Sciences 

Yes/ Possible administrative costs

Permission provided for non-commercial educational and scholarly TDM from University of Chicago Press's Terms and Conditions. The Press specifically requests that users contact them for approval.

Terms and Conditions

Web of Science, Clarivate Analytics

Multidisciplinary index

 

No, but see Analyze Tool

Web of Science's production team can create a custom data set based on set variables for a fee (contact your librarian for additional help). Within Web of Science, you can use the Analyze tool to analyze a subset of data within the interface.

Analyze Tool

Product Terms (see Custom Data Sets)

Clarivate Analytics Terms (see Unauthorized Technology)

Wiley

Multidisciplinary

Yes/ Free

Permission provided for non-commercial educational and scholarly TDM from our License Agreement with Wiley.

TDM Policy

Contact: TDM@wiley.com

 

Again, we highly recommend contacting your librarian to get approval to mine licensed content. Click the Tools tab to see more information about other mining options.
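
Some entries in the table above set volume limits, such as Elsevier's limit of 20,000 records per week. As a rough planning aid, the sketch below splits a list of record identifiers into weekly batches that stay within such a limit. The function name and the example identifiers are hypothetical and are not part of any vendor's API; confirm the actual limit and terms with your librarian before mining.

```python
WEEKLY_LIMIT = 20_000  # records per week, taken from the Elsevier entry above


def weekly_batches(record_ids, weekly_limit=WEEKLY_LIMIT):
    """Split record_ids into consecutive batches of at most weekly_limit items."""
    if weekly_limit <= 0:
        raise ValueError("weekly_limit must be a positive integer")
    return [
        record_ids[start:start + weekly_limit]
        for start in range(0, len(record_ids), weekly_limit)
    ]


# Hypothetical identifiers, used only to illustrate the arithmetic.
ids = [f"record-{n}" for n in range(45_000)]
batches = weekly_batches(ids)
print(len(batches))               # 3 weeks of requests
print([len(b) for b in batches])  # [20000, 20000, 5000]
```

Processing one batch per week keeps a project of this size within the stated limit, at the cost of a longer collection period.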

 

  Twitter API

Twitter has an API (Application Programming Interface) that provides access to Twitter data in machine-readable format. The free version of API access is called the Standard tier. You will need to register a Developer Account to gain access to the API. The complete documentation is available on Twitter's developer site.

The main idea of the API is that you construct HTTP requests using the parameters described in the search endpoints documentation and get back your results in JSON format. You can put together your own scripts using the appropriate tools for your language, for example by sending the request with Python's requests library and loading the response text with the standard library's json module. Another option is to look to the community for more specialized tools, such as the twitteR package for R and NLTK's twitter package.
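
The request-and-parse pattern described above can be sketched with the Python standard library alone. The endpoint URL, the `q` and `count` parameters, and the `statuses` and `text` field names are assumptions based on the Standard (v1.1) search API and may have changed or been retired, so check the current documentation before relying on them. The sketch only builds the request URL and parses a response; sending the request requires the credentials from your Developer Account and is left out.

```python
import json
from urllib.parse import urlencode

# Assumed Standard-tier search endpoint; verify against the current documentation.
SEARCH_ENDPOINT = "https://api.twitter.com/1.1/search/tweets.json"


def build_search_url(query, count=10):
    """Return the search URL with the query parameters percent-encoded."""
    params = {"q": query, "count": count}
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


def extract_texts(response_text):
    """Parse a JSON response body and return the text of each returned tweet."""
    payload = json.loads(response_text)
    return [status["text"] for status in payload.get("statuses", [])]


print(build_search_url("#datamining", count=5))

# A made-up response body standing in for what the API would return.
sample_response = '{"statuses": [{"id_str": "1", "text": "hello"}, {"id_str": "2", "text": "world"}]}'
print(extract_texts(sample_response))  # ['hello', 'world']
```

With the requests library, the URL from `build_search_url` would be passed to `requests.get` along with your authorization headers, and the `.text` of the response handed to `extract_texts`.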

 

 

Kaggle provides free access to datasets and data science training courses for a variety of languages and technologies.
   
   


HathiTrust makes the texts of public domain works in its corpus available for research purposes. The works fall into two categories: non-Google-digitized volumes, which are freely available, and Google-digitized volumes, which are available through an agreement with Google.

 

 

 

The American Physical Society offers APS Data Sets for Research. The corpus of Physical Review Letters, Physical Review, and Reviews of Modern Physics comprises over 450,000 articles and dates back to 1893. Researchers may request access to this data by filling out a simple web form and must accept the terms and conditions governing the use of the data sets. Requests are reviewed quickly and, if approved, the data are made available for download. Contact data-requests@aps.org with any questions.


Librarian Contact for This Guide

Monica Ihli
Contact:
865-974-2876

Licensed Content Questions

Lizzie Gallagher

For questions about licensed content, such as requesting a content extract from one of our licensed content providers, please contact your Electronic Resources Assistant Librarian.