
Data Repositories and Sources

This guide provides resources on data repositories.


HathiTrust Research Center

HathiTrust is a collection of digitized works. You can analyze this collection using the HathiTrust Research Center (HTRC), a research tool that supports large-scale computational analysis. You must create an account with HTRC to use its computational features.

Create a Collection in HathiTrust:

Before you can perform analysis in HTRC, you must create a collection of the HathiTrust items that you would like to analyze. You will need to log in with your WVU credentials to create a collection in HathiTrust. You can also use publicly available collections.



Use My Collections to view collections that you have created and to create new collections. Use Collections to view all publicly available collections in HathiTrust.


Search HathiTrust

Use the search box to locate items in HathiTrust.

Catalog Search

Searches the catalog records of items, including fields such as title, publisher, and subject terms.

Full-Text Search

Searches both the catalog record and the full text of a work.

Full view only

Limits the search to items whose full text is available to view, which depends on their copyright status. However, you do not need to be able to view the text of a work to perform text mining and data analysis on it.

Note: For best results, make sure the box for Full view only is NOT checked.


Add to a collection

To add items to a collection, check the box next to each item's title (you can also use Select all on page to select every item on the page). When you have selected the items that you would like to add, go to the grey box above the first result, select the collection you want to add the items to (or create a New collection), and click Add.


Sharing Collections

To run text analysis on a collection in HTRC, you will need the link for the collection. To find this link, click on the title of a collection under the My Collections or Collections tab. The link appears in the Share box.

Text Analysis using HTRC

The HathiTrust Research Center (HTRC) is a research tool that supports large-scale computational analysis of the works in the HathiTrust Digital Library to facilitate non-profit and educational research.


Create a Workset

Before you can perform text analysis on a collection, you need to create a workset from the collection in HTRC.


Text Analysis Algorithms

Allows you to access HTRC's built-in tools to run text analysis on a workset.

Create a Workset

Before you can run text analysis on a collection, you must first transform the collection into a workset. To do this, click on the Workset tab.


Create a Workset

To get started, click on Create a Workset in the top right-hand corner of the page. This page also lists the worksets that you have previously created.


Import a collection

You can import collections in multiple ways. The easiest way is to use the Import from HathiTrust option.


Fetch Collections

To fetch a collection, go to HathiTrust and choose a collection. On the right-hand side of the collection page, copy the URL shown for Link to this collection. Paste this URL into the HathiTrust Collection URL box, then click on Fetch Collection.
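
If you keep a list of collection links for several projects, it can help to check each link before pasting it into the HathiTrust Collection URL box. The following is a minimal Python sketch, not part of HathiTrust or HTRC. It assumes that a collection link carries the collection's identifier in a query parameter named c, and that the parameters may be separated by either a semicolon or an ampersand; the function name collection_id and the example URL are hypothetical and used only for illustration.

```python
import re
from typing import Optional
from urllib.parse import urlsplit


def collection_id(url: str) -> Optional[str]:
    """Return the value of the 'c' query parameter, or None if it is absent.

    Assumption: the collection link stores its identifier in a 'c' parameter,
    with parameters separated by ';' or '&'.
    """
    query = urlsplit(url).query
    # Split manually on both separators rather than relying on a parser
    # that may only recognise '&'.
    for pair in re.split(r"[;&]", query):
        key, _, value = pair.partition("=")
        if key == "c" and value:
            return value
    return None


if __name__ == "__main__":
    # Hypothetical link, shown only to illustrate the assumed URL shape.
    link = "https://babel.hathitrust.org/cgi/mb?a=listis;c=1234567890"
    print(collection_id(link))
```

If the function returns None, the link is probably not a collection link (for example, it may point to a single item or a search results page), so copy the URL from Link to this collection again.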


Describe Workset

Give the workset a name and description. When you are ready to create the workset, click on Create Workset.