Text Mining and Analysis

The Center for Digital Scholarship supports the analysis of both structured and unstructured texts. For example, sets of digitized materials such as books or journal articles can be analyzed with Center-provided services, allowing researchers to count, tabulate, measure, graph, chart, classify, and thus analyze large volumes of content. These tools may uncover textual patterns, trends, or anomalies and yield new perspectives on large sets of data. Center staff members are experienced with text parsing and cleaning as well as summary and analysis methods. They can also assist in securing use rights to large data sets on behalf of researchers.
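
As a minimal illustration of the counting and tabulating described above, the following Python sketch tallies the most frequent words in a short passage. The function name tabulate_words, the sample sentence, and the simple tokenizing pattern are assumptions made for this example; they are not part of any Center-provided service.

```python
import re
from collections import Counter


def tabulate_words(text, top_n=5):
    """Return the top_n most frequent lowercase words as (word, count) pairs."""
    # Hypothetical tokenizer: runs of letters and apostrophes, case-folded.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(top_n)


# Illustrative sample text, not drawn from any real collection.
sample = "The whale surfaced. The crew watched the whale, and the whale dove."

for word, count in tabulate_words(sample, top_n=2):
    print(f"{word}\t{count}")
```

The same tally, run over a whole book or a set of journal articles, is a common first step before graphing or classifying a collection.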

The Center is equipped with high-performance Macintosh and Windows computers, with hardware and software to digitize texts, convert scanned images into text, and mark up text with XML. Sets of open-source software are available for analysis and investigation. Some of these tools are also accessible via the Center’s classroom.

For more information, read our one-page PDF document on text mining. To follow the University's growing group of digital humanists, read the ND DH blog.

For information about text mining and analysis, please contact:

Eric Lease Morgan, Digital Initiatives Librarian
emorgan@nd.edu | ph. 574-631-8604