Topic Modeling Workshop

Topic Modeling for Humanities Research, a one-day workshop directed by Assistant Director of MITH Dr. Jennifer Guiliano, received a Level 1 Digital Humanities Start-Up Grant from the National Endowment for the Humanities on April 19, 2011. The workshop will offer an opportunity for cross-fertilization, information exchange, and collaboration among humanities scholars and researchers in natural language processing on the subject of topic modeling applications and methods. The workshop will be organized into three primary areas: 1) an overview of how topic modeling is currently being used in the humanities; 2) an inventory of extensions of the LDA model that have particular relevance for humanities research questions; and 3) a discussion of software implementations, toolkits, and interfaces.
Despite (or perhaps because of) the relatively widespread use of topic modeling for text analysis in the digital humanities, it is common to find examples of misapplication and misinterpretation of the technique and its output. There are a number of reasons for this: existing software packages generally have a significant learning curve, most humanists do not have a clear understanding of the underlying statistical methods and models, and there is still limited documentation of best practices for applying the methods to humanities research questions. As a result, the most promising work in topic modeling is being done not by humanists exploring literary or historical corpora but by scholars working in natural language processing and information retrieval. This workshop will address these issues by giving humanists and scholars working in natural language processing an opportunity to jointly identify potential areas of research and development within applications, extensions, and implementation of topic modeling.
Topic Modeling in the Humanities will provide humanities scholars with a deeper understanding of the vocabulary of LDA topic modeling (and other latent variable modeling methods) and best practices for interpreting the output of such analysis, and will articulate fundamental literary and historical questions for researchers outside of the humanities who are developing the models and methods (as well as the software implementations).

Speakers

Matthew Jockers
Department of English and Center for Digital Research in the Humanities, University of Nebraska

Robert K. Nelson
Assistant Professor of the Digital Scholarship Lab, University of Richmond

Robert K. Nelson is the Director of the Digital Scholarship Lab at the University of Richmond. He has directed a number of digital humanities projects including “Mining the Dispatch,” “Redlining Richmond,” and the History Engine. He holds a PhD in American Studies from the College of William & Mary. His work on nineteenth-century cultural and literary history has appeared in the Journal of Social History and American Literature.

Jordan Boyd-Graber
Assistant Professor, School of Information Studies and Institute for Advanced Computer Studies (UMIACS), University of Maryland

Jordan Boyd-Graber is an assistant professor in Maryland’s iSchool and UMIACS, and a member of the Cloud Computing Center and the Computational Linguistics and Information Processing (CLIP) Lab. His research applies statistical models to natural language problems in ways that interact with humans, learn from humans, or help researchers understand humans. Jordan is an expert in the application of topic models, completely automatic tools that can discover structure and meaning in large, multilingual datasets. He is a contributor to the Natural Language Toolkit (NLTK), a popular tool used in natural language education research. Jordan received his PhD from Princeton University in 2010, advised by Dave Blei, and has bachelor’s degrees in history and computer science from the California Institute of Technology. He received a best student paper honorable mention at NIPS 2009 and a Computing Innovation Fellowship (declined). His current work is supported by NSF, IARPA, and ARL.

Jo Guldi
Department of History, Brown University

Christopher Johnson-Roberson
Ethnomusicology, Brown University

David Mimno
Postdoctoral Researcher, Princeton University