SEASR - Retreat Report
An analytical platform for the analysis of rich media content.
Project Name and Start Date
SEASR (Software Environment for the Advancement of Scholarly Research), June 2007
Project URL
Coming soon!
Brief description of project goals
SEASR will address the challenge of transforming information into knowledge by constructing the software bridges required to move from the unstructured and semi-structured data world to the structured data world. We aim to make content collections more useful by integrating two well-known research and development frameworks, NCSA's Data-To-Knowledge (D2K) and IBM's Unstructured Information Management Architecture (UIMA), into an analytical platform that researchers in any discipline, but particularly the humanistic fields, can easily learn and adapt for their own scholarly research.
This project will focus on developing, integrating, deploying, and sustaining a set of reusable and expandable software components and a supporting framework, SEASR, that will benefit a broad set of data mining applications for scholars in the humanities.
The key goals established for this effort are a set of software-centric directives:
- Support the development of a state-of-the-art software environment for unstructured data management and analysis of digital libraries, repositories, and archives, as well as educational platforms that are expected to contribute to many of the humanities breakthroughs of the 21st century.
- Support the continued development, expansion, and maintenance of an end-to-end software system (user interfaces, workflow engines, data management, analysis and visualization tools, collaborative tools, and other software integrated into a complete environment, SEASR) to bring the full power of data analytics to scholars.
- Support education and training in the use of this software environment for analysis, through workshops that promote its use among scholars.
There are two important additional benefits from the development of SEASR: the creation of a vibrant digital humanities community, and technology transfer between diverse disciplines that traditionally have had little interaction. SEASR will create a venue for technology and information exchange, offering scholars access to an interdisciplinary collaboration among humanists, computer scientists, and high-performance computer specialists.
Participating Institutions and key people
- Michael Welge, National Center for Supercomputing Applications (NCSA), University of Illinois at Urbana-Champaign; PI, Project and Technical Leadership
- Loretta Auvil, National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign; Co-PI, Community Outreach and Applications
- John Unsworth, Graduate School of Library and Information Science (GSLIS), University of Illinois at Urbana-Champaign; Co-PI and Community Advisor
- Duane Searsmith, University of Illinois; Technical Lead
- Tara Bazler, User Experience Group, Indiana University; Usability Evaluation
- Tim Cole, Mathematics Librarian and Professor of Library Administration, University of Illinois at Urbana-Champaign; Community Advisor
Highlights
* Background References
Welge, M., L. Auvil, A. Shirk, C. Bushell, P. Bajcsy, D. Cai, T. Redman, D. Clutter, R. Aydt, and D. Tcheng. (2003). Data to Knowledge (D2K). Automated Learning Group, National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign.
D2K: (alg.ncsa.uiuc.edu/do/tools/d2k/)
Downie, S., Unsworth, B., Yu, B., Tcheng, D., Rockwell, G. & Ramsay, S. (2005). A revolutionary approach to humanities computing?: Tools development and the D2K data-mining framework. In Proceedings of the ACH/ALLC 2005 conference.
M2K: (www.music-ir.org/)
Ferrucci, D. & Lally, A. (2004). UIMA: an architectural approach to unstructured information processing in the corporate research environment. Natural Language Engineering, 10:3-4, pp. 327-348.
UIMA: (www.research.ibm.com/UIMA/)
Milestones and deliverables
FY07-Q3
- Announce the SEASR research and development activity at the international Digital Humanities conference and solicit feedback from the community.
- Carry out user-centric activities: gather functional, data-related, user interface, and usability requirements from individuals, focus groups, and ongoing community efforts.
- Hold the first meeting to review the project plan with the Advisory Group.
- Finalize the software infrastructure design and development plan.
FY07-Q4
- Integrate D2K components for Nora into the chosen SEASR framework.
- Release SEASR User pre-alpha 1 to advisors for review.
Community Plan
We plan to select advisors who have expertise in humanities computing and experience with relevant software development projects. Selected advisors will have to agree to make one site visit a year and provide a written evaluation of the current state of the project. The feedback provided by these expert advisors will be used to expand the horizons of SEASR. These evaluations will also be included in project reports to the Mellon Foundation. We plan to engage at least three distinguished non-UIUC advisors along with the listed UIUC advisors.
They will be selected, and their commitments obtained, no later than Year 1, Q1. The non-UIUC advisors will receive travel expenses and an honorarium for their written evaluation of the project. The community advisors to date are listed below; we plan to expand the list to include other Andrew W. Mellon funded efforts:
- David Ferrucci, IBM T.J. Watson Research Center
- Eric W. Brown, IBM T.J. Watson Research Center
- John Wilbanks, Science Commons
- Vernon Burton, Department of History, University of Illinois, Urbana-Champaign
- J. Stephen Downie, GSLIS, University of Illinois, Urbana-Champaign
Sustainability Plan
Our vision includes the formation of a community that can extend and support SEASR going forward. We want to engage both researchers and their supporting institutions in the adoption of SEASR prior to its formal release. We plan to proceed with this in two different ways.
First, we will build developer support within the user community through our advisors. Second, we will ask representatives of the content infrastructure maintainers to review SEASR and coordinate with their library projects in exchange for a small level of funding.
Synergy opportunities with other projects
To ensure a strong synergy between scholars and technologists in the development of SEASR, we plan to interact with as many end users and development teams as possible. Below is a partial list of end users and developers we plan to contact. If you would like to share your use cases or development experiences, participate in a technology collaboration, or learn more, please contact Loretta Auvil, Application and Community Lead for SEASR.
End Users
Caroline Haythornthwaite, UIUC; Matthew Kirschenbaum, University of Maryland (Nora and Monk); Martin Mueller, Northwestern University (Monk and Wordhoard); Steve Ramsay, University of Nebraska (Nora and Monk); Martha Nell Smith, University of Maryland (Nora and Monk); Sara Steger, University of Georgia (Monk)
Designer/Developers
John Norstad, Northwestern University (Monk, Wordhoard); Bill Parod, Northwestern University (Monk, Wordhoard); Geoffrey Rockwell, McMaster University (Monk, TAPOR); Stefan Sinclair, McMaster University (Monk, TAPOR, HYPERPO); Thorny Staples, University of Virginia (FEDORA); Paul Watry, Cheshire (National Text Mining Centre)
