The Old Scholar's Historical Thoughts

October 18, 2009

Project Outline

Filed under: Project — theoldscholar @ 6:16 pm

ABSTRACT: Develop an interactive, collaborative location to gather information about the many different XML, DTD, and data sharing standards that exist within the Digital Humanities community. This project is the first phase of a much larger effort. This phase collects known standards and solicits feedback from the digital humanities community. The collection will also become an area where developers can learn how to make their collections and sites friendly to automated discovery and use.

Depending upon feedback, this overall effort can then branch into a concerted effort to develop semantic web standards for History or Digital Humanities or, on a smaller scale, develop middle-layer tools that will allow one domain to interoperate with another.

WHY IS THIS PROJECT NEEDED: The National Endowment for the Humanities funds the development of innovation in the digital humanities. Many of the funded projects state that they want to enhance the ability of researchers and scholars to interchange information. In many cases they create a new set of metadata and definitions that enable workers in their particular field to share data. In the awards for last year, the Alexandria Archive Institute was granted an award to develop a way to share data between archeological sites; it developed a set of metadata keywords and content. Indiana University developed an ontology for philosophical thought. Duke University developed an extension to TEI which includes a set of rules for encoding Whitman poems.

The problem with this approach is that there are so many standards that they cannot work together in a collaborative, automated way. For an archeologist, the term “place” has been superseded by the term “site.” A site exists at a “location,” though, and the word does not mean a web site where more information is available. In philosophical thought, by contrast, “place” means the city where something happened, not the country where an archeological “site” was discovered. A human can easily see what is meant in each case, but a computer needs to know that “site” means one thing in one collection of data and something else in another.

The Semantic Web effort is an attempt to overcome these problems. Its main thrust is in the government and business community, although the Smithsonian Institution is currently working on promoting the use of the Conceptual Reference Model (CRM) of the International Committee for Documentation (CIDOC) of the International Council of Museums (ICOM), an ontology intended to facilitate the integration, mediation and interchange of heterogeneous cultural heritage information.
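To make the “site” versus “place” problem concrete, here is a minimal sketch of what a middle-layer lookup might look like. It assumes each collection declares what its local terms mean by mapping them to a small shared vocabulary. The collection names, concept names, and the functions resolve and same_meaning are all hypothetical and invented for illustration; none of them come from CIDOC CRM, TEI, or any other existing standard.

```python
# Hypothetical sketch: every name below is an illustrative assumption,
# not part of any real standard or ontology.

# (collection, local term) -> shared concept
TERM_MAP = {
    ("archeology", "site"): "excavation_location",
    ("philosophy", "place"): "city_of_event",
    ("web_directory", "site"): "web_resource",
}


def resolve(collection, term):
    """Return the shared concept a local term stands for in one collection."""
    key = (collection.lower(), term.lower())
    if key not in TERM_MAP:
        raise KeyError(f"No mapping for term {term!r} in collection {collection!r}")
    return TERM_MAP[key]


def same_meaning(a, b):
    """True when two (collection, term) pairs map to the same shared concept."""
    return resolve(*a) == resolve(*b)


if __name__ == "__main__":
    # The same word means different things in different collections.
    print(resolve("archeology", "site"))
    print(resolve("web_directory", "site"))
    print(same_meaning(("archeology", "site"), ("web_directory", "site")))
```

The point of the sketch is only that the word alone is never enough: a program has to be told which collection a term came from before it can decide what the term means.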

Someone who wanted to create a new web site and make their research, artifacts, and analysis available as services to other digital humanists would not know where to start. If the site contained only documents, TEI would be a good place to start, but the web is so much more than the digitization of documents. This project is the first step in an effort to allow these many different standards to interoperate.

FEATURES/FUNCTIONALITY: The initial effort would be the development of a web presence providing links to all known XML and DTD standards for the Digital Humanities, with emphasis on History. Each link would include a short description of the information, status, domain space, and functionality of the site. Each entry would also provide a place for users to comment on their individual experience with the technology defined in the link. This link page would also include a method for users to nominate other web sites to be included in the list, along with the reason they feel the site should be included. There would be another list of technology tools that have been created to assist in the creation of XML that follows the specific schemas and rules of the domain. Again, there would be a place where users could nominate other sites. Another section would be an overview of the project and the proposed course of action to continue the development of this effort. This section would be a wiki whose main page presents the project plan and rationale.
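One possible way to picture a single entry on the link page is sketched below. The fields follow the description above (description, status, domain space, functionality, user comments, and nominations with a reason), but the class names StandardEntry and Nomination, the field names, and the example.org address are placeholders I made up for illustration, not a settled design.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StandardEntry:
    """One listed standard, with the descriptive fields proposed above."""
    name: str
    url: str
    description: str
    status: str
    domain: str
    functionality: str
    comments: List[str] = field(default_factory=list)

    def add_comment(self, text: str) -> None:
        """Record a user's experience with the technology."""
        self.comments.append(text)


@dataclass
class Nomination:
    """A user's proposal to add a site, with the reason it should be listed."""
    url: str
    reason: str


if __name__ == "__main__":
    # Placeholder values only; example.org is not a real standards site.
    entry = StandardEntry(
        name="Example encoding standard",
        url="https://example.org/standard",
        description="Rules for encoding documents as XML.",
        status="active",
        domain="History",
        functionality="Document markup",
    )
    entry.add_comment("Worked well for a small letter collection.")
    print(entry)
```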

In a completely separate section, focused on an entirely different audience, would be a set of easy “how-tos” or links to easy-to-understand information on the use of machine-readable information on web pages. This would include links to the API primer on the ProfHacker web site and similar resources. It would be a sort of one-stop shop for how to make your web site, museum collection, or document repository discoverable and usable by automated programs. It would also bring out the advantages and disadvantages of using different tools to make sites. This area would also have a place for users to nominate sites, locations, and books for others to use. In this area I foresee a grading system being employed, showing users what was most beneficial to other users.
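The grading system could be as simple as averaging user scores and listing the best-rated resources first. The sketch below assumes a 1 to 5 score per user and a plain mean. Both are my assumptions, as is the function name rank_resources; a real site might weight or filter the scores differently.

```python
from statistics import mean


def rank_resources(ratings):
    """Rank resources by average user score, highest first.

    `ratings` maps a resource title to a list of scores (assumed 1 to 5).
    Resources nobody has rated yet are left out of the ranking.
    """
    scored = [(title, mean(scores)) for title, scores in ratings.items() if scores]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Made-up titles and scores, for illustration only.
    example = {
        "API primer": [5, 4],
        "Intro to XML": [2, 3],
        "Unrated how-to": [],
    }
    for title, score in rank_resources(example):
        print(title, score)
```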

AUDIENCE: There are two main audiences for this site. The first is those people who are interested in promoting the free flow of information between sites that can be accomplished through the use and power of automated programs. These users include organizations which have already spent considerable effort to develop their reference models and would like to have another place to advertise their existence. For instance, the CIDOC CRM has been over a decade in development. The George Mason Center for History and New Media (CHNM) provides a well-known and respected location for people to come to participate in the development of a more universal definition.

The second audience is people who want to create a web site and have their work accessed and used by many others. Stand-alone web sites with static data are limited in functionality. Researchers are always looking for ways to make their research more available and more widely used, and if they could easily enhance their web sites and data collections they would do so. This site would show them how to do that.

TECHNOLOGIES TO BE USED: This site would use basic HTML and a wiki, and would of course make the information it contains accessible through standard APIs.

USER-CONTRIBUTED/INTERACTIVE ELEMENTS: The basic functionality of this site is user interaction. User input will appear in all sections of the site. In some cases it will be the nomination of items; in others it will be interactive feedback on what was useful and what wasn’t. In order to provide some level of control over technology zealots, all input would be moderated before being posted to the site.
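The moderation step might work roughly as sketched here: every submission starts out pending, and only approved items are ever shown on the site. The class names Submission and ModerationQueue and the three status values are hypothetical choices made for this sketch, not features of any particular wiki or blog platform.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Submission:
    """A user comment or nomination awaiting a moderator's decision."""
    author: str
    text: str
    status: str = "pending"  # assumed states: pending, approved, rejected


class ModerationQueue:
    """Holds all user input and exposes only what a moderator has approved."""

    def __init__(self) -> None:
        self._items: List[Submission] = []

    def submit(self, author: str, text: str) -> Submission:
        """Accept new input; it stays hidden until it is approved."""
        item = Submission(author, text)
        self._items.append(item)
        return item

    def approve(self, item: Submission) -> None:
        item.status = "approved"

    def reject(self, item: Submission) -> None:
        item.status = "rejected"

    def pending(self) -> List[Submission]:
        """Items a moderator still has to review."""
        return [s for s in self._items if s.status == "pending"]

    def published(self) -> List[Submission]:
        """Items that may be shown on the site."""
        return [s for s in self._items if s.status == "approved"]


if __name__ == "__main__":
    queue = ModerationQueue()
    first = queue.submit("reader1", "This how-to was useful.")
    queue.submit("reader2", "My favorite standard is the only good one.")
    queue.approve(first)
    print([s.text for s in queue.published()])
```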



  1. I love standards…hence I love this project.

    Comment by colamaria — October 18, 2009 @ 8:20 pm | Reply

  2. John-

    This is a great encapsulation of what looks to be a theme amongst several of us. And who says I am the show-off here?!

    Establishing standards-based approaches to the data in easily extensible formats such as properly tagged XML accomplishes two great goals: removes the necessity of stove-piped applications in the “build it and they will come” mantra, and expands sharing across many previously unreachable (or unknowable) domains — especially for smaller institutions and persons.

    I would like to try to collaborate our projects together so that, if nothing else, we begin creating the best standards we can come up with and diligently exercise peer review within our group. Best case, we attract some attention and actually make some headway on data standards and collaboration standards development.

    Great start!


    Comment by DeadGuyQuotes — October 20, 2009 @ 11:46 am | Reply
