The Old Scholar's Historical Thoughts

November 29, 2011

HI 731 Trials

I have completed a draft of my paper and welcome feedback from everyone. As I completed the paper I was more intrigued than ever by the question of why England didn’t have a revolution in the nineteenth century. This “last revolution” would seem to have provided all the ammunition people needed to distrust the government and clamor for change. Yet it did not happen. I know this topic has been studied extensively. Working on this paper makes me want to read and investigate those studies.

November 14, 2009

Another Project Idea

Professor Cohen pointed me to some sites which are doing some of the same things I am trying to do with my data standardization project. He suggested that I try to cover the reasons data standardization has failed in the past and why people do not use standards that are available.

That has got me thinking of including in my proposal a workshop with the commander of the Joint Interoperability Test Command and the people in charge of conformance testing for imagery and intelligence projects with people from the Humanities. This workshop would discuss the problems that the Department of Defense has tackled in setting, enforcing and encouraging the development of standards.  And instead of just the government and academia it would be nice to get the people who set the standards for Business to Business communication (B2B) to participate. America’s entire business model depends upon these systems exchanging unambiguous data. I doubt if we can put bar codes on historical artifacts but I can see some areas where the problem space is the same. Interoperability of data standards is a problem that has NOT been solved in many places. Getting different viewpoints may benefit everyone. The output of the workshop could be a definition of the top 10 stumbling blocks and plans of action to tackle these stumbling blocks within the Humanities.

Does anybody else think this is a good idea? The Humanities have many different standards for specific problem spaces, e.g., TEI, Museum Artifact Standards, etc. What I am proposing is that these different areas step back and get different ideas from outside their problem space. I am immersed in this stuff all day and think I see the relevance. Do you think all these people from the different organizations would even want to work together? I think the DoD would be willing to work on this; after all, the Internet began as a defense project. Lessons learned by organizations dedicated to interoperability may be useful as the Humanities tackle the same problem space.

November 8, 2009

Another possible source of background material

Google Reader suggests blogs that I might be interested in, based on what I have been searching for and reading in the reader. Today it gave me a link to a project called DARIAH, which is the Digital Research Infrastructure for Arts and Humanities. The DARIAH project is very interesting to me, though probably not so much to most everybody else.

But what is interesting is the Arts and Humanities.Net web site that they originally pointed me to. You can go to their Tools tab and get a rundown of the different types of tools used to digitize artifacts, store data, etc. You can go to Projects and look up projects that do things that are close to what you are doing for your project. They discuss the project and the different areas the project touches, and then give you a link to the project’s web site. There were some interesting projects that combine the digitization of old records with Web 2.0 features. For instance, there was The Old Bailey, which digitized the criminal proceedings of London courts (Digitization), added keyword searches (Data Mining tools), and set up a wiki for people to add information about the people discussed in the proceedings (Web 2.0).

There were different map projects, one of which looked at medieval town plots in Wales and created 3D renditions of them. There were projects that put up art works with discussions about them. This might be a place where you can find more information about types of projects already being done for your paper.  The site was

October 18, 2009

Project Outline

ABSTRACT: Develop an interactive, collaborative location to gather information about the many different XML, DTD, and data sharing standards that exist within the Digital Humanities community. This project is the first phase of a much larger effort. This phase is the collection of known standards and soliciting feedback from the digital humanities community. This collection will also become an area where developers can learn how to make their collections and sites friendly to automated discovery and use.

Depending upon feedback this overall effort can then branch into a concerted effort to develop semantic web standards for History or Digital Humanities, or on a smaller scale develop middle layer tools that will allow one domain to interoperate with another.

WHY IS THIS PROJECT NEEDED: The National Endowment for the Humanities funds the development of innovation in the digital humanities. Many of the projects state that they want to enhance the ability of researchers and scholars to interchange information. In many cases they create a new set of metadata and definitions that enable workers in their particular field to share data. In the awards for last year the Alexandria Archive Institute was granted an award to develop a way to share data between archeological sites. They developed a set of metadata keywords and content. Indiana University developed an ontology for philosophical thought. Duke University developed an extension to TEI which includes a set of rules for encoding Whitman poems. The problem with this approach is that there are so many standards that they cannot work together in a collaborative, automated way. For an archeologist the term “place” has been superseded by the term “site.” That site exists at a “location,” and it does not mean a web site where more information is available. In philosophical thought, however, “place” means the city where something happened, not the country where an archeological “site” was discovered. A human can easily see what is meant in each case, but a computer needs to know that “site” means one thing in one collection of data and something else in another collection of data. The Semantic Web is an effort to overcome these problems. The main thrust of this effort is in the government and business community, although the Smithsonian Institution is currently working on promoting the use of the Conceptual Reference Model (CRM) of the International Committee for Documentation (CIDOC) of the International Council of Museums (ICOM), an ontology intended to facilitate the integration, mediation and interchange of heterogeneous cultural heritage information.
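To make the “site” versus “place” problem concrete, here is a minimal sketch in Python of the kind of crosswalk a middle layer tool might keep. The collection names and shared concept labels are my own placeholders, not terms from the CIDOC CRM or any other published ontology.

```python
# Hypothetical crosswalk: (collection, local term) -> shared concept.
# Collection names and concept labels are illustrative assumptions only.
CROSSWALK = {
    ("archeology", "site"): "excavation_location",
    ("archeology", "place"): "excavation_location",  # older term, superseded by "site"
    ("philosophy", "place"): "city",
    ("web", "site"): "web_resource",
}


def resolve(collection: str, term: str) -> str:
    """Return the shared concept for a term as one collection uses it."""
    key = (collection.lower(), term.lower())
    if key not in CROSSWALK:
        raise KeyError(f"no mapping for {term!r} in collection {collection!r}")
    return CROSSWALK[key]


def same_concept(first: tuple, second: tuple) -> bool:
    """True when two (collection, term) pairs resolve to the same concept."""
    return resolve(*first) == resolve(*second)
```

With a table like this, a program can tell that an archeologist’s “site” and a web “site” are different things, which is exactly the step a human does without thinking. The hard part, of course, is getting the different communities to agree on the right-hand column.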

Someone who wanted to create a new web site and make their research, artifacts, and analysis available as services to other digital humanists would not know where to start. If the site only contained documents the TEI would be a good place to start, but the web is so much more than the digitization of documents. This project is the first step in an effort to allow these many different standards to interoperate.

FEATURES/FUNCTIONALITY: The initial effort would be in the development of a presence which would provide links to all known XML and DTD standards for the Digital Humanities with emphasis on History. These links would include a short description of the information, status, domain space, and functionality of the site. Each entry would also provide a place for users to comment on their individual experience with the technology defined in the link. This link page would also include a method for users to nominate other web sites to be included in the list, along with the reason they feel the site should be included. There would be another list of technology tools that have been created to assist in the creation of XML which follow the specific schemas and rules of the domain. Again there would be a place where users could nominate other sites. Another section would be an overview of the project and proposed course of action to continue the development of this effort. This section would be a wiki which would have a main page of project plan and rationale.

In a completely separate section, focused on an entirely different audience, would be a set of easy “how-tos” or links to easy-to-understand information on the use of machine readable information on web pages. This would include links to the API primer on the ProfHacker web site, etc. Sort of a one stop shop for how to make your web site/museum collection/document repository discoverable and usable by automated programs. It would also bring out the advantages and disadvantages of using different tools to make sites. This section would also have an area for users to nominate sites, locations, and books for others to use. Here I foresee a grading system being employed, showing users what was most beneficial to other users.

AUDIENCE: There are two main audiences for this site. The first one is those people who are interested in promoting the free flow of information between sites that can be accomplished through the use and power of automated programs. These users include those organizations which have already spent considerable effort to develop their reference models and would like to have a different place to advertise their existence. For instance the CIDOC CRM has been over a decade in development. The George Mason Center for History and New Media (CHNM) provides a well known and respected location for people to come to participate in the development of a more universal definition.

The second audience is for people who want to create a web site and have their work be accessed and used by many others. Stand alone web sites with static data are limited in functionality. Researchers are always looking for ways to make their research more available and used and if they can easily enhance their web sites and data collections they would do so. This site would show them how to do that.

TECHNOLOGIES TO BE USED: This site would take advantage of basic HTML and a wiki, and would of course make the information it contains accessible through standard APIs.

USER-CONTRIBUTED/INTERACTIVE ELEMENTS: The basic functionality of this site is the user interaction. User input will be in all sections of the site. In some cases it will be the nomination of items; in others it will be interactive feedback on what was useful and what wasn’t. In order to provide some level of control over technology zealots, all input would be moderated before being posted to the site.

October 14, 2009

I have to stop reading Dr. Cohen’s tweets

One thing leads to another and pretty soon I have all my project ideas shot down and have spent another evening/night surfing the web.

Today he posted that he was going to attend a workshop on APIs in digital humanities. Since my project ideas were set around APIs for maps/timelines etc., I thought I would go see what was happening at that conference. One thing led to another and it is now 2 hours later. For people who really want to know about APIs, a short little series of posts that explain them and why they are important is at Part 1, Part 2, Part 3.

As I was following some of the API sites mentioned I came across one site that is very good at providing historical information, using collaboration with people to gather information, using the Google Maps API, and using Twitter. The site is about the diary of Samuel Pepys. It uses Pepys’s diary from the 17th century and updates daily with what old Pepys is doing that day. For instance, today you could read the diary entry from Sunday Oct 14, 1666 as a blog entry. People can comment on or contribute more information about the diary entry. You can also sign up to get Tweets from Sammy boy. If you go into the Encyclopedia section you can click on a map which uses the Google Maps API to show locations that Pepys talks about, information about his houses, where he worked, etc. A very good site using many exciting features of the web and Web 2.0.

This site uses APIs from other web providers to give an enhanced experience of reading Pepys. But it is missing something: providing APIs for other sites. If this site provided an API that allowed others to access data based on things like people mentioned, location, date, or any of the other things that are given as entries in the encyclopedia, think what could be done. This API should be able to collect data from the diary, from the collaboration posts, and from an integration with the Google API and current Google Maps. If this site provided an API with this type of information, someone else could gather data and enhance the Flickr interface to not only show photos put in by users of this site but also, using the Flickr API, actively mine the data between the two sites and show current photos of places mentioned, historical photos, etc. With a data discovery engine, as new sites made London data discoverable, the integrated site could continually provide new views of data and link them back to Pepys in the 17th century.
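Here is a rough sketch, in Python, of the kind of lookup such an API might offer. Everything in it is an assumption on my part: the field names, the function names, and the sample records are placeholders I made up, not anything the Pepys site actually provides.

```python
import json

# Made-up sample records. Field names and values are placeholders,
# not data taken from the actual site or diary.
ENTRIES = [
    {"date": "1666-10-14", "people": ["Person A", "Person B"],
     "places": ["Place X"], "text": "(diary text)",
     "annotations": ["(reader comment)"]},
    {"date": "1666-10-15", "people": ["Person B"],
     "places": ["Place Y"], "text": "(diary text)", "annotations": []},
    {"date": "1666-10-16", "people": ["Person A"],
     "places": ["Place X", "Place Y"], "text": "(diary text)",
     "annotations": []},
]


def find_entries(entries, person=None, place=None, date=None):
    """Return the entries that match every filter that was supplied."""
    matches = []
    for entry in entries:
        if person is not None and person not in entry["people"]:
            continue
        if place is not None and place not in entry["places"]:
            continue
        if date is not None and entry["date"] != date:
            continue
        matches.append(entry)
    return matches


def as_response(matches):
    """Wrap results in the JSON body a hypothetical endpoint might return."""
    return json.dumps({"count": len(matches), "entries": matches}, indent=2)
```

A call like `as_response(find_entries(ENTRIES, place="Place X"))` is the sort of thing another site could consume to pull in every entry, with reader annotations, that mentions one location.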

It seems only fair that if you use APIs to enhance your site you should provide APIs to your data so other web sites/applications can use it. This site has many contributors whose contributions are not easily found by other researchers because there is no API.

Well, now I have to stop daydreaming about all the neat ways web sites can capture information from each other and get back to work on developing a project that hasn’t been done before and writing my historiography for my Britain in the 20th Century class. I am now saving all my work to my hard drive, sending it to my Google group account, and will make a CD of my work each week until the end of the semester. I WILL NOT LOSE MY WORK AGAIN!!!!

However, time lost because of curiosity about Twitter posts is gone forever.

October 13, 2009

I Hate Computers Part 2

The continuing saga of me vs Mac. Last month my disk drive went south. I took it into the shop and they reformatted and said everything was ok. I came back and restored everything from my TimeCapsule. It worked like a charm. Everything was there – did not miss a beat. On Saturday I sat down to do some work and the lovely question mark was on my screen. No disk drive again. Took the computer back in and this time they replaced the whole disk drive. I thought, this is no problem, I’ll just restore my disk drive from my TimeCapsule back up and I’ll be good as new. WRONG, WRONG, WRONG.

For all of you Mac users that use TimeCapsule: when you restore from a backup it does not restore your backup settings. As a matter of fact it turns TimeCapsule off. I did not check this setting. After 30 years in this business you would think I would learn my lesson. You would be wrong.

So my last backup was from September 19. Everything I have done since then is gone. I am such a happy camper. However, I have verified my backup is now working. I even restored from a backup I did yesterday to make sure the restore works correctly.

I think for my project I will make a device that comes out of your monitor, slaps you upside your head, and tells you “REMEMBER YOUR BACKUPS.” It is probably the most useful tool for New Media I can think of right now.

September 5, 2009

Project Ideas

I don’t really want to make a web site. I did that for Clio 2 and it wasn’t that good. What I would like to do is try to develop a tool that can be used by historians to either display information or use the web better. So I was brainstorming (people who know me say for me it shouldn’t be called “storming,” it’s more like a light drizzle), but anyways, here are some ideas to start off with:

1)    Develop a “trust” mechanism for displaying data and sharing data within a Web 2.0 world. If web sites are supposed to be able to “consume” data from other sites there needs to be a way that one site can trust that the data they are receiving is from a reputable source. The governance structure is the hard part because that would entail setting up a consortium of universities/scholars/publishers etc. to review authors/submissions and issue them keys to digitally sign their work. I am definitely not proposing developing that framework for the Internet. That would probably be over the grant money available; in fact, that would probably be over the stimulus package. What I propose is the development of an architecture, using a Public Key Infrastructure, which could be used by a publisher or a college to validate their data contributors. If this catches on, the issue of “trust” between domains could be tackled.

2)    Develop a concept/tool that will attempt to capture data off of digitized information on a person or place on the web and use open-source tools to parse that data (Data Mining of Historical Information). The Intelligence Community captures data from all different sources and when it is all correlated it yields startling connections not envisioned when just using straight line thinking. Probably the most nebulous idea, but hey it’s the first week in class.

3)    Development of Standards for the interchange of temporal information.  There is a standard for determining where in space an object is by giving it its geoloc. This is Latitude and Longitude according to a specific map representation of the earth. This geoloc can be translated into any other map representation. I am not aware of the same thing being available for time representation.  When you view a date you never know whether it is a Julian date, a Gregorian date, a Jewish calendar date, etc. This grant would be to develop a proposed standard for temporal data within the history profession for web interchange of data.

4)    Development of an XML tag set for historical information. Currently, if you want to exchange information between two map or graphical representations of space you would probably put it into a .kml file. If you were in the public safety or Justice community you would put your data into a specific XML hierarchy called GJXDM. There are standards for passing data between all different communities. What I would propose is the development of a specific standard for passing data concerning one particular type of historical event. An easy one would be a Battle. The grant would be for the development of a draft standard, the development of the XML syntax checker and an interface that would allow people to enter data about a battle and spit out the properly formatted XML stream.

5)    Development of Historical Viewers. We know we can place data on a map where it happened. We can also place data on a timeline when it happened. Wouldn’t it be neat to combine those two ideas and use a control to see historical events shown in geospatial output, where you can change both the location and the time individually and simultaneously? I don’t know how it could be used, but it sounds interesting.

6)    Develop a tool to show relationships/criticality/user-defined parameters of historical entities. This would work like the LibraryThing book cloud, or maybe the thesaurus link tool, and would allow you to see historical figures sized by the number of web sites about them, or by the size of the world they influenced. Or maybe familial links between historical figures that dynamically change as you traverse the links on the screen.
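Ideas 3 and 4 fit together, so here is a minimal sketch of both in Python, assuming a record that says which calendar a date was written in and also carries the Gregorian equivalent. The tag and attribute names (`battle`, `date`, `calendar`, `gregorian`, `location`) are placeholders I made up, not a draft standard. The Julian conversion uses the standard Julian Day Number arithmetic, and the checker only looks for the handful of things this sketch requires.

```python
import xml.etree.ElementTree as ET
from datetime import date


def julian_to_gregorian(year: int, month: int, day: int) -> date:
    """Convert a Julian-calendar date to the Gregorian calendar.

    Only works when the result falls on or after Gregorian 0001-01-01.
    """
    # Julian Day Number of a date expressed in the Julian calendar.
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    jdn = day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083
    # Python ordinals count days from Gregorian 0001-01-01 (JDN 1721426).
    return date.fromordinal(jdn - 1721425)


def battle_to_xml(name, year, month, day, calendar, lat, lon) -> str:
    """Serialize one battle record; every tag name is a made-up placeholder."""
    if calendar == "julian":
        normalized = julian_to_gregorian(year, month, day)
    elif calendar == "gregorian":
        normalized = date(year, month, day)
    else:
        raise ValueError(f"unsupported calendar: {calendar!r}")
    battle = ET.Element("battle")
    ET.SubElement(battle, "name").text = name
    when = ET.SubElement(battle, "date", calendar=calendar,
                         gregorian=normalized.isoformat())
    when.text = f"{year:04d}-{month:02d}-{day:02d}"  # the date as written
    ET.SubElement(battle, "location", lat=str(lat), lon=str(lon))
    return ET.tostring(battle, encoding="unicode")


def check_battle(xml_text: str) -> list:
    """Return a list of problems; an empty list means the record passed."""
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag != "battle":
        problems.append("root element must be <battle>")
    for tag in ("name", "date", "location"):
        if root.find(tag) is None:
            problems.append(f"missing <{tag}>")
    when = root.find("date")
    if when is not None and when.get("calendar") not in ("julian", "gregorian"):
        problems.append("<date> needs calendar='julian' or 'gregorian'")
    return problems
```

For example, `julian_to_gregorian(1666, 10, 14)` gives October 24, 1666, which is a Sunday in the Gregorian calendar. A real standard would also have to cover other calendars, uncertain or partial dates, and date ranges, none of which this sketch attempts.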

By the time of the next class I may have thrown all of these ideas out the door. Thoughts?

