Monday, April 27, 2009

Digital Karnak

I am a huge fan of 3-D visualizations of archaeological sites, and there's a new one developed by a team under Diane Favro and Willeke Wendrich at UCLA. Digital Karnak provides a Google Earth visualization of the site of Karnak, a massive temple complex in Egypt that was in use for some 1,500 years. There's a nice interactive timeline through which you can view the development of the site over time. Start with the overview if you're unfamiliar with Karnak.

The web site includes an amazing archive consisting of stills from the 3-D model and photographs from the archaeological site. I'd like to see that expanded some day to include any smaller objects from Karnak that are in various cultural heritage collections. Historical renderings (there are known drawings from the early 18th century onwards) would also be a nice addition.

There's a nice article in the Chronicle of Higher Education.

Tuesday, April 21, 2009

World Digital Library Launch

The World Digital Library is now available.

The site is launching with 1,170 objects from 26 partner institutions. WDL focuses on significant primary materials reflecting the cultural heritage of all UNESCO member countries, including manuscripts, maps, rare books, recordings, films, prints, photographs, architectural drawings, and other types of primary sources from varying time periods. The project will continue to add content to the site, and will enlist new partners from the widest possible range of institutions and countries.

The site is available in seven different languages: Arabic, Chinese, English, French, Russian, Spanish, and Portuguese. The content is not translated -- the items appear in their original language. The metadata and all the site navigation are translated to make it possible to search and browse the site in any of the languages. The metadata came from partner institutions or was created by catalogers at the Library of Congress, and much of the translation was provided by Lingotek.

The site was built using the Django Python framework, nginx, Lucene/Solr, and a MySQL database. The zooming in the image viewer and page turner is Seadragon Ajax. There is heavy use of JavaScript, jQuery, JSON, and underlying XML. Check out the image carousels and timeline tool! The project also developed a cataloging tool to manage the metadata and cataloging process and to interact with the Lingotek translation system via their API.

Sunday, April 12, 2009

museum data exchange software

OCLC, funded by the Mellon Foundation and working with the software company Cognitive Applications, Inc., has released COBOAT and OAICat Museum to support data interchange between museums. This work is happening under the auspices of OCLC's Museum Data Exchange Project.

"So what?" many people will say. It should already be easy to share museum data, right?

Not so much.

The museum collection management system arena has some major vendors (Gallery Systems, Willoughby, Minisis, Cuadra, etc.), some smaller vendors (Re:discovery, PastPerfect, etc.), and countless (and I really mean countless) home-grown systems running on FileMaker, Access, and MS-SQL. I know, because I spent many years working for museums and I was on the board of the Museum Computer Network, a group that diligently worked on many interchange initiatives. I worked with software from 3 vendors and managed a FileMaker-based system. Getting data in was easy. Getting data out was often hard. Participation in data aggregation projects took a lot of effort. Most small- or medium-sized museums (and there are many, many more of them than large museums) have little or no technology staff to enable data sharing. And there is no common data schema in the community.

The museum community itself has sometimes slowed progress. When relevant library community standards were mentioned, some said "We're nothing like libraries! Our collections are unique! Their standards are not for us!" That attitude seems to have shifted in the last 10 years.

I am glad to see something like this going forward. A fee-free tool that can help museums extract data from black-box vendor systems and enable sharing? Bring it on.

Friday, April 10, 2009

open repositories 2009

The abstracts are now available for the presentation and poster sessions at OR09. This is one of my favorite conferences to attend and present at.

Sunday, April 05, 2009

DigCCurr 2009

I was in Chapel Hill the first week of April for the DigCCurr 2009 conference and to attend a meeting to brainstorm about personal digital collection preservation. I thought the conference was very good, better than the first one in 2007. I saw many excellent presentations, had some great conversations, and got a good response to my presentation on LC's work with file transfer and inventory tools. As with the last conference, I walked out thinking that I should have been an archivist.

I strongly recommend the proceedings from DigCCurr 2009. They're available as a free download from Lulu, or you can buy a print-on-demand version. You can also look up the very active Twitter history at #digccurr.

I found it strangely hard to write up my notes from this meeting. I think it's because I'm still struggling with some aspects of the digital preservation problem space.

I absolutely agree that the activities of traditional archival practice have a place in the preservation of digital records. Where I found myself disagreeing with some presenters is in the balance between collecting and saving what we can versus an appraisal process to select what we will collect and save. In collection development for general collections, one often hears the argument that we can never know what might prove useful in the future, so it is a disservice to be too selective now. I guess that I have taken that point of view to heart, and I want to see our institutions cast as wide a net as possible for digital collections. If we don't grab it when we can, there will be nothing to select.

I also found myself bristling occasionally over the implied scope of the term "digital collections" as I most often heard that phrase used at the meeting. There was very much a focus on electronic records and the digital realm of personal papers. Of course there were some great discussions around multimedia, web sites, audio/video, and image collections, but what I pretty much never heard anybody mention was born-digital scholarship and teaching and learning materials.

My first web site preservation project was at the Harvard Design School in the late 1990s, where, while developing courseware software, I realized that we were losing the history of what we taught and the products of the courses as we overwrote sites every term. An institution's records include its lists of course offerings, course syllabi and reading lists, and, for some courses, the projects that the students created and put online in the course site. This was particularly true at a graduate school with programs in architecture, landscape architecture, and urban planning, where the studio courses produced important site-specific work and case studies that were often lost after every term. I felt so strongly about this that I launched a course site preservation project that would have involved retrieving sites off server archives. We were looking at using METS (in its early days) to map the sites. But, as often happens, I left before the project got very far along, and no one else felt as devoted to it as I did, so it stalled.

At UVA we launched a project called "Sustaining Digital Scholarship" to preserve born-digital scholarship, primarily in the humanities and social sciences. We instituted a technical assessment process and were working on documenting and migrating some major digital scholarly resources with varying strategies. That project is still going on in a limited way. It can take a lot of resources to assess and document a large digital archive.

That said, I was excited by some of the tools that I saw. ACE from the University of Maryland. MOPSEUS from Greece. The PARSE.Insight draft preservation roadmap. CASPAR for representation information. PLATO and Hoppla from Austria. LANL's ReMember Framework for OAI-ORE. CDL's Pairtree directory structure. Prometheus and MediaPedia from Australia. All very much worth looking into.
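For anyone who has not seen Pairtree, the idea is to turn an object identifier into a filesystem path by cleaning the identifier and splitting it into two-character directory names. The following is only a minimal sketch based on my reading of the Pairtree specification; the function names and the default root name passed as a parameter are my own choices for illustration, not part of any CDL library.

```python
from pathlib import PurePosixPath

# Characters that are hex-encoded as ^hh, in addition to any byte
# outside visible ASCII (0x21-0x7e). Assumption: this list follows
# the Pairtree specification's identifier-cleaning rules.
_HEX_ENCODE = set('"*+,<=>?\\^|')

# Single-character substitutions applied after hex encoding.
_SINGLE_CHAR = str.maketrans({"/": "=", ":": "+", ".": ","})


def clean_identifier(identifier: str) -> str:
    """Return a filesystem-safe form of an identifier (hypothetical helper)."""
    out = []
    for byte in identifier.encode("utf-8"):
        ch = chr(byte)
        if byte < 0x21 or byte > 0x7E or ch in _HEX_ENCODE:
            out.append(f"^{byte:02x}")
        else:
            out.append(ch)
    return "".join(out).translate(_SINGLE_CHAR)


def pairtree_path(identifier: str, root: str = "pairtree_root") -> str:
    """Map an identifier to a path of two-character directories under root."""
    cleaned = clean_identifier(identifier)
    parts = [cleaned[i:i + 2] for i in range(0, len(cleaned), 2)]
    return str(PurePosixPath(root, *parts))


if __name__ == "__main__":
    print(pairtree_path("ark:/13030/xt12t3"))
```

The appeal of the design is that the path can be computed from the identifier alone, and the identifier recovered from the path, without a database lookup. Note that the hex encoding has to happen before the single-character substitutions, so that a literal "+" in an identifier is not confused with a converted ":".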

There was also a thread in this meeting on the use of digital forensics, transitioning some tools and practices from legal digital forensics into archival digital forensics. This interested me very much and I intend to read up in this area.

Thursday, April 02, 2009

new flip book beta

From Peter Brantley on the OCA blog -- A new beta version of the Flipbook bookreader has been released as open source under a GNU license. The source code is available from the Open Library site.

Wednesday, April 01, 2009

LC/CLIR report on pre-1972 sound recording copyright

Excerpted from the press release:

Sound recordings were not protected by federal copyright law until 1972. A Library of Congress report indicates that the miscellany of state laws protecting pre-1972 sound recordings will extend copyright protection until 2067, creating a situation where some recordings dating to the 19th century are not available in public domain.

The Library announced today the completion of a commissioned report that examines copyright issues associated with unpublished sound recordings. This new report from the Library of Congress and the Council on Library and Information Resources addresses the question of what libraries and archives are legally empowered to do, under current laws, to preserve and make accessible for research their holdings of unpublished sound recordings made before 1972.

The report, "Copyright and Related Issues Relevant to Digital Preservation and Dissemination of Unpublished Pre-1972 Sound Recordings by Libraries and Archives," is one of a series of studies undertaken by the National Recording Preservation Board (NRPB), under the auspices of the Library of Congress. It was written by June Besek, executive director of the Kernochan Center for Law, Media and the Arts at Columbia University. The report is available free of charge at www.clir.org/pubs/abstract/pub144abst.html.