Sunday, November 16, 2008

university of texas repo atom use session at dlf

My somewhat unstructured notes from a presentation by Peter Keane from UT Austin on the use of Atom and Atom/Pub in their DASe repository, at the DLF Fall 2008 Forum.

  • DASe project: lightweight repository, 100+ collections, 1.2 million files, 3 million metadata records.
  • DASe has replaced their image reserves system. Home grown (“built instead of borrowed”), originally prototyped 2004/2005.
  • They didn’t originally plan to build a repository, they were building an image slideshow and ended up with a repository, too.
  • It’s a data-first application. Data comes from spreadsheets, FM, Flickr, iPhoto, file headers, etc. The system includes a variety of different collection-based data models, so they needed to map to/from standard schemas. Data is accepted as is, with no normalization or enrichment at all.
  • SynOA: Syndicated Oriented Architecture. They recognized the importance of being RESTful. DASe is a REST framework.
  • They use the Atom Publishing Protocol to represent collections, items, and searches. It is used internally between services, including upload and ingest (uses HTTP GET, POST, etc.). Everything is Atom with a UI (Smarty PHP templates) on top of it.
  • Working on a Blackboard integration.
  • Interesting use of Google spreadsheets – they create a Google spreadsheet for whatever they have as name/value pairs; it automatically outputs Atom, and they can ingest from the feed.
  • No fielded search across collections, only within a single collection. They could map across data models to a common standard, but haven’t. (corrected as per comment below)
  • Repositories were considered a door to libraries, with everyone trying to create a better door. This is not the right concept; instead, repositories should expose their content in a standard way to any and all services.
  • Loves REST; used the term “RESTafarian.”
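
To make the "everything is Atom" idea a little more concrete, here is a minimal sketch of how an item with name/value metadata could be written as an Atom entry and read back. This is not DASe's actual format: carrying each pair in an atom:category element, the example.org scheme URI, and the function names build_entry and read_metadata are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
# Hypothetical base URI for metadata field names (not from the talk).
SCHEME_BASE = "http://example.org/metadata/"

# Serialize Atom elements without a namespace prefix.
ET.register_namespace("", ATOM_NS)


def atom(tag):
    """Return the namespace-qualified name of an Atom element."""
    return f"{{{ATOM_NS}}}{tag}"


def build_entry(item_id, title, updated, author, metadata):
    """Build an Atom entry; each name/value pair becomes an atom:category.

    Assumption: field name goes in the scheme URI, value goes in the term.
    """
    entry = ET.Element(atom("entry"))
    ET.SubElement(entry, atom("id")).text = item_id
    ET.SubElement(entry, atom("title")).text = title
    ET.SubElement(entry, atom("updated")).text = updated
    author_el = ET.SubElement(entry, atom("author"))
    ET.SubElement(author_el, atom("name")).text = author
    for name, value in metadata.items():
        ET.SubElement(
            entry,
            atom("category"),
            {"scheme": SCHEME_BASE + name, "term": value},
        )
    return entry


def read_metadata(entry):
    """Recover the name/value pairs from an entry built by build_entry."""
    pairs = {}
    for category in entry.findall(atom("category")):
        name = category.get("scheme", "").rsplit("/", 1)[-1]
        pairs[name] = category.get("term")
    return pairs


if __name__ == "__main__":
    entry = build_entry(
        "http://example.org/item/1",
        "Sample image",
        "2008-11-16T00:00:00Z",
        "Example Author",
        {"medium": "oil painting", "creator": "Unknown"},
    )
    xml_text = ET.tostring(entry, encoding="unicode")
    print(xml_text)
    print(read_metadata(ET.fromstring(xml_text)))
```

Under the Atom Publishing Protocol, a client would typically add such an entry by sending it in an HTTP POST to a collection URI with the media type application/atom+xml;type=entry, which is roughly the kind of exchange the upload and ingest services above would be making.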

1 comment:

pkeane said...

Hi Leslie-

I think you summarized the main points just right. One correction, though (and I suspect I didn't make this clear enough): you can search across collections -- it's the most common method of searching. But the searches are not "fielded." I.e., a search for "oil painting" will return any item with the terms "oil" and "painting" in any metadata field, in any collection. Since precision is less important than recall in this system (it's not so big as to produce lots of "noise"), this works fine. Searching w/in a particular field is generally available only when searching in a single collection (and is seldom used).

--Peter Keane, Univ of Texas at Austin