Wednesday, December 27, 2006

5 things you don't know about me

I don't often do memes, but ...
1. My first library job was in the 4th and 5th grades. I had to pass a test on alphabetization and the top-level Dewey Decimal classes to work as a shelver and at the circ desk at my elementary school library.
2. During my freshman and sophomore years of college I worked at a Baskin Robbins in Los Angeles. Among my duties were cake decorating and making ice cream cakes and pies. I could still make a grasshopper pie if asked.
3. The previous item wouldn't be peculiar if it weren't for the fact that I'm lactose intolerant.
4. I trained as an archaeologist in graduate school and worked for museums for many years before moving into library work. All my work was focused on digital collections and automation, so my transition to digital libraries is not so odd. My first job while in graduate school was a recon project to transcribe written acquisitions records into a database, and to create digital images of a major Moche pottery collection. In that previous life I also spent seven years on the board of the Museum Computer Network.
5. I collect Mexican folk art. Not in a systematic way, but when I see things that I really like it's hard to stop myself from buying them. I'm looking forward to hitting some galleries while in San Antonio for the Open Repositories 2007 conference.

Thursday, December 21, 2006

new for uva dl

For years I've been forwarding notices on articles, reports, sites, and what-not to various lists and groups at the Library. My colleague Cyril pointed out the obvious to me the other day when he and Ronda were presenting a session for library staff on -- that while the email messages were useful, an annotated and tagged set would be even more useful (and more persistent than messages in folks' inboxes).

Given that it's the week before Christmas and work was winding down for the break, I went through three years of outgoing messages to particular internal email lists (yes, I'm a compulsive email hoarder) and created a set:

I have a lot more to add, it needs some work on the tags, and I haven't created any bundles or set up any networks yet, but it's a start at pulling together things that I find useful and think my colleagues should know about.

Monday, December 18, 2006


I've spent way too much of my day exploring OCLC's FictionFinder prototype. Read more about the project.

The subject tag cloud that you encounter when first entering the system intrigues me -- the most commonly used subject appears to be "Marriage." I followed the subject "Missing children," vaguely thinking that I might encounter From the Mixed-up Files of Mrs. Basil E. Frankweiler. Nope. I searched for it and found that it's actually the subject "Runaway children." I wonder if I could have found it without knowing the title or that subject term? How do you know what subject term is the right one when browsing or searching? When is something under "Quakers" and when is it under "Society of Friends"?

I think that defaulting to genre for browsing is the right choice. It's a manageable length for browsing (at least for now), while the "subjects" list is quite long and "characters" is huge. See Thom Hickey's post on the difficulty in creating that character list. The awards list frustrated me momentarily -- I had to remember that the "Edgar" awards are actually the Mystery Writers of America awards and look under M.

The "settings" browse list results could be frustrating for some. I clicked on Mexico and found books where the subject was actually "New Mexico." It was great to see that books where the subject was "New Mexico -- Santa Fe" showed up under "New Mexico."

I searched on "voodoo," which is not the official subject term (it's voodooism, if you care to know). I got books where voodooism is a subject. I got books where voodoo is in the title. And I got books where voodoo is in the description, such as "By the author of Voodoo, Ltd." I know it's a tough problem to index the assigned terms and other fields where relevant subject topics might be found.

The FRBRization display for a work is a sensible one. Who knew that Gaston Leroux's Fantôme de l'Opéra was available in Thai? Nice to see that I could follow the edition into WorldCat to see that Cornell owns it.

For a single author, H. P. Lovecraft, I did come across many examples of works that should have been identified as the same but were not. When I had the same experience in LibraryThing some months ago, I spent some time cleaning up the work relationships. I wonder why his works are difficult to identify and combine programmatically?

How do I combine browse types? I'm looking for books set in England that feature ghosts. There's an advanced search but no advanced browse. Maybe something like Amazon, where one can narrow within facets: jewelry --> rings --> gold --> emerald.
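To make the idea concrete, here is a minimal sketch of what combining browse facets could look like. It is only an illustration: the records, the field names (settings, subjects), and the narrow function are all made up, and none of it reflects how FictionFinder is actually built.

```python
# Hypothetical records; titles, field names, and values are invented for illustration.
books = [
    {"title": "Book A", "settings": {"England"}, "subjects": {"Ghosts"}},
    {"title": "Book B", "settings": {"England"}, "subjects": {"Marriage"}},
    {"title": "Book C", "settings": {"Mexico"}, "subjects": {"Ghosts"}},
]


def narrow(records, **facets):
    """Keep only the records that carry every requested facet value."""
    return [
        record
        for record in records
        if all(value in record.get(facet, set()) for facet, value in facets.items())
    ]


# Books set in England that feature ghosts.
matches = narrow(books, settings="England", subjects="Ghosts")
print([book["title"] for book in matches])
```

Each additional facet simply shrinks the list further, which is the same narrowing behavior as the Amazon example.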

I don't want this to sound like a rant, because it isn't. I think this is a really promising prototype, both as a FRBR experiment and as a subject browse environment. The fact that you can generate the browse lists at all is exciting.

current issue of D-Lib

The December issue of D-Lib has two articles in particular that I found well worth my time.

The first is David Bearman's review of Jean-Noël Jeanneney's Google and the Myth of Universal Knowledge: A View from Europe. I've known David almost twenty years and I always find his issue pieces thoughtful.

The second is a very interesting article on the proposed draft audit checklist for repositories and OAIS. The Audit Checklist is still a draft after maybe two years. The outcome presented in this article is that even after annotating the checklist for use in an NDIIPP project, there were still issues in scoring the results and interpreting them.

We began this process by annotating the Audit Checklist and enlisting our team members to gauge their software installation experiences against it. Currently we are concluding a series of meetings to reach a consensus on the interpretation of checklist items. Using a test example scenario, we also experimented with applying an existing scoring instrument to the annotated Audit Checklist. This was an exercise that clarified the need for a more meticulous refinement of our annotated Audit Checklist, one that should be undertaken with the developers of the common repository software applications. Our experience thus far suggests that the application of 'weights' to the Audit Checklist items, specifically according to an institution's own needs and priorities, may also provide a framework for guiding a reiterative self-assessment process of an institution's repository services. Aside from this, as more institutions explore the possibility of providing trustworthy digital repository services, the evaluation of repository software applications increasingly will necessitate a more extensive, community-based expression of technical functional specifications needed to support the requirements of Trusted Digital Repositories. With an ever increasing array of potential software tools, services, and infrastructure configurations, the time is ripe for an evaluative approach to repository software that considers the array of items found in the Audit Checklist.
Array is right. The checklist has 86 items and four possible scores for each. This instrument is challenging to use, and exceptionally experienced and well-qualified people still have trouble agreeing on how to score it. Such a tool is definitely needed -- why is it so hard to design one?
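For what it's worth, the arithmetic of applying institution-specific weights is the easy part. The sketch below assumes a 0-3 scale for the four possible scores and uses three invented items with invented weights; the actual checklist items, scale, and scoring instrument from the article are not reproduced here.

```python
def weighted_score(scores, weights, max_score=3):
    """Return the weighted total as a fraction of the maximum possible total."""
    if scores.keys() != weights.keys():
        raise ValueError("every scored item needs a weight, and vice versa")
    total = sum(scores[item] * weights[item] for item in scores)
    possible = sum(max_score * weight for weight in weights.values())
    return total / possible


# Three made-up items standing in for the 86 on the checklist.
# Scores assume a hypothetical 0-3 scale; weights reflect one institution's priorities.
scores = {"item_1": 3, "item_2": 1, "item_3": 2}
weights = {"item_1": 2.0, "item_2": 1.0, "item_3": 1.0}

print(round(weighted_score(scores, weights), 2))
```

The hard part, as the article makes clear, is getting people to agree on what goes into scores and weights in the first place.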