Thursday, August 13, 2009

metaphors

My colleague Thorny Staples often uses the metaphor that digital humanities projects are, at their most basic level, online exhibitions. Curated content is presented with key descriptive information not unlike exhibition tombstone labels and contextualized through categorization and by scholarly essays of varying lengths as well as site information architecture (not unlike rooms of an exhibition with wall texts). The end results include the identification and explication of relationships and the presentation of deep readings of objects. That metaphor always resonated with me.

In a recent discussion, a small group was trying to work out some generalized models for the processes we follow from the receipt/creation of digital files through to providing access. We were having a particularly lengthy discussion about description and contextualization -- at what point in a digital file's life cycle is it related to other files and identified as a digital object, and at what point is some sort of intellectual meaning overlaid onto that digital object?

My new colleague Terry Harrison -- a big fan of using metaphors -- commented that when museums acquire objects they cannot know every context in which the object will be exhibited or published in the future, but they acquire it and put effort into description and conservation to prepare for future display/publication when the object will be contextualized many times over.

This sent me down the road to a metaphor that's still developing in my head, which may not yet translate to something that anyone besides me thinks is sensible. Or it may not be sensible at all.

First, I'm starting with an assumption that there are four very broad categories of activities that we need to describe (leaving out "preservation" for now). On the museum side, it's these:

Acquisition: Items are proposed, selected, and acquired
Accessioning: Items have accession numbers assigned, are assigned storage locations, relationships between parts are identified (a tea set is made up of individual components), and basic descriptive information is recorded in a registration system
Preparation: Items are cleaned, repaired, mounted, framed, or otherwise stabilized and made ready for research use and public viewing
Exhibition: Items are further described and presented in the context identified by a collection or exhibition curator; an object will be exhibited many times and assigned to multiple contexts

This roughly translates to the following in the digital realm:

Creation/Transfer: Selection and digitization or transfer of digital (master?) files to an institution
Inventory: Files are assigned identifiers/names and placed into some sort of meaningful (or not) storage location in a server environment
Processing: QA, manipulation, derivative creation
Access: Making content discoverable and usable, which can include a curator providing context and intellectual overlays for objects (not files)

I'm having one real issue in making this metaphor work for me and for others, and that's around the creation of metadata and recording of file relationships. At what point is the relationship of files to each other recorded? Is the creation of metadata identifying/describing an intellectual object part of inventory, processing, or access? When is the relationship of files to that intellectual object recorded?

I think that inventorying should include a step whereby the relationships between files are recorded so it is recognizable that some set of 300 files go together. There wasn't a lot of pushback on this in our discussion. What did engender a lot of discussion was when descriptive metadata for an intellectual object is created and when the relationships of files to an intellectual object are recorded. I personally think that descriptive metadata for intellectual objects represented by those files is also created during the inventory stage, and that files in hand at that stage should in some way be associated with the intellectual objects at that time.

This is complicated because recording all the relationships of files to intellectual objects is not fully possible until objects are prepared and added to an access application. That's where the contextualization happens, so one can argue that that is where intellectual objects are truly defined and where the process of associating files to objects takes place. Preparation is driven by access. If access applications are siloed at all, each might use different derivative files, and there has to be some association of those derivatives to the master and to the intellectual objects.

So, we have master files, derivative files (possibly multiple sets over time per access point), intellectual object metadata, relationships of all files to each other and to that intellectual object, and the need to inventory and manage all of the above, which may be separate from an access application or multiple access points. Where is this recorded, in what order, and how do we describe these activities? I'm struggling with that part of the metaphor/model.
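One way to make the pieces listed above concrete is a small sketch of what an inventory kept apart from any access application might record. The Python below is only an illustration under assumptions: the names FileRecord, IntellectualObject, Inventory, derived_from, and access_point are hypothetical and do not describe any system mentioned here, and it takes one possible position (relationships are recorded in the inventory) rather than settling the question.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FileRecord:
    """One inventoried file: a master or a derivative (hypothetical model)."""
    file_id: str
    path: str
    role: str                           # "master" or "derivative"
    derived_from: Optional[str] = None  # file_id of the master, if a derivative
    access_point: Optional[str] = None  # access application that uses this file


@dataclass
class IntellectualObject:
    """Descriptive record for the thing a set of files represents."""
    object_id: str
    title: str
    file_ids: List[str] = field(default_factory=list)


class Inventory:
    """Keeps files, objects, and their relationships outside any access app."""

    def __init__(self) -> None:
        self.files: Dict[str, FileRecord] = {}
        self.objects: Dict[str, IntellectualObject] = {}

    def add_file(self, record: FileRecord) -> None:
        # A derivative must point back to a master that is already inventoried.
        if record.role == "derivative" and record.derived_from not in self.files:
            raise ValueError("derivative refers to an unknown master file")
        self.files[record.file_id] = record

    def add_object(self, obj: IntellectualObject) -> None:
        self.objects[obj.object_id] = obj

    def associate(self, object_id: str, file_id: str) -> None:
        # Record that a file belongs to an intellectual object.
        if file_id not in self.files:
            raise KeyError(file_id)
        self.objects[object_id].file_ids.append(file_id)

    def files_for(self, object_id: str,
                  access_point: Optional[str] = None) -> List[FileRecord]:
        # All files for an object, optionally limited to one access point.
        records = [self.files[f] for f in self.objects[object_id].file_ids]
        if access_point is None:
            return records
        return [r for r in records if r.access_point == access_point]
```

In this sketch the master can be associated with its intellectual object at the inventory stage, while derivatives for a given access point are added and associated later, whenever preparation for that access point happens.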

How did this conversation arise? Well, we're trying to scope out some future directions and activities, and a shared understanding of the model for the activities we support is vital. Mine is not the only model proposed and it just may not be right. I'm sharing this as much for my own process as anything else.