Matt Kirschenbaum has an interesting article in the Chronicle of Higher Education -- “Hamlet.doc? Literature in a Digital Age.”

Dan Cohen comments on Matt's observations on how technology such as change tracking creates new possibilities for understanding the creative process, and how important standards will become. Another part of the article resonated even more with me:

The implications here extend beyond scholarship to a need to reformulate our understanding of what becomes part of the public cultural record. If an author donates her laptop to a library, what are the boundaries of the collection? Old e-mail messages, financial records, Web-browser history files? Overwritten or erased data that is still recoverable from the hard drive? Since computers are now ground zero for so many aspects of our daily lives, the boundaries between our creative endeavors and more mundane activities are not nearly as clear as they might once have been in a traditional set of author's "papers." Indeed, what are the boundaries of authorship itself in an era of blogs, wikis, instant messaging, and e-mail? Is an author's blog part of her papers? What about a chat transcript or an instant message stored on a cellphone? What about a character or avatar the author has created for an online game? The question is analogous to Foucault's famous provocation about whether Nietzsche's laundry list ought to be considered part of his complete works, but the difference is not only in the extreme volume and proliferation of data but also in the relentless way in which everything on a computer operating system is indexed, stamped, quantified, and objectified.

I remember the discussion of boundaries when we first started talking about archiving web sites. Where does a web site "end" when it links to other sites? Within the same subdomain? Within the same domain? Do you include the pages that are linked to in other sites because they might provide important context?

Many years ago I served on the board of directors of a professional organization. As part of the organizational archive, I was asked to supply my print files, electronic documents, and my email archives when my service ended. At the time I was an obsessive file archiver and I could supply all my email from four different email addresses and two different environments (CompuServe and Eudora) as well as many snapshots of document versions and web sites over a seven-year period. But those were official versions. Would I want every awkward draft of a report or a brochure saved for posterity? Is that really part of the organization's history?

While I think a lot about privacy and what an author might/should restrict access to (short-term or long-term) when leaving behind their digital legacy, there is so much potential for research. How does working on digital financial records differ from studying account ledgers? How does studying email differ from studying written correspondence or memoranda? Or blogs versus published editorials? They're the same research activities, just different media. Again, from Matt's article:

The wholesale migration of literature to a born-digital state places our collective literary and cultural heritage at real risk. But for every problem that electronic documents create — problems for preservation, problems for access, problems for cataloging and classification and discovery and delivery — there are equal, and potentially enormous, opportunities. What if we could use machine-learning algorithms to sift through vast textual archives and draw our attention to a portion of a manuscript manifesting an especially rich and unusual pattern of activity, the multiple layers of revision captured in different versions of the file creating a three-dimensional portrait of the writing process? What if these revisions could in turn be correlated with the content of a Web site that someone in the author's MySpace network had blogged?

Yes, there are definitely issues in accessing file formats as they age. When I rediscovered my single-sided original Mac disks from the mid-80s with my MA research and thesis written in MacWrite 1.0, or 5 1/4" disks with documentation that I wrote in 1991 in WordPerfect, I had to call in favors from folks with vintage Mac and PC hardware and buy Conversions Plus software to get at the file content (not fully successfully). I was incredibly lucky that the media could be read at all, let alone that the files could be converted. Let us not even speak of the versions of files over time that I lost on Mac Zip disks that were accidentally discarded in a move. There went part of the history of the organization that I mentioned above.

There is a lot of education needed about preserving digital output and the file and media standards to be used. I look forward to seeing the work of Maryland's X-Lit project.

