Thursday, September 27, 2007


I spent a little time this afternoon reading up on the newly released digital preservation tool Xena.

You can point it at a directory of diverse file types and it will convert the files into normalized open formats. The list of supported formats and the conversion outcomes are available in the help docs.

This is potentially a really useful workflow tool, but there's a lot to examine here. I don't know how scriptable it is. You can write plugins to add new formats, but I'm not yet sure whether you can change conversion decisions and alter the target formats. Why is the target format for pretty much every image format PNG? Could we change that to TIFF or JPEG2000 if we were willing to write the plugin? It runs on Windows and Linux and requires OpenOffice. On Linux, does it require a graphical environment, or can you run it from the command line?

I'm thinking that this could be really useful for an institutional repository (IR), but I'm not yet sure whether it will scale for Library-wide preservation or collection repositories.


Dorothea said...

I share your hopes and your doubts.

Tim Donohue wrote a gizmo to do similar work automagically in a DSpace repository, again with an OpenOffice dependency (though no GUI required). Worth looking into; Tim is a solid coder.

Leslie Johnston said...

Our preservation librarian plans to test it with a recently reformatted audio collection. I'll let you know how her test goes.