OR08 Publications

SOAPI - a flexible toolkit for implementing ingest and preservation workflows

Hedges, M. (2008) SOAPI - a flexible toolkit for implementing ingest and preservation workflows. In: Third International Conference on Open Repositories 2008, 1-4 April 2008, Southampton, United Kingdom.



Digital material held in repositories is growing significantly, not only in volume but also in diversity of format and complexity of structure. This is particularly the case for repositories based on Fedora, with its highly flexible content model architecture. This data represents a major investment and in many cases is irreplaceable; there is thus a pressing need to implement preservation functionality in repositories to ensure long-term access, and for reasons of scalability this must be automated as far as possible. The SOAPI project is addressing this need by developing a flexible toolkit for implementing automated and semi-automated workflows that support the preservation of complex digital material in repositories. Our approach is to develop not a monolithic tool but a set of modular web services, each encapsulating a well-defined unit of preservation functionality, which can be configured and combined to produce workflows. The workflows are implemented using the jBPM workflow engine, and generic interfaces are defined for particular categories of action to allow simple plug-and-play of specialised third-party software, e.g. for file format conversion and metadata extraction. Workflows can thus be created for, e.g., (i) ingesting compound digital objects, checking object integrity, generating metadata, normalising files to suitable preservation formats, and outputting Archival Information Packages in the form of METS (or FOXML) documents, or (ii) post-ingest preservation activities, such as event-driven workflows that verify the fixity of digital objects, check for format obsolescence, and take remedial action if necessary.
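The idea of small, interchangeable preservation steps behind a generic interface can be illustrated with a minimal sketch. This is not SOAPI's API and does not use jBPM or web services: it assumes plain in-process Python functions, and every name in it (DigitalObject, PreservationAction, record_checksum, extract_metadata, verify_fixity, run_workflow) is hypothetical. SHA-256 is likewise an assumed choice of fixity algorithm.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Protocol


@dataclass
class DigitalObject:
    """Hypothetical in-memory stand-in for an object held in a repository."""
    identifier: str
    content: bytes
    metadata: Dict[str, str] = field(default_factory=dict)
    events: List[str] = field(default_factory=list)


class PreservationAction(Protocol):
    """Assumed generic interface: each step takes an object and returns it."""

    def __call__(self, obj: DigitalObject) -> DigitalObject: ...


def record_checksum(obj: DigitalObject) -> DigitalObject:
    # Store a SHA-256 digest at ingest so later checks have a reference value.
    obj.metadata["sha256"] = hashlib.sha256(obj.content).hexdigest()
    obj.events.append("checksum-recorded")
    return obj


def extract_metadata(obj: DigitalObject) -> DigitalObject:
    # Placeholder for a third-party extractor; here it only records the size.
    obj.metadata["size_bytes"] = str(len(obj.content))
    obj.events.append("metadata-extracted")
    return obj


def verify_fixity(obj: DigitalObject) -> DigitalObject:
    # Recompute the digest and compare it with the value stored at ingest.
    actual = hashlib.sha256(obj.content).hexdigest()
    matches = obj.metadata.get("sha256") == actual
    obj.events.append("fixity-ok" if matches else "fixity-failed")
    return obj


def run_workflow(obj: DigitalObject,
                 steps: List[PreservationAction]) -> DigitalObject:
    # Apply each step in order; any callable with the same shape can be
    # swapped in, which is the plug-and-play property being illustrated.
    for step in steps:
        obj = step(obj)
    return obj


if __name__ == "__main__":
    item = DigitalObject("demo:1", b"example content")
    item = run_workflow(item, [record_checksum, extract_metadata, verify_fixity])
    print(item.events)
```

In this sketch an ingest workflow and a post-ingest fixity check differ only in the list of steps passed to run_workflow. A workflow engine would add what the sketch leaves out, such as persistence, event triggers, and remedial branches.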

Item Type: Conference or Workshop Item (Paper)
Creators: Mark Hedges
Subjects: User Groups > Fedora User Group > Preservation and management
ID Code: 97
Deposited By: Leslie Carr
Deposited On: 09 Apr 2008 07:22
Last Modified: 26 Oct 2011 16:08


