System For Preservation of Electronic Resources (SPER)


SPER is part of the Digital Preservation Research project at the Lister Hill Center’s Communications Engineering Branch. Its main objective is to support the cost-effective, long-term preservation of digitized or born-digital documents at the National Library of Medicine.

As part of ongoing research, SPER provides a testbed for exploring and experimenting with important digital preservation standards, tools, and techniques. It also includes a prototype system that performs actual preservation of digital documents in a convenient manner, using selected open-source tools. An important component of SPER’s preservation function is the automated extraction of metadata from textual documents using machine learning tools, which significantly lowers the cost of metadata acquisition compared with manual input.

The following sections describe the SPER preservation framework (also called SPER for simplicity) and its automated metadata extraction component.


