After the Harvest: Preservation, Access, and Researcher Services for the 2016 End of Term Archive

This presentation discusses the End of Term Archive and covers four areas: methods for identifying and selecting in-scope content (including registries, indices, and crowdsourced URL nominations ["seeds"] submitted through a web application called the URL Nomination Tool); new strategies for capturing web content (including crawling, browser rendering, and social media tools); access models, including both an online portal and research datasets for use in computational analysis; and preservation data replication between partners using new export APIs and experimental tools developed as part of the IMLS-funded WASAPI project.
Date: December 13, 2016
Creator: Bailey, Jefferson; Grotke, Abigail & Phillips, Mark Edward
System: The UNT Digital Library

Extracting "Documents" from Web Archives

This presentation describes the process of using machine-learning algorithms to identify and extract publications from web archives.
Date: December 13, 2018
Creator: Phillips, Mark Edward
System: The UNT Digital Library