Leveraging Machine Learning to Extract Content-Rich Publications from Web Archives

Presentation for the 2019 International Internet Preservation Consortium General Assembly and Web Archiving Conference. This presentation discusses research into leveraging machine learning to identify PDFs relevant to a collection from archived records.
Date: June 6, 2019
Creator: Phillips, Mark Edward; Caragea, Cornelia; Patel, Krutarth & Fox, Nathaniel T.
System: The UNT Digital Library