Hurricane Florence Twitter Dataset

This dataset contains Twitter JSON data for Tweets related to Hurricane Florence and the subsequent flooding along the Carolina coastal region. It was created using the twarc (https://github.com/edsu/twarc) package, which uses Twitter's search API. A total of 4,971,575 Tweets and 347,205 media files make up the combined dataset.
Date: 2018-09-05/2018-10-03
Creator: Phillips, Mark Edward
System: The UNT Digital Library

2018 Texas Senate Debate Twitter Dataset

This dataset contains Twitter JSON data for Tweets related to the United States Senate race between Beto O'Rourke and Ted Cruz, captured around their first debate on September 21, 2018. It was created using the twarc (https://github.com/edsu/twarc) package, which uses Twitter's search API. A total of 3,006,198 Tweets and 101,050 media files make up the combined dataset.
Date: 2018-09-12/2018-10-03
Creator: Phillips, Mark Edward
System: The UNT Digital Library

Unlabeled PDF Dataset of Technical Reports USDA.gov domain in the EOT 2008 Web Archive

This dataset contains a sample of 10,000 PDF documents from the usda.gov domain in the End of Term (EOT) 2008 Web Archive. These samples are unlabeled and uncategorized.
Date: September 12, 2018
Creator: Phillips, Mark Edward
System: The UNT Digital Library

The Portal to Texas History's Texas State Publications Collection Dataset

This dataset contains a set of 2,448 PDF files from the Texas State Publications collection in The Portal to Texas History.
Date: September 12, 2018
Creator: Phillips, Mark Edward
System: The UNT Digital Library

UNT Scholarly Works PDF Dataset

This dataset contains a set of 4,534 PDF files from the UNT Scholarly Works collection, the institutional repository for UNT in the UNT Digital Library.
Date: September 12, 2018
Creator: Phillips, Mark Edward
System: The UNT Digital Library