ERCOT/2021 Texas Power Crisis Twitter Dataset

This dataset contains Twitter JSON data for Tweets related to the Electric Reliability Council of Texas (ERCOT) during the 2021 Texas power crisis from February 10th through February 27th, 2021. The dataset was created using the twarc (https://github.com/edsu/twarc) package that makes use of Twitter's search API. A total of 612,082 Tweets make up the combined dataset.
Date: 2021-02-09/2021-02-24
Creator: Phillips, Mark Edward
System: The UNT Digital Library

#DiaperDon Twitter Dataset

This dataset contains Twitter JSON data for Tweets related to the hashtag #DiaperDon. This dataset was created using the twarc (https://github.com/edsu/twarc) package that makes use of Twitter's search API. A total of 866,987 Tweets make up the combined dataset.
Date: 2020-11-18/2020-12-01
Creator: Phillips, Mark Edward
System: The UNT Digital Library

Hurricane Ida Twitter Dataset

This dataset contains Twitter JSON data for Tweets related to Hurricane Ida, which was a deadly and destructive Category 4 Atlantic hurricane that made landfall in Louisiana in 2021. This dataset was created using the twarc (https://github.com/edsu/twarc) package that makes use of Twitter's search API. A total of 1,868,703 Tweets make up the combined dataset.
Date: 2021-08-20/2021-09-22
Creator: Phillips, Mark Edward
System: The UNT Digital Library

One Million Pages of Texas Newspapers: Dataset

This dataset represents the first million pages of Texas newspapers added to The Portal to Texas History as part of the Texas Digital Newspaper Program. The dataset consists of 123,184 newspaper issues from 569 titles, comprising 1,000,003 pages. Additionally, the 3,349,156 item uses associated with this dataset as of April 7, 2013 are included.
Date: April 7, 2013
Creator: Phillips, Mark Edward & Hicks, William
System: The UNT Digital Library

Water Quality Corridor Management for Restoration (WQCM-R) Modeling Dataset

The dataset was developed to support research intended to develop a spatially-explicit model that prioritizes riparian areas in terms of potential for ecosystem restoration specifically to improve water quality downstream of the riparian area, and ultimately improve drinking water quality. The model was developed and then tested on the Lewisville Lake watershed (north central Texas, just north of Dallas, Texas, USA). The dataset contains environmental data for 90 sub-watersheds that form the overall Lewisville Lake watershed with a corresponding identification map.
Date: June 10, 2019
Creator: Atkinson, Samuel F.
System: The UNT Digital Library

Quality Assurance Practices in Web Archiving [Dataset]

This dataset contains the results of a survey of quality assurance practices within the field of web archiving and its practitioners. To understand current QA practices, the authors surveyed institutions engaged in web archiving, which included national libraries, colleges and universities, and museums and art libraries. The survey was administered online. It includes the completed responses of 54 participants. The data has been anonymized for privacy reasons. This dataset was used in the "Current Quality Assurance Practices in Web Archiving" paper, available from the UNT Digital Library.
Date: December 2014
Creator: Reyes Ayala, Brenda; Phillips, Mark Edward & Ko, Lauren
System: The UNT Digital Library

Tropical Storm Imelda Twitter Dataset

This dataset contains Twitter JSON data for Tweets related to Tropical Storm Imelda and the subsequent flooding in the south Texas region. This dataset was created using the twarc (https://github.com/DocNow/twarc) package that makes use of Twitter's search API. A total of 76,420 Tweets and 4,429 media files make up the combined dataset.
Date: 2019-09-10/2019-09-21
Creator: Phillips, Mark Edward
System: The UNT Digital Library

Hurricane Dorian Twitter Dataset

This dataset contains Twitter JSON data for Tweets related to Hurricane Dorian, which is the most intense tropical cyclone on record to strike the Bahamas and is regarded as the worst natural disaster in the country's history. This dataset was created using the twarc (https://github.com/DocNow/twarc) package that makes use of Twitter's search API. A total of 3,000,553 Tweets and 84,216 media files make up the combined dataset.
Date: 2019-08-25/2019-09-14
Creator: Phillips, Mark Edward
System: The UNT Digital Library

Meeting Science for Academic Librarians

This dataset contains results from a survey of academic librarians about their experiences in meetings and their preferences for meeting components.
Date: August 10, 2020
Creator: Brannon, Sian & Leuzinger, Julie
System: The UNT Digital Library

Political Science Curriculum Map

This dataset provides a data analysis of how student learning objectives from PSCI syllabi map to threshold concepts from the ACRL Framework for Information Literacy for Higher Education (2016) and the AAC&U Information Literacy Value Rubric (2013). The data includes non-core courses offered from the Fall 2017 semester to the Spring 2020 semester. This data analysis is conducted every three years. This curriculum map excludes core courses, as they were previously examined in the UNT Libraries Core Curriculum Map.
Date: May 11, 2020
Creator: Henson, Brea
System: The UNT Digital Library

[Response Data: Survey of Benchmarks in Metadata Quality]

Complete, anonymized dataset of responses to the Survey of Benchmarks in Metadata Quality. Date, time, IP addresses, and geographic data have been omitted. Responses that included project, organization, and/or repository names were removed from this data, as well as potentially identifying names, acronyms, and/or links.
Date: July 2019
Creator: Digital Library Federation. Assessment Interest Group. Metadata Working Group. Benchmarks Sub-Group.
System: The UNT Digital Library

Labeled PDF Dataset from Texas Records and Information Locator (TRAIL) Web Archive

This dataset contains a random sample of 2000 PDF documents from the Texas Records and Information Locator (TRAIL) Web Archive from the Texas State Library and Archives Commission. Each PDF has been sorted into two categories, TX_Pub_In_Scope and Not_TX_Pub.
Date: July 2018
Creator: Tarver, Hannah & Phillips, Mark Edward
System: The UNT Digital Library

The Portal to Texas History's Texas State Publications Collection Dataset

This dataset contains a set of 2,448 PDF files from the Texas State Publications collection in The Portal to Texas History.
Date: September 12, 2018
Creator: Phillips, Mark Edward
System: The UNT Digital Library

[Response Data: Improving Subjects in the Digital Collections with Data Survey]

Complete, anonymized dataset of responses to the "Improving Subjects in the Digital Collections with Data" survey. Date, time, IP addresses, and geographic data have been omitted.
Date: August 2021
Creator: Tarver, Hannah; Miles, Chassidy & Zipperer, Rachael
System: The UNT Digital Library

Hurricane Laura Twitter Dataset

This dataset contains Twitter JSON data for Tweets related to Hurricane Laura, which formed August 20, 2020 and dissipated August 29, 2020. This dataset was created using the twarc (https://github.com/edsu/twarc) package that makes use of Twitter's search API. A total of 1,168,178 Tweets make up the combined dataset.
Date: 2020-08-18/2020-09-02
Creator: Phillips, Mark Edward
System: The UNT Digital Library

Ruth Bader Ginsburg Remembrance Twitter Dataset

This dataset contains Twitter JSON data for Tweets related to the passing of Ruth Bader Ginsburg on September 18, 2020. This dataset was created using the twarc (https://github.com/edsu/twarc) package that makes use of Twitter's search API. A total of 4,195,270 Tweets make up the combined dataset.
Date: 2020-09-10/2020-10-04
Creator: Phillips, Mark Edward
System: The UNT Digital Library

Labeled PDF Dataset from UNT.edu

This dataset contains a random sample of 2000 PDF documents from the Spring 2017 Web Archive of the unt.edu domain (https://digital.library.unt.edu/ark:/67531/metadc993363/) that have been sorted into two categories, ForRepo and NotForRepo.
Date: November 15, 2017
Creator: Andrews, Pamela & Phillips, Mark Edward
System: The UNT Digital Library

[U.S. Patent OCR Files: Disk USP001]

This dataset contains the compiled Optical Character Recognition (OCR) text files for the content of patent grants issued by the United States Patent Office from 1 to 469,664 (inclusive).
Date: 2013
Creator: Phillips, Mark Edward
System: The UNT Digital Library

UNT Scholarly Works PDF Dataset

This dataset contains a set of 4,534 PDF files from the UNT Scholarly Works collection, the institutional repository for UNT in the UNT Digital Library.
Date: September 12, 2018
Creator: Phillips, Mark Edward
System: The UNT Digital Library