Keeping checkpoint/restart viable for exascale systems. (open access)

Next-generation exascale systems, those capable of performing a quintillion (10^18) operations per second, are expected to be delivered in the next 8-10 years. These systems, which will be 1,000 times faster than current systems, will be of unprecedented scale. As these systems continue to grow in size, faults will become increasingly common, even over the course of small calculations. Therefore, issues such as fault tolerance and reliability will limit application scalability. Current techniques to ensure progress across faults, such as checkpoint/restart, the dominant fault tolerance mechanism for the last 25 years, are increasingly problematic at the scales of future systems due to their excessive overheads. In this work, we evaluate a number of techniques to decrease the overhead of checkpoint/restart and keep this method viable for future exascale systems. More specifically, this work evaluates state-machine replication to dramatically increase the checkpoint interval (the time between successive checkpoints) and hash-based, probabilistic incremental checkpointing using graphics processing units to decrease the checkpoint commit time (the time to save one checkpoint). Using a combination of empirical analysis, modeling, and simulation, we study the costs and benefits of these approaches on a wide range of parameters. These results, which cover a number of high-performance …
Date: September 1, 2011
Creator: Riesen, Rolf E.; Bridges, Patrick G. (IBM Research, Ireland, Mulhuddart, Dublin); Stearley, Jon R.; Laros, James H., III; Oldfield, Ron A.; Arnold, Dorian (University of New Mexico, Albuquerque, NM) et al.
Object Type: Report
System: The UNT Digital Library
Workshop on the role of natural analogs in geologic disposal of high-level nuclear waste: Proceedings (open access)

A Workshop on the Role of Natural Analogs in Geologic Disposal of High-Level Nuclear Waste was held in San Antonio, Texas on July 22-25, 1991. The proceedings comprise seventeen papers submitted by participants at the workshop. A series of papers addresses the relation of natural analog studies to the regulation, performance assessment, and licensing of a geologic repository. Applications of reasoning by analogy are illustrated in papers on the role of natural analogs in studies of earthquakes, petroleum, and mineral exploration. A summary is provided of a recently completed, internationally coordinated natural analog study at Poços de Caldas, Brazil. Papers also cover problems and applications of natural analog studies in four technical areas of nuclear waste management: waste form and waste package, near-field processes and environment, far-field processes and environment, and volcanism and tectonics. Summaries of working group deliberations in these four technical areas provide reviews and proposals for natural analog applications. Individual papers have been cataloged separately.
Date: September 1, 1995
Creator: Kovach, L.A. & Murphy, W.M.
Object Type: Article
System: The UNT Digital Library