Statistical Modeling of Large-Scale Scientific Simulation Data (open access)

Statistical Modeling of Large-Scale Scientific Simulation Data

With the advent of massively parallel computer systems, scientists are now able to simulate complex phenomena (e.g., explosions of a stars). Such scientific simulations typically generate large-scale data sets over the spatio-temporal space. Unfortunately, the sheer sizes of the generated data sets make efficient exploration of them impossible. Constructing queriable statistical models is an essential step in helping scientists glean new insight from their computer simulations. We define queriable statistical models to be descriptive statistics that (1) summarize and describe the data within a user-defined modeling error, and (2) are able to answer complex range-based queries over the spatiotemporal dimensions. In this chapter, we describe systems that build queriable statistical models for large-scale scientific simulation data sets. In particular, we present our Ad-hoc Queries for Simulation (AQSim) infrastructure, which reduces the data storage requirements and query access times by (1) creating and storing queriable statistical models of the data at multiple resolutions, and (2) evaluating queries on these models of the data instead of the entire data set. Within AQSim, we focus on three simple but effective statistical modeling techniques. AQSim's first modeling technique (called univariate mean modeler) computes the ''true'' (unbiased) mean of systematic partitions of the data. AQSim's …
Date: November 15, 2003
Creator: Eliassi-Rad, T.; Baldwin, C.; Abdulla, G. & Critchlow, T.
System: The UNT Digital Library