Resource Type

Month

3 Matching Results

Results open in a new window/tab.

Assembly of 500,000 inter-specific catfish expressed sequence tags and large scale gene-associated marker development for whole genome association studies (open access)

Assembly of 500,000 inter-specific catfish expressed sequence tags and large scale gene-associated marker development for whole genome association studies

Background-Through the Community Sequencing Program, a catfish EST sequencing project was carried out through a collaboration between the catfish research community and the Department of Energy's Joint Genome Institute. Prior to this project, only a limited EST resource from catfish was available for the purpose of SNP identification. Results-A total of 438,321 quality ESTs were generated from 8 channel catfish (Ictalurus punctatus) and 4 blue catfish (Ictalurus furcatus) libraries, bringing the number of catfish ESTs to nearly 500,000. Assembly of all catfish ESTs resulted in 45,306 contigs and 66,272 singletons. Over 35percent of the unique sequences had significant similarities to known genes, allowing the identification of 14,776 unique genes in catfish. Over 300,000 putative SNPs have been identified, of which approximately 48,000 are high-quality SNPs identified from contigs with at least four sequences and the minor allele presence of at least two sequences in the contig. The EST resource should be valuable for identification of microsatellites, genome annotation, large-scale expression analysis, and comparative genome analysis. Conclusions-This project generated a large EST resource for catfish that captured the majority of the catfish transcriptome. The parallel analysis of ESTs from two closely related Ictalurid catfishes should also provide powerful means for the …
Date: March 23, 2010
Creator: Catfish Genome Consortium
System: The UNT Digital Library
AutomaDeD: Automata-Based Debugging for Dissimilar Parallel Tasks (open access)

AutomaDeD: Automata-Based Debugging for Dissimilar Parallel Tasks

Today's largest systems have over 100,000 cores, with million-core systems expected over the next few years. This growing scale makes debugging the applications that run on them a daunting challenge. Few debugging tools perform well at this scale and most provide an overload of information about the entire job. Developers need tools that quickly direct them to the root cause of the problem. This paper presents AutomaDeD, a tool that identifies which tasks of a large-scale application first manifest a bug at a specific code region at a specific point during program execution. AutomaDeD creates a statistical model of the application's control-flow and timing behavior that organizes tasks into groups and identifies deviations from normal execution, thus significantly reducing debugging effort. In addition to a case study in which AutomaDeD locates a bug that occurred during development of MVAPICH, we evaluate AutomaDeD on a range of bugs injected into the NAS parallel benchmarks. Our results demonstrate that detects the time period when a bug first manifested itself with 90% accuracy for stalls and hangs and 70% accuracy for interference faults. It identifies the subset of processes first affected by the fault with 80% accuracy and 70% accuracy, respectively and the …
Date: March 23, 2010
Creator: Bronevetsky, G; Laguna, I; Bagchi, S; de Supinski, B R; Ahn, D & Schulz, M
System: The UNT Digital Library
Statistical Fault Detection for Parallel Applications with AutomaDeD (open access)

Statistical Fault Detection for Parallel Applications with AutomaDeD

Today's largest systems have over 100,000 cores, with million-core systems expected over the next few years. The large component count means that these systems fail frequently and often in very complex ways, making them difficult to use and maintain. While prior work on fault detection and diagnosis has focused on faults that significantly reduce system functionality, the wide variety of failure modes in modern systems makes them likely to fail in complex ways that impair system performance but are difficult to detect and diagnose. This paper presents AutomaDeD, a statistical tool that models the timing behavior of each application task and tracks its behavior to identify any abnormalities. If any are observed, AutomaDeD can immediately detect them and report to the system administrator the task where the problem began. This identification of the fault's initial manifestation can provide administrators with valuable insight into the fault's root causes, making it significantly easier and cheaper for them to understand and repair it. Our experimental evaluation shows that AutomaDeD detects a wide range of faults immediately after they occur 80% of the time, with a low false-positive rate. Further, it identifies weaknesses of the current approach that motivate future research.
Date: March 23, 2010
Creator: Bronevetsky, G; Laguna, I; Bagchi, S; de Supinski, B R; Ahn, D & Schulz, M
System: The UNT Digital Library