
Current Projects

APM - Advanced Performance Model

To improve the efficiency of resource utilization and the scheduling of scientific data transfers on high-speed networks, we started a project on Advanced Performance Modeling with combined passive and active monitoring (APM). The project investigates and develops a general-purpose, reusable, and expandable framework for network performance estimation. The predictive estimation model and the framework will help optimize the performance and utilization of fast networks, and will help scientific collaborations share resources with predictable performance, especially in data-intensive applications.

ADAPT: Adaptive Data access and Policy-driven Transfers

Large-scale science applications are expected to generate exabytes of data over the next 5 to 10 years. With scientific data collected at unprecedented volumes and rates, the success of large scientific collaborations will require that they provide a large user community with distributed data access that has lower latency and higher reliability. To meet these requirements, scientific collaborations are increasingly replicating large datasets over high-speed networks to multiple sites. The main objective of this work is to develop and deploy a general-purpose data access framework for scientific collaborations that provides lightweight performance monitoring and estimation, fine-grained and adaptive data transfer management, and enforcement of site and virtual organization (VO) policies for resource sharing. Lightweight mechanisms will collect monitoring information from data movement tools without putting extra load on the shared resources.


Computational Infrastructure for Financial Technologies

SDAV: SciDAC Scalable Data Management, Analysis, and Visualization Institute

The SciDAC SDAV Institute will actively work with application teams to assist them in achieving breakthrough science and will provide technical solutions in the data management, analysis, and visualization regimes that are broadly applicable in the computational science community. As the scale of computation has exploded, the data produced by these simulations has increased in size, complexity, and richness by orders of magnitude, and this trend will continue. Users of scientific computing systems are faced with the daunting task of managing and analyzing their datasets for knowledge discovery, frequently using antiquated tools more appropriate for the teraflop era. New techniques and tools that address these challenges are available, but application scientists are often unaware of them or unfamiliar with their use, or the tools are not installed at the appropriate facilities. SDAV will deploy, and assist scientists in using, technical solutions addressing challenges in three areas:

• Data Management: infrastructure that captures the data models used in science codes, efficiently moves, indexes, and compresses this data, enables query of scientific datasets, and provides the underpinnings of in situ data analysis
• Data Analysis: application-driven, architecture-aware techniques for performing in situ data analysis, filtering, and reduction to optimize downstream I/O and prepare for in-depth post-processing analysis and visualization
• Data Visualization: exploratory visualization techniques that support understanding ensembles of results, methods of quantifying uncertainty, and identifying and understanding features in multi-scale, multi-physics datasets

FastBit: An Efficient Compressed Bitmap Index Technology

FastBit is an open-source data processing library that provides searching functions supported by compressed bitmap indexes. It treats user data in a column-oriented manner, similar to well-known database management systems such as Sybase IQ, MonetDB, and Vertica. In database systems, an index is a data structure that accelerates data access and reduces query response time. Most commonly used indexes are variants of the B-tree, such as the B+-tree and B*-tree. FastBit instead implements a set of alternative indexes called compressed bitmap indexes, which are the key technology underlying the software. Compared with B-tree variants, these indexes provide very efficient searching and retrieval operations, but are somewhat slower to update after an individual record is modified.
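The idea behind a bitmap index can be illustrated with a minimal sketch. The Python below is not FastBit's API and does not implement compression; the class name `BitmapIndex`, the helper `rows`, and the sample columns are hypothetical and chosen only for illustration. It keeps one bitmap per distinct column value (stored as a Python integer) and answers a multi-column query by combining bitmaps with bitwise operations.

```python
from collections import defaultdict


class BitmapIndex:
    """Uncompressed, equality-encoded bitmap index over one column.

    Illustrative only: one bitmap per distinct value, where bit i is set
    when row i holds that value. Real systems compress these bitmaps.
    """

    def __init__(self, values):
        self.n_rows = len(values)
        self.bitmaps = defaultdict(int)
        for row, value in enumerate(values):
            self.bitmaps[value] |= 1 << row

    def equals(self, value):
        # Bitmap of rows whose value matches exactly (0 if value is absent).
        return self.bitmaps.get(value, 0)

    def in_range(self, low, high):
        # OR together the bitmaps of every distinct value in [low, high].
        result = 0
        for value, bitmap in self.bitmaps.items():
            if low <= value <= high:
                result |= bitmap
        return result


def rows(bitmap):
    """Return the row numbers whose bits are set in the bitmap."""
    out = []
    row = 0
    while bitmap:
        if bitmap & 1:
            out.append(row)
        bitmap >>= 1
        row += 1
    return out


# Two hypothetical columns of the same six-row table.
energy = [3, 7, 3, 9, 7, 1]
detector = ["a", "b", "a", "a", "b", "b"]

energy_idx = BitmapIndex(energy)
detector_idx = BitmapIndex(detector)

# Query: 3 <= energy <= 7 AND detector == "a", answered with one bitwise AND.
hits = energy_idx.in_range(3, 7) & detector_idx.equals("a")
print(rows(hits))  # [0, 2]
```

In this sketch a query touches only the bitmaps of the columns it names, which fits the column-oriented layout described above. Changing the value of a single row means clearing a bit in one bitmap and setting it in another, which gives some intuition for why updates are less convenient than lookups.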

ICEE: International Collaboration Framework for Extreme Scale Experiments

Large-scale scientific exploration in domains such as high-energy physics, fusion, and climate is based on international collaborations. As these collaborations produce more and more data, the existing workflow management systems are hard pressed to keep pace. A necessary solution is to process, analyze, summarize, and reduce the data before it reaches the relatively slow disk storage system, a process known as in-transit processing (or in-flight analysis). We propose to dramatically increase the data handling capability of collaborative workflow systems by leveraging the popular in-transit processing system known as ADIOS and integrating it with FastBit to provide selective data access. These new features will contribute to a new collaborative system named ICEE that aims to significantly improve data flow management for distributed workflows.