Sifting Through a Trillion Electrons

SDM's Surendra Byna and colleagues from Berkeley Lab’s Computational Research Division teamed up with researchers to develop novel software strategies for storing, mining, and analyzing massive datasets generated by a state-of-the-art plasma physics code called VPIC.

Catching Turbulence in the Solar Wind

Massive datasets plus modelling, visualization and analytics allow researchers to "see" the unseen: the turbulence in solar winds.

Arie Shoshani Earns Lifetime Achievement Award

More than 25 years ago, Arie Shoshani realized that researchers were facing significant challenges in organizing, managing and analyzing their scientific data. He set out to develop computer applications to help them better meet the challenges and created the Scientific Data Management Group in the process.

The Scientific Data Management (SDM) group develops technologies and tools for efficient data access and storage management of massive scientific datasets. We are currently developing storage resource management tools, data querying technologies, in situ feature extraction algorithms, and software platforms for exascale data. The group also works closely with application scientists to address their data processing challenges. These tools and application development activities are backed by active research efforts on novel algorithms for emerging hardware platforms.

Group Leader: John Wu

Visit the Scientific Data Management (SDM) site.

SDM Publications

SLOPE: Structural Locality-aware Programming Model for Composing Array Data Analysis

June 16, 2019

DCA-IO: A Dynamic I/O Control Scheme for Parallel and Distributed File System

May 14, 2019

A new approach to multivariate network traffic analysis

March 30, 2019

Multidimensional Compression with Pattern Matching

March 26, 2019

Evaluating the Effects of Missing Values and Mixed Data Types on Social Sequence Clustering Using t-SNE Visualization

March 6, 2019

Optimizing I/O Performance of HPC Applications with Autotuning

February 28, 2019

Joint Sequence Analysis Challenges: How to Handle Missing Values and Mixed Variable Types

February 26, 2019

Network Traffic Performance Prediction with Multivariate Clusters in Time Windows

February 26, 2019

Parallel membership queries on very large scientific data sets using bitmap indexes

January 28, 2019

Proactive Data Containers (PDC): An object-centric data store for large-scale computing systems

December 13, 2018

Detecting Anomalies in the LCLS Workflow

December 11, 2018

Predicting Network Traffic Using TCP Anomalies

December 11, 2018

Dynamic Online Performance Optimization in Streaming Data Compression

December 10, 2018

ARCHIE: Data Analysis Acceleration with Array Caching in Hierarchical Storage

December 10, 2018

A Year in the Life of a Parallel File System

November 15, 2018

Identification of Network Data Transfer Bottlenecks in HPC Systems

November 14, 2018

Initial Characterization of I/O in Large-Scale Deep Learning Applications

November 13, 2018

SDN for End-to-end Networked Science at the Exascale (SENSE)

November 11, 2018

Automated Parallel Data Processing Engine with Application to Large-Scale Feature Extraction

November 10, 2018

IOMiner: Large-scale Analytics Framework for Gaining Knowledge from I/O Logs

September 10, 2018
