Scientific Data Management Research

Arie Shoshani

Arie Shoshani
Affiliate, Retired Senior Staff Scientist

Dr. Arie Shoshani was a senior staff scientist at Lawrence Berkeley National Laboratory (LBNL), which he joined in 1976, and headed the Scientific Data Management Group. He received his Ph.D. from Princeton University in 1969. From 1969 to 1976, he was a researcher at System Development Corporation, where he worked on the Network Control Program for the ARPAnet, distributed databases, database conversion, and natural language interfaces for data management. His current areas of interest are data models, query languages, temporal data, statistical and scientific database management, and storage management on tertiary storage.

Dr. Shoshani has published over 150 technical papers in refereed journals and conferences; chaired several workshops, conferences, and panels in database management; and served on numerous program committees for various database conferences. He also served as an associate editor for the ACM Transactions on Database Systems. He was elected a member of the VLDB Endowment Board, served as the Publication Board Chairperson for the VLDB Journal, and is currently the Vice-President of the VLDB Endowment.

» Visit Arie Shoshani's personal web page.

Journal Articles

Beytullah Yildiz, Kesheng Wu, Suren Byna, Arie Shoshani, "Parallel membership queries on very large scientific data sets using bitmap indexes", Concurrency and Computation: Practice and Experience, January 1, 2019, 31:e5157,

Many scientific applications produce very large amounts of data as advances in hardware fuel computing and experimental facilities. Managing and analyzing massive quantities of scientific data is challenging as data are often stored in specific formatted files, such as HDF5 and NetCDF, which do not offer appropriate search capabilities. In this research, we investigated a special class of search capability, called membership query, to identify whether queried elements of a set are members of an attribute. Attributes that naturally have classification values appear frequently in scientific domains such as category and object type as well as in daily life such as zip code and occupation. Because classification attribute values are discrete and require random data access, performing a membership query on a large scientific data set creates challenges. We applied bitmap indexing and parallelization to membership queries to overcome these challenges. Bitmap indexing provides high performance not only for low cardinality attributes but also for high cardinality attributes, such as floating‐point variables, electric charge, or momentum in a particle physics data set, due to compression algorithms such as Word‐Aligned Hybrid. We conducted experiments, in a highly parallelized environment, on data obtained from a particle accelerator model and a synthetic data set.
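The membership query described in this abstract can be illustrated with a minimal sketch. It assumes an uncompressed, single-process bitmap index held in Python integers; it is not the paper's implementation, which relies on compression such as Word-Aligned Hybrid and on parallel execution. The function names and the sample attribute values are hypothetical.

```python
from collections import defaultdict


def build_bitmap_index(values):
    """Map each distinct value to an int whose bit i is set when values[i] == value."""
    index = defaultdict(int)
    for row, value in enumerate(values):
        index[value] |= 1 << row
    return dict(index)


def membership_query(index, wanted):
    """Return the sorted row numbers whose attribute value is in `wanted`."""
    hits = 0
    for value in wanted:
        # OR together the bitmaps of the queried values;
        # a value absent from the index matches no rows.
        hits |= index.get(value, 0)
    rows = []
    row = 0
    while hits:
        if hits & 1:
            rows.append(row)
        hits >>= 1
        row += 1
    return rows


# Hypothetical classification attribute (illustrative data only).
particle_type = ["electron", "muon", "photon", "muon", "electron", "pion"]
index = build_bitmap_index(particle_type)
matching_rows = membership_query(index, {"muon", "pion"})
```

The query touches one bitmap per queried value and combines them with bitwise OR, which is why the approach avoids scanning the raw attribute column.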

Deborah A Agarwal, Boris Faybishenko, Vicky L Freedman, Harinarayan Krishnan, Gary Kushner, Carina Lansing, Ellen Porter, Alexandru Romosan, Arie Shoshani, Haruko Wainwright, others, "A science data gateway for environmental management", Concurrency and Computation: Practice and Experience, 2016, 28:1994--2004,

Elaheh Pourabbas, Arie Shoshani, "The Composite Data Model: A Unified Approach for Combining and Querying Multiple Data Models", IEEE Trans. Knowl. Data Eng, 2015, 27(5):1424-1437,

G. F. Lofstead, Q. Liu, J. Logan, Y. Tian, Abbasi, N. Podhorszki, J. Y. Choi, S., R. Tchoua, R. A. Oldfield, others, "Hello ADIOS: The Challenges and Lessons of Leadership Class I/O Frameworks", 2012,

J. Gu, D. Katramatos, X. Liu, V. Natarajan, A. Shoshani, A. Sim, D. Yu, S. Bradley, S. McKee, "StorNet: Integrated Dynamic Storage and Network Resource Provisioning and Management for Automated Data Transfers", Journal of Physics: Conf. Ser., 2011, 331, doi: 10.1088/1742-6596/331/1/012002

Kesheng Wu, Rishi R Sinha, Chad Jones, Stephane Ethier, Scott Klasky, Kwan-Liu Ma, Arie Shoshani, Marianne Winslett, "Finding regions of interest on toroidal meshes", Computational Science & Discovery, 2011, 4:015003,

E. Pourabbas, A. Shoshani, "Improving Estimation Accuracy of Aggregate Queries on Data Cubes", Data & Knowledge Engineering 69 (2010), January 1, 2010, 69:50-72,

D. N. Williams, R. Ananthakrishnan, D. E. Bernholdt, S. Bharathi, D. Brown, M. Chen, A. L. Chervenak, L. Cinquini, R. Drach, I. T. Foster, P. Fox, D. Fraser, J. Garcia, S. Hankin, P. Jones, D. E. Middleton, J. Schwidder, R. Schweitzer, R. Schuler, A. Shoshani, F. Siebenlist, A. Sim, W. G. Strand, M. Su, N. Wilhelmi, "The Earth System Grid: Enabling Access to Multimodel Climate Simulation Data", American Meteorological Society, 2009, 90(2):195-205,

M. Riedel, E. Laure, Th. Soddemann, L. Field, J. P. Navarro, J. Casey, M. Litmaath, J. Ph. Baud, B. Koblitz, C. Catlett, D. Skow, C. Zheng, P. M. Papadopoulos, M. Katz, N. Sharma, O. Smirnova, B. Kónya, P. Arzberger, F. Würthwein, A. S. Rana, T. Martin, M. Wan, V. Welch, T. Rimovsky, S. Newhouse, A. Vanni, Y. Tanaka, Y. Tanimura, T. Ikegami, D. Abramson, C. Enticott, G. Jenkins, R. Pordes, N. Sharma, S. Timm, N. Sharma, G. Moont, M. Aggarwal, D. Colling, O. van der Aa, A. Sim, V. Natarajan, A. Shoshani, J. Gu, S. Chen, G. Galang, R. Zappi, L. Magnoni, V. Ciaschini, M. Pace, V. Venturi, M. Marzolla, P. Andreetto, B. Cowles, S. Wang, Y. Saeki, H. Sato, S. Matsuoka, P. Uthayopas, S. Sriprayoonsakul, O. Koeroo, M. Viljoen, L. Pearlman, S. Pickles, David Wallom, G. Moloney, J. Lauret, J. Marsteller, P. Sheldon, S. Pathak, S. De Witt, J. Mencák, J. Jensen, M. Hodges, D. Ross, S. Phatanapherom, G. Netzer, A. R. Gregersen, M. Jones, S. Chen, P. Kacsuk, A. Streit, D. Mallmann, F. Wolf, T. Lippert, Th. Delaitre, E. Huedo, N. Geddes, "Interoperation of world-wide production e-Science infrastructures", Concurrency and Computation: Practice and Experience, 2009, 21(8):961-990,

John Shalf and Jason Hick (Arie Shoshani and Doron Rotem), "Storage Technology Fundamentals", Scientific Data Management: Challenges, Technology, and Deployment, Volume . Chapman & Hall/CRC, 2009,

P. Jakl, J. Lauret, A. Hanushevsky, A. Shoshani, A. Sim, J. Gu, "Grid data access on widely distributed worker nodes using scalla and SRM", Journal of Physics: Conf. Ser., 2008, 119, doi: 10.1088/1742-6596/119/7/072019

C S Chang, S Klasky, J Cummings, R. Samtaney, A Shoshani, L Sugiyama, D Keyes, S Ku, G Park, S Parker, N Podhorszki, H. Strauss, H Abbasi, M Adams, R Barreto, G Bateman, K Bennett, Y Chen, E D’Azevedo, C Docan, S Ethier, E Feibush, L Greengard, T Hahm, F Hinton, C Jin, A. Khan, A Kritz, P Krsti, T Lao, W Lee, Z Lin, J Lofstead, P Mouallem, M Nagappan, A Pankin, M Parashar, M Pindzola, C Reinhold, D Schultz, K Schwan, D. Silver, A Sim, D Stotler, M Vouk, M Wolf, H Weitzner, P Worley, Y Xiao, E Yoon, D Zorin, "Toward a first-principles integrated simulation of tokamak edge plasmas", Journal of Physics: Conf. Ser., 2008, 125, doi: 10.1088/1742-6596/125/1/012042

R Ananthakrishnan, D E Bernholdt, S Bharathi, D Brown, M Chen, A L Chervenak, L Cinquini, R Drach, I T Foster, P Fox, D Fraser, K Halliday, S Hankin, P Jones, C Kesselman, D E Middleton, J Schwidder, R Schweitzer, R Schuler, A Shoshani, F Siebenlist, A Sim, W G Strand, N Wilhelmi, M Su, D N Williams, "Building a global federation system for climate change research: the earth system grid center for enabling technologies (ESG-CET)", Journal of Physics: Conf. Ser., 2008, 78, doi: 10.1088/1742-6596/78/1/012050

F. Donno, L. Abadie, P. Badino, J. Baud, E. Corso, M. Crawford, S. De Witt, A. Forti, P. Fuhrmann, G. Grosdidier, J. Gu, J. Jensen, S. Lemaitre, M. Litmaath, D. Litvinsev, G. Lo Presti, L. Magnoni, T. Mkrtchan, A. Moibenko, V. Natarajan, G. Oleynik, T. Perelmutov, D. Petravick, A. Shoshani, A. Sim, M. Sponza, R. Zappi, "Storage Resource Manager version 2.2: design, implementation, and testing experience", Journal of Physics: Conf. Ser., 2007, 119, doi: 10.1088/1742-6596/119/6/062028

Elaheh Pourabbas, Arie Shoshani, "Efficient Estimation of Joint Queries from Multiple OLAP Databases", ACM Transactions on Database Systems (TODS), March 1, 2007, Volume 3,

Kesheng Wu, Ekow J Otoo, Arie Shoshani, "Optimizing bitmap indices with efficient compression", ACM Transactions on Database Systems (TODS), 2006, 31:1--38,

D. Bernholdt, S. Bharathi, D. Brown, K. Chanchio, M. Chen, A. Chervenak, L. Cinquini, B. Drach, I. Foster, P. Fox, J. Garcia, C. Kesselman, R. Markel, D. Middleton, V. Nefedova, L. Pouchard, A. Shoshani, A. Sim, G. Strand, D. Williams, "The Earth System Grid: Supporting the Next Generation of Climate Modeling Research", IEEE, 2005, 93(3):485-495,

Ann L. Chervenak, Ewa Deelman, Carl Kesselman, William E. Allcock, Ian T. Foster, Veronika Nefedova, Jason Lee, Alex Sim, Arie Shoshani, Bob Drach, Dean Williams, Don Middleton, "High-performance remote access to climate simulation data: a challenge problem for data grid technologies", Parallel Computing, 2003, 29(10):1335-1356,

Elaheh Pourabbas, Arie Shoshani, "Joint Queries Estimation from Multiple OLAP Databases", International Conference on Scientific and Statistical Database Management, 2002 (SSDBM’02), July 24, 2002,

A. Sim, H. Nordberg, L.M. Bernardo, A. Shoshani, D. Rotem, "Experience with using CORBA to implement a file caching coordination system", Concurrency and Computation: Practice and Experience, 2001, 13:1-15,

L Bernardo, H Nordberg, D Olson, A Shoshani, A Sim, A Vaniachine, D Zimmerman, B Gibbard, R Porter, T Wenaus, others, "New capabilities in the HENP grand challenge storage access system and its application at RHIC", Computer physics communications, 2001, 140:179--188,

Conference Papers

Spyros Blanas, Kesheng Wu, Surendra Byna, Bin Dong, Arie Shoshani, "Parallel Data Analysis Directly on Scientific File Formats", SIGMOD 14, 2014, 385--396, doi: 10.1145/2588555.2612185

DP Schissel, Gheni Abla, SM Flanagan, M Greenwald, X Lee, A Romosan, A Shoshani, J Stillerman, J Wright, "Automated metadata, provenance cataloging and navigable interfaces: Ensuring the usefulness of extreme-scale data", Fusion Engineering and Design, North-Holland, 2014,

Qian Sun, Fan Zhang, Tong Jin, Hoang Bui, Kesheng Wu, Arie Shoshani, Hemanth Kolla, Scott Klasky, Jacqueline Chen, Manish Parashar, "Scalable run-time data indexing and querying for scientific simulations", Big Data Analytics: Challenges and Opportunities (BDAC-14) Workshop at Supercomputing Conference, 2014,

Spyros Blanas, Kesheng Wu, Surendra Byna, Bin Dong, Arie Shoshani, "Parallel data analysis directly on scientific file formats", Proceedings of the 2014 ACM SIGMOD international conference on Management of data, January 1, 2014, 385--396,

Alex Romosan, Arie Shoshani, Kesheng Wu, Victor Markowitz, Kostas Mavrommatis, "Accelerating gene context analysis using bitmaps", Proceedings of the 25th International Conference on Scientific and Statistical Database Management, 2013, 1--12, LBNL 6397E,

Surendra Byna, Jerry Chou, Oliver Rubel, Homa Karimabadi, William S Daughter, Vadim Roytershteyn, E Wes Bethel, Mark Howison, Ke-Jou Hsu, Kuan-Wu Lin, others, "Parallel I/O, analysis, and visualization of a trillion particle simulation", SC 12: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, January 2012, 1--12,

Elaheh Pourabbas, Arie Shoshani, Kesheng Wu, "Minimizing index size by reordering rows and columns", International Conference on Scientific and Statistical Database Management, January 2012, 467--484,

Benson Ma, Arie Shoshani, Alex Sim, Kesheng, Yong-Ik Byun, Jaegyoon Hahm, Min-Su Shin, "Efficient Attribute-Based Data Access in Astronomy", The 2nd International Workshop on Network-Aware Data Workshop (NDM2012), 2012, 562--571,

Karen L. Schuchardt, Deborah A. Agarwal, Stefan A. Finsterle, Carl W. Gable, Ian Gorton, Luke J. Gosink, Elizabeth H. Keating, Carina S. Lansing, Joerg Meyer, William A.M. Moeglein, George S.H. Pau, Ellen A. Porter, Sumit Purohit, Mark L. Rockhold, Arie Shoshani, Chandrika Sivaramakrishnan, "Akuna-Integrated Toolsets Supporting Advanced Subsurface Flow and Transport Simulations for Environmental Management", XIX International Conference on Computational Methods in Water Resources (CMWR 2012), University of Illinois at Urbana-Champaign, June 17-22, 2012,

A. Shoshani, I. Altintas, J. Chen, G. Chin, A. Choudhary, D. Crawl, T. Critchlow, K. Gao, B. Grimm, H. Iyer, C. Kamath, A. Khan, S. Klasky, S. Koehler, S. Lang, R. Latham, J. W. Li, W. Liao, J. Ligon, Q. Liu, B. Ludaescher, P. Mouallem, M. Nagappan, N. Podhorszki, R. Ross, D. Rotem, N. Samatova, C. Silva, A. Sim, R. Tchoua, R. Thakur, M. Vouk, K. Wu, W. Yu, "The Scientific Data Management Center: Available Technologies and Highlights", SciDAC Conference, 2011,

Junmin Gu, Dimitrios Katramatos, Xin Liu, Vijaya Natarajan, Arie Shoshani, Alex Sim, Dantong Yu, Scott Bradley, Shawn McKee, "StorNet: Co-Scheduling of End-to-End Bandwidth Reservation on Storage and Network Systems for High Performance Data Transfers", IEEE INFOCOM HSN 2011, 2011,

Jerry Chou, Mark Howison, Brian Austin, Kesheng Wu, Ji Qiang, E Wes Bethel, Arie Shoshani, Oliver Rübel, Rob D Ryne, "Parallel index and query for large scale data analysis", Proceedings of 2011 international conference for high performance computing, networking, storage and analysis, 2011, 1--11, LBNL 5317E,

Kesheng Wu, Surendra Byna, Doron Rotem, Arie, "Scientific Data Services -- A High-Performance I/O with Array Semantics", HPCDB, IEEE, 2011, doi: 10.1145/2125636.2125640

J. Chou, K. Wu, O. Rübel, M. Howison, Qiang, Prabhat, B. Austin, E. W. Bethel, D. Ryne, A. Shoshani, "Parallel Index and Query for Large Scale Data", SC11, 2011, doi: 10.1145/2063384.2063424

Jinoh Kim, Hasan Abbasi, Luis Chacón, Docan, Scott Klasky, Qing Liu, Norbert, Arie Shoshani, Kesheng Wu, "Parallel In Situ Indexing for Data-intensive", LDAV, 2011, 65--72, doi: 10.1109/LDAV.2011.6092319

Dean N. Williams, Ian T. Foster, Don E. Middleton, Rachana Ananthakrishnan, Neill Miller, Mehmet Balman, Junmin Gu, Vijaya Natarajan, Arie Shoshani, Alex Sim, Gavin Bell, Robert Drach, Michael Ganzberger, Jim Ahrens, Phil Jones, Daniel Crichton, Luca Cinquini, David Brown, Danielle Harper, Nathan Hook, Eric Nienhouse, Gary Strand, Hannah Wilcox, Nathan Wilhelmi, Stephan Zednik, Steve Hankin, Roland Schweitzer, John Harney, Ross Miller, Galen Shipman, Feiyi Wang, Peter Fox, Patrick West, Stephan Zednik, Ann Chervenak, Craig Ward, "Earth System Grid Center for Enabling Technologies (ESG-CET): A Data Infrastructure for Data-Intensive Climate Research", SciDAC Conference, 2011,

Alex Sim, Mehmet Balman, Dean N. Williams, Arie Shoshani, Vijaya Natarajan, "Adaptive Transfer Adjustment in Efficient Bulk Data Transfer Management for Climate Datasets", The 22nd IASTED International Conference on Parallel and Distributed Computing and System, Marina Del Rey, CA, November 20, 2010, LBNL 3985E,

Many scientific applications and experiments, such as high energy and nuclear physics, astrophysics, climate observation and modeling, combustion, nano-scale material sciences, and computational biology, generate extreme volumes of data with a large number of files. These data sources are distributed among national and international data repositories, and are shared by large numbers of geographically distributed scientists. A large portion of the data is frequently accessed, and a large volume of data is moved from one place to another for analysis and storage. A challenging issue in such efforts is the limited network capacity for moving large datasets. A tool that addresses this challenge is the Bulk Data Mover (BDM), a data transfer management tool used in the Earth System Grid (ESG) community. It has been managing massive dataset transfers efficiently in the environment where the network bandwidth is limited. Adaptive transfer adjustment was studied to enhance the BDM to handle significant end-to-end performance changes in the dynamic network environments as well as to control the data transfers for the desired transfer performance. We describe the results from our hands-on data transfer management experience in the climate research community. We study a practical transfer estimation model and state our initial results from the adaptive transfer adjustment methodology. 

Mehmet Balman, Evangelos Chaniotakis, Arie Shoshani, Alex Sim, "A Flexible Reservation Algorithm for Advance Network Provisioning", ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (SC'10), New Orleans, LA, IEEE Computer Society Washington, DC, USA ISBN: 978-1-4244-7559-, November 14, 2010, LBNL 4017E, doi: 10.1109/SC.2010.4

Many scientific applications need support from a communication infrastructure that provides predictable performance, which requires effective algorithms for bandwidth reservations. Network reservation systems, such as ESnet’s OSCARS, establish guaranteed bandwidth of secure virtual circuits for a certain bandwidth and length of time. However, users currently cannot inquire about bandwidth availability, nor have alternative suggestions when reservation requests fail. In general, the number of reservation options is exponential with the number of nodes n, and current reservation commitments. We present a novel approach for path finding in time-dependent networks taking advantage of user-provided parameters of total volume and time constraints, which produces options for earliest completion and shortest duration. The theoretical complexity is only O(n^2 r^2) in the worst-case, where r is the number of reservations in the desired time interval. We have implemented our algorithm and developed efficient methodologies for incorporation into network reservation frameworks. Performance measurements confirm the theoretical predictions.
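To make the "earliest completion" option concrete, the following is a heavily simplified sketch restricted to a single link. It is not the paper's path-finding algorithm and does not reproduce its complexity bound. It assumes reservations are given as hypothetical (start, end, bandwidth) tuples, that the link has a fixed capacity, and that a new transfer may use all bandwidth left over in each time window.

```python
def earliest_completion(capacity, reservations, volume, start, deadline):
    """Return the earliest time at which `volume` can be moved over one link,
    or None if it cannot finish by `deadline`.

    reservations: list of (r_start, r_end, bandwidth) commitments (assumed format).
    """
    if volume <= 0:
        return start

    # Available bandwidth can only change where a reservation starts or ends.
    points = {start, deadline}
    for r_start, r_end, _ in reservations:
        for t in (r_start, r_end):
            if start < t < deadline:
                points.add(t)
    times = sorted(points)

    remaining = volume
    for t0, t1 in zip(times, times[1:]):
        # Bandwidth already committed during the window [t0, t1].
        used = sum(bw for r_start, r_end, bw in reservations
                   if r_start < t1 and r_end > t0)
        free = max(capacity - used, 0)
        if free * (t1 - t0) >= remaining:
            # The transfer finishes inside this window (free > 0 here).
            return t0 + remaining / free
        remaining -= free * (t1 - t0)
    return None


# Illustrative numbers only: capacity 10, two existing reservations.
existing = [(0, 4, 6), (2, 6, 4)]
finish = earliest_completion(10, existing, volume=30, start=0, deadline=10)
```

Splitting the interval at reservation boundaries keeps the number of windows linear in the number of reservations, which hints at why the number of reservations r in the interval appears in the stated complexity.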

Julian Cummings, Jay Lofstead, Karsten Schwan, Alexander Sim, Arie Shoshani, Ciprian Docan, Manish Parashar, Scott Klasky, Norbert Podhorszki, Roselyne Barreto, "EFFIS: An End-to-end Framework for Fusion Integrated Simulation", 18th Euromicro Conference on Parallel, Distributed and Network-based Processing, 2010,

Kesheng Wu, Arie Shoshani, Kurt Stockinger, "Analyses of multi-level and multi-component compressed indexes", ACM Transactions on Database Systems, ACM, 2010, 35:1--52, doi: 10.1145/1670243.1670245

K Wu, S Ahern, EW Bethel, J Chen, H Childs, C Geddes, J Gu, H Hagen, B Hamann, J Lauret, others, "FastBit: Interactively Searching Massive Data", Proc. of SciDAC 2009, 2009, LBNL 2164E,

D. N. Williams, R. Ananthakrishnan, D. E. Bernholdt, S. Bharathi, D. Brown, M. Chen, A. L. Chervenak, L. Cinquini, R. Drach, I. T. Foster, P. Fox, S. Hankin, V. E. Henson, P. Jones, D. E. Middleton, J. Schwidder, R. Schweitzer, R. Schuler, A Shoshani, F. Siebenlist, A. Sim, W. G. Strand, N. Wilhelmi, M. Su, "Data Management and Analysis for the Earth System Grid", SciDAC Conference, 2008,

Rishi Rakesh Sinha, Marianne Winslett, Kesheng, Kurt Stockinger, Arie Shoshani, "Adaptive Bitmap Indexes for Space-Constrained", ICDE 2008, 2008, 1418--1420,

Kesheng Wu, Kurt Stockinger, Arie Shoshani, "Breaking the curse of cardinality on bitmap indexes", International Conference on Scientific and Statistical Database Management, 2008, 348--365,

Meiyappan Nagappan, Mladen A. Vouk, Kesheng Wu, Alex Sim, Arie Shoshani, "Efficient Operational Profiling of Systems Using Arrays on Execution Logs", ISSRE, 2008, 313--314, doi: 10.1109/ISSRE.2008.45

L. Abadie, P. Badino, J. Baud, E. Corso, M. Crawford, S. De Witt, F. Donno, A. Forti, P. Fuhrmann, G. Grosdidier, J. Gu, J. Jensen, S. Lemaitre, M. Litmaath, D. Litvinsev, G. Lo Presti, L. Magnoni, T. Mkrtchan, A. Moibenko, V. Natarajan, G. Oleynik, T. Perelmutov, D. Petravick, A. Shoshani, A. Sim, M. Sponza, R. Zappi, "Storage Resource Managers: Recent International Experience on Requirements and Multiple Co-Operating Implementations", the 24th IEEE Conference on Mass Storage Systems and Technologies, 2007,

Frederick Reiss, Kurt Stockinger, Kesheng Wu, Shoshani, Joseph M. Hellerstein, "Enabling Real-Time Querying of Live and Historical Data", SSDBM 2007, 2007,

Elaheh Pourabbas, Arie Shoshani, "The Composite OLAP-Object Data Model: Removing an Unnecessary Barrier", International Conference on Scientific and Statistical Database Management (SSDBM) 2006, July 3, 2006, 291-300,

D. E. Middleton, D. E. Bernholdt, D. Brown, M. Chen, A. L. Chervenak, L. Cinquini, R. Drach, P. Fox, P. Jones, C. Kesselman, I. T. Foster, V. Nefedova, A. Shoshani, A. Sim, W. G. Strand, D. Williams, "Enabling worldwide access to climate simulation data: the earth system grid (ESG)", SciDAC Conference, 2006,

P. Jakl, J. Lauret, A. Hanushevsky, A. Shoshani, A. Sim, "From rootd to Xrootd, from physical to logical files: experience on accessing and managing distributed data", Computing in High Energy Physics (CHEP), 2006,

E. Hjort, L. Hajdu, J. Lauret, D. Olson, A. Sim, A. Shoshani, "Data and Computational Grid Coupling in RHIC/STAR – An Analysis Scenario using SRM Technology", Computing in High Energy Physics (CHEP), 2006,

F. Reiss, K. Stockinger, K. Wu, A. Shoshani, J. M. Hellerstein, "Efficient analysis of live and historical streaming and its application to cybersecurity", 2006,

A. Shoshani, A. Sim, K. Stockinger, "RRS: Replica Registration Service for Data Grids", International Workshop on Data Management in Grids, 2005,

Kesheng Wu, Junmin Gu, Jerome Lauret, Arthur Poskanzer, Arie Shoshani, Alexander Sim, Zhang, "Grid Collector: Facilitating Efficient Selective from Data Grids", International Supercomputer Conference 2005, 2005,

Eric Hjort, Doug Olson, Jerome Lauret, Arie Shoshani, Alex Sim, "Production mode Data-Replication framework in STAR using the HRM Grid middleware", Computing in High Energy Physics, 2004,

Alex Sim, Junmin Gu, Arie Shoshani, Vijaya Natarajan, "DataMover: Robust Terabytes-Scale Multi-file Replication over Wide-Area Networks", the 16th International Conference on Scientific and Statistical Database Management (SSDBM 2004), 2004,

Kesheng Wu, Wei-Ming Zhang, Victor, Jerome Lauret, Arie Shoshani, "The Grid Collector: Using an Event Catalog to Speed up Analysis in Distributed Environment", Proceedings of Computing in High Energy and Nuclear (CHEP) 2004, 2004,

Elaheh Pourabbas, Arie Shoshani, "Answering Joint Queries from Multiple Aggregate OLAP Databases", Data Warehousing and Knowledge Discovery, 5th International Conference, DaWaK 2003, September 3, 2003, 24-34,

A. Sim, J. Gu, A. Shoshani, E. Hjort, D. Olson, "Experience with Deploying Storage Resource Managers to Achieve Robust File Replication", Computing in High Energy Physics, 2003,

D. Yu, J. Lauret, A. Shoshani, D. Olson, E. Hjort, A. Sim, "The Design of High Performance Data Replication in the Grid Environment for the STAR Collaboration", Computing in High Energy Physics, 2003,

L. Pouchard, L. Cinquini, B. Drach, D. Middleton, D. Bernholdt, K. Chanchio, I. Foster, V. Nefedova, D. Brown, P. Fox, J. Garcia, G. Strand, D. Williams, A. Chervenak, C. Kesselman, A. Shoshani, A. Sim, "An Ontology for Scientific Information in a Grid Environment: the Earth System Grid", the Symposium on Cluster Computing and the Grid (CCGrid), 2003,

Kesheng Wu, Wei-Ming Zhang, Alexander Sim, Gu, Arie Shoshani, "Grid Collector: An Event Catalog With Automated File", Proceedings of IEEE Nuclear Science Symposium 2003, 2003, doi: 10.1109/NSSMIC.2003.1351830

A. Shoshani, A. Sim, J. Gu, "Storage Resource Managers: Middleware components for Grid Storage", the 19th IEEE Symposium on Mass Storage Systems, 2002,

B. Allcock, I. Foster, V. Nefedova, A. Chervenak, E. Deelman, C. Kesselman, J. Lee, A. Sim, A. Shoshani, B. Drach, D. Williams, "High-Performance Remote Access to Climate Simulation Data: A Challenge Problem for Data Grid Technologies", Super Computing 2001, 2001,

D. Olson, E. Hjort, J. Lauret, M. Messer, A. Shoshani, A. Sim, "Non-shared Disk Cluster - A Fault Tolerant, Commodity Approach to Hi-Bandwidth Data Analysis", Computing in High Energy Physics, 2001,

A. Shoshani, A. Sim, L.M. Bernardo, H. Nordberg, "Coordinating Simultaneous Caching of File Bundles from Tertiary Storage", International Conference on Scientific and Statistical Database Management (SSDBM), 2000,

L. M. Bernardo, B. Gibbard, D. Malon, H. Nordberg, D. Olson, R. Porter, A. Shoshani, A. Sim, A. Vaniachine, T. Wenaus, K. Wu, D. Zimmerman, "New Capabilities in the HENP Grand Challenge Storage Access System and its Application at RHIC", Computing in High Energy Physics, 2000,

L. M. Bernardo, A. Shoshani, A. Sim, H. Nordberg, "Access Coordination Of Tertiary Storage For High Energy Physics Applications", the 17th IEEE Symposium on Mass Storage Systems, 2000,

A. Sim, H. Nordberg, L. M. Bernardo, A. Shoshani, D. Rotem, "Storage Access Coordination Using CORBA", Distributed Objects and Application, 1999, 168-175,

A. Shoshani, L.M. Bernardo, H. Nordberg, D. Rotem and A. Sim, "Multidimensional Indexing and Query Coordination for Tertiary Storage Management", International Conference on Scientific and Statistical Database Management, 1999, 214-225,

A. Shoshani, L.M. Bernardo, H. Nordberg, D. Rotem, A. Sim, "Storage Management for High Energy Physics Applications", Computing in High Energy Physics, 1998,

Books

Scientific Data Management: Challenges, Technology, and Deployment, edited by Arie Shoshani and Doron Rotem, (Chapman & Hall/CRC Computational Science: December 2009)

Book Chapters

A. Sim, D. Gunter, V. Natarajan, A. Shoshani, D. Williams, J. Long, J. Hick, J. Lee, E. Dart, "Efficient Bulk Data Replication for the Earth System Grid", Data Driven E-science: Use Cases and Successful Applications of Distributed Computing Infrastructures (ISGC 2010), (Springer-Verlag New York Inc: 2010) Pages: 435

Arie Shoshani, Flavia Donno, Junmin Gu, Jason Hick, Maarten Litmaath, Alex Sim, "Dynamic Storage Management", Scientific Data Management: Challenges, Technology, and Deployment, edited by Arie Shoshani, Doron Rotem, (Chapman & Hall/CRC Computational Science: 2009)

Kurt Stockinger, John Cieslewicz, Kesheng Wu, Rotem, Arie Shoshani, "Using Bitmap Indexing Technology for Combined and Text Queries", Annals of Information Systems, (Springer: 2008) Pages: 1--23

A. Shoshani, A. Sim, K. Stockinger, "RRS: Replica Registration Service for Data Grids", Lecture Notes in Computer Science, edited by Jean-Marc Pierson, (Springer-Verlag GmbH Publisher: 2006) Pages: 100-112

Arie Shoshani, Alexander Sim, Junmin Gu, "Storage Resource Managers: Essential Components for the Grid", Grid Resource Management: State of the Art and Future Trends, edited by Jarek Nabrzyski, Jennifer M. Schopf, Jan Weglarz, (Kluwer Academic Publishers: 2003)

Presentation/Talks

Arie Shoshani, Alex Sim, Junmin Gu, Storage Resource Managers: Essential Components for Grid Applications, Globus World, 2003,

A. Sim, A. Shoshani, HRM: Hierarchical Resource Manager, Globus World, 2000,

A. Sim, A. Shoshani, L. M. Bernardo, H. Nordberg, A Storage Access Coordination System for Petabyte Scale Scientific Data, IONA World, 2000,

Reports

D. Yu, D. Katramatos, A. Shoshani, A. Sim, J. Gu, V. Natarajan, "StorNet: Integrating Storage Resource Management with Dynamic Network Provisioning for Automated Data Transfer", International Committee for Future Accelerators (ICFA) Standing Committee on Inter-Regional Connectivity (SCIC) 2012 Report: Networking for High Energy Physics, 2012,

M. Balman, E. Chaniotakis, A. Shoshani, A. Sim, "A New Approach in Advance Network Reservation and Provisioning for High-Performance Scientific Data Transfers", 2010, LBNL 4091E,

K. Wu, K. Stockinger, A. Shoshani, Wes, "FastBit--Helps Finding the Proverbial Needle in a", 2006, LBNL LBNL-PUB/963,

Arie Shoshani, Alex Sim, Kurt Stockinger, "Replica Registration Service Functional Interface Specification 1.0", 2005, LBNL 57520,

Kesheng Wu, Ekow J Otoo, Arie Shoshani, "An efficient compression scheme for bitmap indices", 2004,

Kesheng Wu, Wei-Ming Zhang, Alexander Sim, Junmin Gu, Arie Shoshani, "Grid collector: An event catalog with automated file management", 2003 IEEE Nuclear Science Symposium. Conference Record (IEEE Cat. No. 03CH37515), 2003, LBNL 55563,

L.M. Bernardo, D. Rotem, A. Shoshani, H. Nordberg, A. Sim, "Using Access Patterns to Partition Large Datasets on Tertiary Storage in Order to Minimize Retrieval Costs", 1998, LBNL 41504,

Posters

Xiaocheng (Chris) Zou, Suren Byna, Hans Johansen, Daniel Martin, Nagiza F. Samatova, Arie Shoshani, John Wu, "Six-fold Speedup of Ice Calving Detection Achieved by AMR-aware Parallel Connected Component Labeling", SciDAC PI Meeting, July 2015, 2015,

Others

US Patent 8,705,342 B2. “Co-scheduling of network resource provisioning and host-to-host bandwidth reservation on high-performance network and storage systems”, D. Yu, D. Katramatos, A. Sim, and A. Shoshani, Apr. 22, 2014, LBNL IB-3152, BNL BSA 11-02.

John C Wright, Martin Greenwald, Joshua Stillerman, Gheni Abla, Bobby Chanthavong, Sean Flanagan, David Schissel, Xia Lee, Alex Romosan, Arie Shoshani, The MPO API: A tool for recording scientific workflows, Fusion Engineering and Design, 2014,

A. Sim, A. Shoshani, F. Donno, J. Jensen, Storage Resource Manager Interface Specification V2.2 Implementations Experience Report, Open Grid Forum, GFD.154, 2009,

Alex Sim, Arie Shoshani (Editors), Paolo Badino, Olof Barring, Jean‐Philippe Baud, Ezio Corso, Shaun De Witt, Flavia Donno, Junmin Gu, Michael Haddox‐Schatz, Bryan Hess, Jens Jensen, Andy Kowalski, Maarten Litmaath, Luca Magnoni, Timur Perelmutov, Don Petravick, Chip Watson, The Storage Resource Manager Interface Specification Version 2.2, Open Grid Forum, Document in Full Recommendation, GFD.129, 2008,

Kesheng Wu, Kurt Stockinger, Arie Shoshani, Performance of Multi-Level and Multi-Component Bitmap Indexes, 2007, doi: 10.1145/1670243.1670245

K. Wu, A. Shoshani, E. J. Otoo, Word aligned bitmap compression method, data and apparatus, US Patent 6,831,575, 2004,

Kurt Stockinger, Kesheng Wu, Arie Shoshani, Evaluation Strategies for Bitmap Indices with, International Conference on Database and Expert Applications (DEXA 2004), Zaragoza, Spain, 2004,

Kesheng Wu, Ekow Otoo, Arie Shoshani, Compressing Bitmap Indexes for Faster Search, Proceedings of SSDBM 02, Pages: 99--108 2002,

Kurt Stockinger, Kesheng Wu, Arie Shoshani, Strategies for processing ad hoc queries on large data, Proceedings of DOLAP 02, Pages: 72--79 2002,

Kesheng Wu, Ekow J Otoo, Arie Shoshani, A performance comparison of bitmap indexes, Proceedings of the tenth international conference on Information and knowledge management, Pages: 559--561 2001,