Arie Shoshani
Head, Scientific Data Management Group
High Performance Computing Research Department
Computational Research Division
Lawrence Berkeley National Laboratory
Tel: (510) 486-5171
Fax: (510) 486-4004
Email: shoshani@lbl.gov
http://www.lbl.gov/~arie
Education
· Ph.D. Computer Sciences,
· M. A. Computer Sciences,
· B. S. (Summa cum Laude), Control
Engineering, Technion -- Israel Institute of Technology,
1965.
Position History
· Senior Staff Computer Scientist,
Lawrence Berkeley National Laboratory (LBNL), Berkeley, California,
1976-present.
· Computer Systems Specialist, System
Development Corporation, Santa Monica, California, 1969-1976.
Awards and Honors
· Keynote speaker, "Efficient Indexing
Technology for Data Mining of Scientific Data," Fifth IEEE
International Conference on Data Mining, November 2005.
· Best paper award, International
Supercomputer Conference,
· Patent, Word-Aligned Hybrid compression
method. US Patent 6,831,575. 2004, with
John Wu and E. Otto.
· Hottest Infrastructure Award, SuperComputing 2000 Network
Challenge: A Data Management Infrastructure for Climate Modeling
Research (a collaboration of several laboratories).
· Elected to the Very Large Data Bases
(VLDB) Endowment Board, 1988-1996.
· Elected Vice-President of the Very Large
Data Bases (VLDB) Endowment Board, 1997-1998.
· Best paper award, ACM-SIGMOD Conference
on Management of Data, 1987.
· Chairman of Steering Committee for the
Scientific and Statistical Data Base Management (SSDBM) Conference,
1982-present.
· General Chairman of the Fourteenth VLDB
Conference, 1988.
· Associate editor for the ACM
Transactions on Database Systems (TODS), 1982 – 1986.
· Keynote speaker, 5th International Conference on
Information and Knowledge Management (CIKM), 1996.
· Invited tutorial speaker, Symposium on
Principles of Database Systems (PODS), 1997.
Research Interests
Semantic
data models, query languages, temporal data, efficient access from tertiary
storage, statistical and OLAP databases, and database techniques for scientific
database applications.
Narrative
I have been the head of the Scientific Data Management Research Group at
LBNL since 1978. Our research focuses on the development of algorithms and
software for the organization, access, and manipulation of scientific
databases (SDBs). Our areas of research fall into three main categories:
logical modeling and user interfaces (which include modeling of SDBs, query
languages, graphical user interfaces, and modeling of temporal, sequence,
and multi-dimensional data); physical organization and access methods
(which include bitmap indexing of scientific data, temporal data
structures, and multi-dimensional data structures); and algorithms for
special SDB operators (such as sampling, transposition, and aggregation).
Currently, I am the director of a Scientific Data Management (SDM)
The Scientific Data Management research group that I head has been very
productive and visible in the research community. Our group has been and
continues to be involved in practical projects (such as the Human Genome,
climate modeling, combustion modeling, High Energy Physics, and others),
and has applied its research results by providing prototype software for
real scientific data management problems.
Our work has established the fields of Statistical Data Management and
Scientific Data Management as important research areas with unique,
challenging problems. We initiated the conferences on Statistical and
Scientific Data Base Management (SSDBM), and I continue to serve as the
chair of the steering committee for this conference. In 1998, a product
developed in my group, called the OPM database tools, was commercialized,
and has been used by biotech and pharmaceutical companies. We continue to
use this product in projects in my group. More recently, a patent was
awarded to two members of my group and me for developing a highly
efficient specialized bitmap indexing method, which is deployed in various
projects.
In addition to management and administrative duties, my own technical work
is mainly in the characterization of the unique requirements of SDBs, query
languages, modeling of statistical data, temporal data, sequence data,
multi-dimensional data, and data compression. More recently, I have been
involved with several scientific projects, including Storage Resource
Management (SRM) for the Grid, a microbial meta-database, distributed
access to climate modeling data, and bitmap indexing and organization of
High Energy Physics data on tertiary storage. I have been and continue to
be involved in many professional activities outside the Laboratory,
including chairing and participating in various program committees. I have
published over 70 papers in refereed journals and conferences.
Selected Publications
·
Elaheh Pourabbas, Arie Shoshani: Efficient estimation of joint
queries from multiple OLAP databases. ACM Trans. Database Syst. 32(1): 2 (2007)
·
Impact
of Admission and Cache Replacement Policies on Response Times of Jobs on Data
Grids, Ekow Otoo, Doron
Rotem and Arie Shoshani, Cluster Computing Journal,
Springer, October 2005, pp. 293-303.
·
RRS:
Replica Registration Service for Data Grids, Arie Shoshani, Alex Sim, Kurt Stockinger, Proceedings of VLDB Workshop on Data Management
in Grids (VLDB-Grids'05), September 2005.
·
Co-Scheduling
of Computation and Data on Computer Clusters, A. Romosan, D. Rotem, A. Shoshani and D. Wright, Proceedings of the
Conference on Scientific and Statistical Database Management (SSDBM 2005).
·
On the performance of bitmap indices
for high cardinality attributes,
Kesheng Wu, Ekow J. Otoo, Arie Shoshani, International conference on Very Large
Data Bases (VLDB 2004) 24-35
·
DataMover: Robust
Terabyte-Scale Multi-file Replication over Wide-Area Networks, A. Sim, J. Gu,
A. Shoshani, V. Natarajan, Scientific and Statistical
Database Management conference (SSDBM 2004), 403-411.
·
Storage Resource Managers: Essential Components for the Grid,
Arie Shoshani, Alexander Sim, and Junmin Gu, chapter in book:
Grid Resource Management: State of the Art and Future Trends,
Edited by Jarek Nabrzyski, Jennifer M. Schopf, Jan Węglarz,
Kluwer Academic Publishers, 2003.
·
Using Bitmap Index for Interactive
Exploration of Large Datasets.
Kesheng Wu, Wendy S. Koegler,
Jacqueline Chen, Arie Shoshani, Scientific and Statistical Database Management
conference (SSDBM 2003), 65-74.
·
A Performance Comparison of bitmap
indexes, Kesheng
Wu, Ekow J. Otoo, Arie
Shoshani, ACM International Conference on Information and Knowledge Management
(CIKM’01), 559-561.
·
Storage Resource Managers: Middleware Components
for Grid Storage, Arie Shoshani, Alex Sim, Junmin Gu,
Nineteenth IEEE Symposium on Mass Storage Systems, 2002
(MSS '02).
·
Extending OLAP Querying to External
Object Databases, T. Pedersen, A. Shoshani, J. Gu, C.S. Jensen,
9th International Conference on Information and Knowledge
Management (CIKM'00).
·
Coordinating Simultaneous Caching of
File Bundles from Tertiary Storage, A. Shoshani, A. Sim, L. M. Bernardo, H.
Nordberg, Scientific and Statistical Database Management conference (SSDBM
2000).
·
Storage Management Techniques for Very
Large Multidimensional Datasets,
A. Shoshani, L. M. Bernardo, H. Nordberg, D. Rotem,
and A. Sim, Eleventh International Conference on
Scientific and Statistical Database Management (SSDBM 1999).
·
Determining the Optimal File Size on Tertiary
Storage Systems Based on the Distribution of Query Sizes, L. Bernardo, H. Nordberg, D. Rotem,
and A. Shoshani, Tenth International Conference on Scientific and Statistical
Database Management, (SSDBM 1998).
·
Summarizability in OLAP and Statistical
Databases, (with H. Lenz),
Ninth International Conference on Scientific and Statistical Database
Management (SSDBM 1997).
·
OLAP and Statistical Databases:
Similarities and Differences,
in Proceedings of the Symposium on Principles of Database Systems (PODS) 1997
(invited tutorial).
·
A Temporal Data Model Based on Time
Sequences, (with A. Segev), book chapter in Temporal Databases: Theory, Design,
and Implementation, Edited by A. Tansel, J. Clifford,
S. Gadia, S. Jajodia, A. Segev, and R. Snodgrass, Benjamin/Cummings, 1993.
·
Representing Extended
Entity-Relationship Structures in Relational Databases: A Modular Approach, (with V. Markowitz), ACM Trans. on Database Systems,
17, 3 (September 1992), pp. 423-464.
·
Logical Modeling of Temporal Data, (with A. Segev),
Proceedings of the International Conference on Management of Data (SIGMOD),
May 1987 (Best Paper Award).