August 2, 1999
BERKELEY, CA -- In its first half year of operation, a new database
that identifies clusters of proteins arising from alternative gene
splicing has received more than 35,000 requests from researchers
in genetics and cell and developmental biology around the world.
The Alternative Splicing Data Base, or ASDB, is based at the National
Energy Research Scientific Computing Center (NERSC) in the Department
of Energy's Lawrence Berkeley National Laboratory. It was created
by Inna Dubchak, Igor Dralyuk, and Manfred Zorn of NERSC's Center
for Bioinformatics and Computational Genomics, in collaboration
with M.S. Gelfand of the Institute of Protein Research at the Russian
Academy of Sciences.
"In the first weeks after ASDB went on line in January, requests
for data went from an average of a few dozen per day to hundreds,"
says Dubchak. "One day in May, we got more than 6,000 requests."
The worldwide demand for alternative gene-splicing data is confirmation
that Dubchak and her colleagues have hit upon one of the most exciting
and important problems in contemporary biology -- which suits her
fine: "We want to help biologists solve their hardest problems by
computational methods."
Genes that can be alternatively spliced to produce different proteins
violate what, not long ago, was considered a basic tenet of biology
-- "one gene, one protein." But it is now clear that alternative
splicing plays a crucial role in the development and health of many
organisms.
Several steps lie between a gene -- a sequence of nucleotides on
a strand of DNA -- and the protein for which it codes. Messenger
RNA copies the gene, then carries the information to a ribosome.
The ribosome reads the RNA and cranks out an amino-acid string,
which folds into the functional protein.
In 1977 researchers found that with some genes, after the messenger
RNA leaves the DNA strand and before it is processed by a ribosome,
large chunks of it are edited out. The discarded pieces represent
stretches of the gene (later named introns) that do not code for
amino acids; sequences that actually do code for amino acids are
called exons. Sixteen years after their discovery of these "split"
genes, Richard Roberts and Phillip Sharp shared the 1993 Nobel Prize
in Physiology or Medicine.
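The editing step described above can be pictured with a few lines of Python. This is only an illustrative sketch, not a model of the cell's splicing machinery: the sequence, the exon coordinates, and the function name `splice` are all invented for the example.

```python
def splice(pre_mrna, exons):
    """Join the exon segments of a transcript, discarding what lies between them.

    `exons` is a list of (start, end) index pairs, end exclusive,
    listed in the order they appear in the transcript.
    """
    return "".join(pre_mrna[start:end] for start, end in exons)


# Hypothetical transcript: uppercase marks exons, lowercase marks introns.
pre_mrna = "AUGGCC" + "guaagu" + "UUUGAA" + "guccag" + "CCGUAA"
exons = [(0, 6), (12, 18), (24, 30)]  # assumed coordinates for this toy sequence

mature_mrna = splice(pre_mrna, exons)
print(mature_mrna)
```

Only the uppercase exon segments survive in `mature_mrna`; the lowercase stretches standing in for introns are dropped before the ribosome would read the message.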
Split genes have a remarkable property: their exons can be added
or deleted, giving rise to different proteins from the same gene.
This alternative splicing plays a vital role in most higher organisms;
in the development of the fruit fly, a single split gene spliced
one way eventually produces a female, but spliced another way
produces a male.
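The idea that one gene can yield several products by keeping or skipping exons can be sketched as a simple enumeration. The example below is a simplified illustration under stated assumptions: the exon names, their sequences, the choice of which exons are optional, and the function name `isoforms` are all hypothetical, and real genes also use other splicing patterns that this sketch ignores.

```python
from itertools import product


def isoforms(exons, optional):
    """Yield every transcript obtainable by keeping or skipping optional exons.

    `exons` is an ordered list of (name, sequence) pairs; any exon whose
    name is in `optional` may be skipped, all others are always kept.
    """
    optional_names = [name for name, _ in exons if name in optional]
    for choice in product([True, False], repeat=len(optional_names)):
        keep = dict(zip(optional_names, choice))
        # Exons not listed in `keep` are constitutive and always included.
        yield "".join(seq for name, seq in exons if keep.get(name, True))


# Hypothetical four-exon gene in which the two middle exons are optional.
exons = [("E1", "AUGGCC"), ("E2", "UUUGAA"), ("E3", "GGCACU"), ("E4", "CCGUAA")]
variants = list(isoforms(exons, optional={"E2", "E3"}))

for v in variants:
    print(v)
```

With two optional exons the sketch yields four distinct messages; in this simplified picture, n independently optional exons allow up to 2^n combinations.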
Split genes are also important in generating the numerous "impromptu"
variations of antibodies produced by the human immune system in
response to novel infectious agents. And splicing variations have
been found to result in some cancers as well. Alternative splicing
in humans is not rare -- almost a third of human genes are subject
to it.
Dubchak and her colleagues spent a year and a half assembling the
ASDB, which currently contains some 1,700 protein sequences. It
can be searched to find out how many known proteins can be derived
from a single gene sequence (some can generate up to 64 variations
of messenger RNA!) or to find all known products of alternative
splicing in a given organism, such as the fruit fly, mouse, or human,
or in a particular tissue such as muscle, heart, or brain.
The mechanism of alternative splicing is not well understood, although
the relative concentrations of antagonistic "splicing factors," including
small proteins in the cell nucleus, play an important part. And no one
really knows the origins or evolutionary reasons for the persistence
of introns, although conflicting theories abound.
Understanding the variations in proteins that result from shuffling
exons in split genes may yield answers to these and a host of other
questions. This promise is what drives the ever-increasing use
of the Alternative Splicing Data Base, which can be reached at http://cbcg.nersc.gov/asdb.
Gelfand, Dubchak, Dralyuk, and Zorn give a detailed description
of ASDB in the January 1, 1999, issue of Nucleic Acids Research
(vol. 27, no. 1).
The Berkeley Lab is a U.S. Department of Energy national laboratory
located in Berkeley, California. It conducts unclassified scientific
research and is managed by the University of California. Visit our
website at http://www.lbl.gov.