Reprinted from the February 2, 2004 issue of BioInform.
Copyright © 2004 GenomeWeb LLC. All Rights Reserved.
They say you can't go home again, but Victor Markowitz is
hoping to prove that adage wrong. After leaving the data management
research and development group at Lawrence Berkeley National
Laboratory in 1997 to join Gene Logic as CIO, Markowitz is
heading back to the public sector, and hoping to bring a little
bit of hard-won industry know-how with him.
Markowitz is currently in the process of setting up the newly
formed Biological Data Management and Technology Center (BDMTC)
at LBNL, a "virtual" center that will provide
informatics expertise for life science researchers at the
lab, as well as other Bay Area organizations, such as the
DOE's Joint Genome Institute, the University of California
Berkeley, and UC San Francisco. In addition, Markowitz said
the BDMTC is taking steps to become affiliated with the Institute
for Quantitative Biomedical Research (QB3), a multi-campus
effort based at UCSF, with additional facilities at UC Berkeley
and UC Santa Cruz. BDMTC also plans to seek partnerships with
IT companies "with a strong interest in the life sciences,"
Markowitz said.
The goal of the center, according to Markowitz, is to bring
a new level of "professional" data management support
to research groups in Northern California that have "little
experience handling large amounts of data." In addition,
he said, the center will work with academic software developers
to turn their innovative algorithms into "robust, maintainable
tools."
Markowitz said that jumping from the public sector into industry
presented him with the classic grass-is-greener scenario that
typifies the field of bioinformatics: "In industry, there
is a strong focus on developing quality products, but the
price of failure is too high, so it's not a good environment
for research. But academia is the reverse: Because you can
go in all kinds of [research] directions, the practice of
developing robust systems is sometimes lacking. So I posed
the question of whether it's possible to do both."
Last fall, as Gene Logic's focus moved away from software
and database development, and funding agencies such as the
NIH and DOE began pledging support for large-scale informatics
infrastructure projects, Markowitz decided the time was right
to strike out and build what he describes as a "bridge"
between those two worlds.
He negotiated an arrangement in which LBNL's Computational
Research Division will host the center, and will provide access
to the DOE's National Energy Research Scientific Computing
Center as well as available computational resources at the
organizations that the BDMTC supports. Funding will come from
the center's collaborators. Markowitz said his group is already
collaborating with the JGI on a project to build a data resource
for its environmental sequencing program, in which entire
communities of organisms are sequenced at once. JGI director
Edward Rubin told BioInform that the BDMTC's capabilities
are sorely needed. As the institute has ramped up its sequencing
capacity (it now generates about 2 billion bases a month),
"in some ways we've overgrown our capabilities
and need more of an industrial model, both in engineering
as well as in data management," Rubin said.
BDMTC is also included on a grant proposal submitted by UC
Berkeley under the NIH's National Centers for Biomedical Computing
program. If the proposal is accepted, BDMTC would provide
support for data management and software development, and
would account for a quarter of the total award.
BDMTC's operational model is unique within bioinformatics,
and Markowitz said he's working hard to impress upon potential
collaborators that the center "wants to be seen as a
part of what they are doing ... as an extension of their existing
capabilities." Dismissing any comparisons to a public-sector
consulting team as carrying "a negative connotation,"
Markowitz stressed that BDMTC is aiming for a "more symbiotic"
relationship with its partners than consultants generally
provide.
And if it's symbiosis that Markowitz wants, he'll be getting
a healthy dose of it in his first project with the JGI. "One
of the things we're very interested in is managing this new
data set of sequence that comes from sequencing environments,"
Rubin said. Unlike individual organisms, "the identifiers
are much more complex" for communities, and include additional
factors like pH, temperature, and location information. For
the data to be useful, Rubin said, "we're going to need
to query it in ways that don't fit with GenBank and normal
ways of displaying sequence data. So we're looking forward
to working with this center to be able to capture community
sequence data."
JGI already employs around 50 bioinformaticists (about
one-third of its entire staff, Rubin said), but the BDMTC
will provide expertise in large-scale data management that
will free up the JGI informatics staff for research, he said.
Markowitz said he's seeking "highly skilled professionals"
in the areas of software engineering, database modeling, data
warehousing, and other fields "more commonly encountered
in industry" to join the center's staff, which numbers only
a few people right now but will be built out as collaborative
projects increase.