SAN DIEGO -- Picture a machine so awkwardly designed that almost every software execution leads to a race condition; that survives intense magnetic fields and mild radiation but crashes when you expose it to excessive UV light; that incorporates bugs into its own design and devotes 90 percent of its hard-drive memory to self-replicating white noise.
Still trying to picture it? Here's a hint: Go look in the mirror.
Such was the opening analogy in Jim Kent's Thursday morning keynote address at the 2002 O'Reilly Open Source Convention. Setting aside for a moment the usual talk about DNA base-pair sequencing and protein folding, Kent decided to reframe his discussion in a language that most of the software-savvy convention attendees could understand.
"Bioinformatics is about unraveling 3 billion bases of the world's worst spaghetti code," said Kent, research scientist at the University of California, Santa Cruz's Computational Biology Group. "It's about deciphering a software program that was never designed, a software program where all symbols are global, a software program whose compile time is 18 years."
The speech, the second in a split bill featuring Ewan Birney of the European Bioinformatics Institute, attracted a heavy turnout, a testament to the growing buzz surrounding bioinformatics, a rapidly emerging field at the intersection of biotechnology and computer science. In a scene that seemed to sum up the current contrast between bioinformatics and the rest of the high-tech industry, both Birney and Kent were mobbed by job-seeking programmers after Birney revealed in the opening keynote that both he and almost every other project leader were hiring.
"I've got four slots open, and we've got better beer over in the U.K.," said Birney.
Birney's willingness to dangle jobs in front of the OSCON audience was itself a sign of bioinformatics researchers' high regard for open source technologies. Birney described his ENSEMBL project, a joint effort to develop software that sorts through and annotates eukaryotic genome data, as "open source to the core" and described himself as a "Perl addict." Beyond that, Birney said, the chief advantage of open source software in the bioinformatics sphere is the ethical overlap between peer-reviewed software and peer-reviewed research.
"For us, it's straight scientific principles," Birney said. "If you want to be a scientist, open up your data and open up the code that helps you work with that data."
Such comments drew loud applause from the gathered audience. The other remark to earn a similar response was Kent's caution regarding genomic manipulation. Depicting the human genome as a vast terra incognita that scientists were only beginning to explore, Kent said it was foolhardy at best to think that humans could in any way improve on 3 billion years' worth of trial-and-error research. "I actually believe we should know what we're doing before we start tweaking the genome," Kent said.
Kent, a researcher described as "legendary" by fellow keynoter Birney, first earned fame in the spring of 2000 for writing the program GigAssembler. Faced with the risk that Celera Genomics, a private biotech company, might scoop the publicly funded human genome project in announcing the full sequence of the human genome, Kent hastily wrote GigAssembler in the space of four weeks. By assembling the disparate genome fragments into a single genome sequence and publishing the assembly algorithm in the journal Genome Research, Kent offered researchers a way to access the entire genome for free, as opposed to paying $1 million for licensed access to Celera's proprietary human genome database.
"I always say that bioinformatics is sort of like the system administrator for biotechnology," Kent said.
Expanding the analogy, Kent likened the current demand for bioinformatics programmers to the demand for Web programmers in 1995. Revealing his cautionary side, however, Kent said he hoped prospective job applicants would take the full weight of that analogy to heart.
"The opportunities are there, but you need to understand biology pretty deeply," said Kent. "You don't have to be as extreme as me and go the Ph.D. route, but you do have to invest a little time and get to know the discipline."
Sam Williams is a freelance writer living in Brooklyn, New York, and the author of O'Reilly's Free as in Freedom: Richard Stallman's Crusade for Free Software. He has covered high-tech culture, specifically software-development culture, for a number of Web sites.
Copyright © 2009 O'Reilly Media, Inc.