My personal Davos moment was in a session I moderated yesterday, a discussion of bioinformatics -- the blending of computation and biology. Several brilliant speakers, including William Haseltine, Mark Levin, James Sabry and Nathan Myhrvold, gave the audience a quick tour of the field and tried to answer some key questions. None of them was willing to predict an era when there will be drugs that work on just one person -- a kind of personal pill. But they all agreed that we're not far away from solving some extremely big problems in health care.
Paul Allen, who came to the session, asked a key question but didn't get a satisfactory response. He wanted to hear about intellectual property issues -- in particular, if I understood his question, whether the growing secrecy in science is harming progress. Science is supposed to be about sharing data, not hoarding it. Not anymore, and that's an alarming trend.
I wrote an email response to Dan and Paul, with a cc to Timo Hannay of Nature, who had worked with us on program planning for our recent Bioinformatics Technology Conference. My message was about the momentum that open source has in the bioinformatics field. But before I get to that, I want to share Timo's response to Dan's first paragraph, about personalized medicine:
Just last week Nature included a paper which, I think, illustrates well both the exciting things we can do now and the distance we still have to go to realise 'personalized medicine'. It used DNA microarray technology to look at gene expression profiles in breast cancer cells. It found that there were two distinct -- and previously unrecognised -- forms of the disease. What's more, one form turned out to be susceptible to current treatments while the other wasn't. So they can now predict clinical outcomes from gene expression data, which is tremendous. But they've only managed to split breast cancer sufferers into two groups, so the road to true personalization is still a long one. The Economist has a good write-up of this research.
But back to the question of the growing secrecy in science. Here's what I wrote to Dan and Paul, and later decided to share more publicly here (email modified slightly to make links inline rather than explicit):
I just got back from the O'Reilly Bioinformatics Technology Conference. Open source and open data were a major theme, especially in keynotes from Ewan Birney of the European Bioinformatics Institute and Lincoln Stein of Cold Spring Harbor Laboratory. (For a summary of Ewan's talk, see A Case for Open Source Bioinformatics, and for Lincoln's, see Building a Bioinformatics Nation.) Both summaries also have links to interviews with the respective speakers, and in Lincoln's case, to his slides. Lincoln's talk was especially interesting because he focused not just on open source but on open data and open web services. His point was that we need to agree on common data formats and protocols so that independent projects can interoperate.
And Timo Hannay, one of the Nature editors, is working on a short piece explicitly comparing the scientific process to open source. Timo's insight, which I think is a good one, is that scientific papers are not akin to open source projects, but to patches on open source projects. The underlying science is the project, and the peer-reviewed papers are analogous to patches to the underlying software.
This isn't quite what you were asking, Paul, but it's an important part of the picture.
The conference also included a number of sessions directly on the topic of secrecy vs. openness. Nature, which was one of the co-sponsors of the conference, hosted a panel on scientific publishing, in which this was the focus. We agreed that data hoarding is a real danger, but in bioinformatics, we do also see the countervailing force of open source.
As I pointed out in an article I wrote for Linux Magazine, the most significant work of open source last year was James Kent's heroic effort to make sure the human genome sequence was in the public domain rather than the property of a private company.
Dan, I'm sure you're also aware of the stand that Steven Brenner has taken. [Steven made his right to publish his results a condition of his employment at UC Berkeley, and has developed a model open source/open data contract between academics and their institutions.] Steven was also at the conference, and talked about this on a panel there. We want to help him get his open source contract for academics out and more widely used. Any help publicizing it would be welcome.
There are clearly real issues here, but there are people taking up the cause of openness as well as people working for secrecy and private advantage. So I'm hopeful that in the end, openness will win.
Especially in a field like bioinformatics, the natural advantages of open source really do outweigh the advantages of secrecy. No one controls all the data. Talk after talk at the conference focused on the way that matching up data from other researchers' databases is the key to making sense of your own data. This was a key focus of Terry Gaasterland's keynote as well, and of course is at the very heart of Lincoln Stein's DAS (Distributed Annotation System).
Timo Hannay replied to this message with some further comments (which express his own views and not necessarily those of Nature):
I think it's debatable whether science is becoming more or less open. Certainly, we've seen the rise of dubious (to say the least) patent claims on things like genes. And we've seen a rising desire by academic institutions to make the most of commercial opportunities that come out of their research. But, on the other hand, I think there's a trend, at least among biologists, for the scientists themselves to be more open with their data. Traditionally, biologists have hoarded their data because it takes a lot of effort to gather and they don't want rival groups to beat them to important findings hidden in the numbers. Previously they got away with this attitude partly because the logistical costs of sharing data were high. In the age of the Internet this is no longer true. This is why Nature now requires its authors to deposit all relevant gene and protein sequence data in appropriate public repositories. We're starting to do the same with microarray data too, but the repositories and data standards in this area are less well developed, so we're not yet able to be as firm or prescriptive as we can with gene and protein sequence data. Many researchers are still somewhat resistant to all this and sometimes we have to compromise (e.g., by allowing them to make their data public only after a delay of a few weeks or months). If we didn't do this, some of these outstanding scientists would simply publish elsewhere, so we're treading a fine line between promoting openness and maintaining our editorial pre-eminence.
Perhaps the highest-profile case in recent years was the publication of the Human Genome Project's paper in Nature last February.
At the same time, Science magazine published a rival paper from Celera's privately funded sequencing project. Science agreed to allow Celera to publish its paper but keep its sequences private. I believe that Science seriously damaged its scientific reputation by doing this -- and quite right too. Nature editorialised on this subject at the time.
So I guess that I'm agreeing with Tim that the open source mentality is a growing force in science.
Tim O'Reilly is the founder and CEO of O'Reilly Media, Inc., thought by many to be the best computer book publisher in the world. In addition to Foo Camps ("Friends of O'Reilly" Camps, which gave rise to the "un-conference" movement), O'Reilly Media also hosts conferences on technology topics, including the Web 2.0 Summit, the Web 2.0 Expo, the O'Reilly Open Source Convention, the Gov 2.0 Summit, and the Gov 2.0 Expo. Tim's blog, the O'Reilly Radar, "watches the alpha geeks" to determine emerging technology trends, and serves as a platform for advocacy about issues of importance to the technical community. Tim's long-term vision for his company is to change the world by spreading the knowledge of innovators. In addition to O'Reilly Media, Tim is a founder of Safari Books Online, a pioneering subscription service for accessing books online, and O'Reilly AlphaTech Ventures, an early-stage venture firm.
oreillynet.com Copyright © 2006 O'Reilly Media, Inc.