January 2002 Archives

Bruce Stewart

The O’Reilly Bioinformatics Technology conference concluded today in Tucson, and everyone is walking away happy.

The final day started out with a keynote from James Ostell of the National Center for Biotechnology Information (NCBI) on the role of NCBI in the genome era. Noting that the NCBI was created by an act of Congress in 1988, Ostell joked that, unlike other bioinformatics research institutions, “if we don’t do our job, we’re breaking the law.” Ostell went on to discuss the history of the NCBI and some of the principles it uses to organize and prioritize its efforts. He used Entrez, a retrieval system for searching several linked databases, as an example of the kind of infrastructure work the NCBI is involved in.

Extra chairs were rushed in to seat the overflow audience attracted to Lincoln Stein’s talk on the Distributed Sequence Annotation System (DAS). Lincoln wowed the audience with his clear explanation of this potentially revolutionary technology. DAS is based on a client/server model that uses a common language for describing genome annotation information. The power of DAS lies in its ability to create a standard user interface for easily comparing different groups’ genome data, as well as their genomic annotations. The basic idea is that a DAS client queries a reference server that is identified for a genome, and then different annotation servers can be added to the client’s interface to make comparisons. The technology behind DAS is all open source and industry-standard, based on XML and traditional Web servers.
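To make the client/server idea concrete, here is a minimal sketch of what the client side might do: take feature lists from several servers and merge them into one view ordered by position. It is only an illustration under stated assumptions. The XML is a simplified layout loosely modeled on a DAS feature response (the element names here are assumed, not quoted from the spec), the source labels are hypothetical, and the responses are inline strings rather than real HTTP fetches.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Simplified, assumed XML layout for illustration only; a real DAS
# response carries more elements and attributes than this.
REFERENCE_XML = """
<DASGFF><GFF>
  <SEGMENT id="chr1">
    <FEATURE id="g1"><TYPE>gene</TYPE><START>100</START><END>500</END></FEATURE>
    <FEATURE id="e1"><TYPE>exon</TYPE><START>100</START><END>200</END></FEATURE>
  </SEGMENT>
</GFF></DASGFF>
"""

LAB_B_XML = """
<DASGFF><GFF>
  <SEGMENT id="chr1">
    <FEATURE id="r1"><TYPE>repeat</TYPE><START>150</START><END>300</END></FEATURE>
  </SEGMENT>
</GFF></DASGFF>
"""


@dataclass(frozen=True)
class Feature:
    source: str   # which server the annotation came from
    segment: str  # e.g. a chromosome or contig identifier
    start: int
    end: int
    kind: str


def parse_features(source_name, xml_text):
    """Extract every FEATURE from one server's response."""
    root = ET.fromstring(xml_text)
    features = []
    for segment in root.iter("SEGMENT"):
        for feat in segment.iter("FEATURE"):
            features.append(Feature(
                source=source_name,
                segment=segment.get("id"),
                start=int(feat.findtext("START")),
                end=int(feat.findtext("END")),
                kind=feat.findtext("TYPE"),
            ))
    return features


def merge_annotations(responses):
    """Combine features from all servers, ordered by position."""
    merged = []
    for source_name, xml_text in responses.items():
        merged.extend(parse_features(source_name, xml_text))
    return sorted(merged, key=lambda f: (f.segment, f.start, f.end, f.source))


merged = merge_annotations({"reference": REFERENCE_XML, "lab_b": LAB_B_XML})
for f in merged:
    print(f.segment, f.start, f.end, f.kind, f.source)
```

Because every server describes its features in the same vocabulary, adding another annotation source is just one more entry in the dictionary passed to `merge_annotations`.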

Perhaps commenting on Ewan Birney’s earlier remarks about how he has become unhappy with XML due to its high overhead, Lincoln quipped, “XML is the worst possible solution, except for all the other solutions.” While it is clear that the overhead associated with XML can be problematic with some of the huge datasets used in bioinformatics, it seems to be working like a champ in DAS.

Finally, Dr. Leroy Hood delivered the closing keynote on his specialty, systems biology. Hood touched on many of the topics I discussed with him in a previous interview, explaining why biology has come to be considered an informational science. Hood commented that one of the fascinating aspects of biology is that it has at its core a digital language, which is ultimately knowable. Hood acknowledged the dramatic paradigm changes that the Human Genome Project has brought to biology, and he stressed that even more important than understanding the specific genome data will be the challenge of learning the mechanisms of the regulating networks that specify the behavior of the genomes.

It’s impossible to be everywhere at once at a conference, and the only complaint I heard was the difficulty of choosing which sessions to attend, because there were so many interesting options. One area I have neglected in my coverage is the Bioinformatics.Org track, which featured talks by the director, Jeff W. Bizzaro, and contributing members of this open source bioinformatics organization. I did get a few minutes to talk with one of those members today, Warren DeLano, whose open source molecular visualization program, PyMOL, is getting rave reviews here. DeLano has big plans to help create an open source environment for drug discovery research, and I’m sure we’ll be hearing more from him.

As the conference winds down, plans are already underway for next year’s Bioinformatics Technology conference. Plan for it to be sometime in the first quarter of 2003. Details will be published on the O’Reilly Conferences Web site as soon as they are available. And if you’re interested in the speaker notes and slide presentations from this year’s conference, keep checking the Bioinformatics Technology Conference page. They will be posted there soon.

Bruce Stewart

The halls are buzzing with activity here in Tucson and it’s clear lots of good connections are being made. The job-posting board is full, and people have come from all over the world to network with other bioinformaticists and take in the conference talks. I’ve seen attendees from the U.K., Brazil, Germany, Denmark, and Egypt, as well as from all over the U.S. Judging from the intensity of some of the lunch-table conversations, learning is happening outside of the conference sessions as well as in them.

Yesterday’s keynotes looked at bioinformatics from a big-picture point of view, but today’s speeches drilled down to more specific aspects of the scientific side of the field. Terry Gaasterland from Rockefeller University gave a detailed talk on data integration called “Integrating Gene Expression Data and Genome Sequence Data,” using genome examples from humans, mice, cattle parasites, and a pathogenic bacterium. Chris Hogue from the Samuel Lunenfeld Research Institute of Mt. Sinai Hospital gave an impressive afternoon keynote on “Saccharomyces cerevisiae: Some Assembly Required.” I spoke to a group of programmers who were visibly excited by the methods Hogue described.

After Hogue’s keynote I attended an interesting panel called “Open Data Open Source.” Four speakers from diverse backgrounds addressed the issues associated with making bioinformatics data and software open source. All four argued in favor of open source methods, with Ewan Birney echoing his statements from Monday that making the actual data open was of paramount importance. Although a couple of the speakers strayed from the subject a bit, I was especially impressed with Steven Brenner’s contribution to the panel.

Brenner is a contributor to the BioPerl project, and when he was invited to join the faculty at U.C. Berkeley, he realized their standard employment contract wouldn’t allow him to continue his open source work. So he worked with the campus administration to craft a contract that would. The school eventually accommodated Brenner’s request, but only after much time and legal expense. Based on his experience, Brenner is trying to rally the bioinformatics community to create a standard employment contract that allows for work on open source projects. His feeling is that if a standard contract existed, universities would be more willing to accept it, and faculty members facing this situation wouldn’t be starting from scratch each time. He’s championing an effort to raise awareness of this issue, as well as funds from the open source bioinformatics community to pay the legal costs of creating such a standardized employment contract. For more information, visit the Open Bioinformatics Foundation site, where you can join the Open Source Authors Contract Working Group discussion list.

Later in the afternoon O’Reilly had an author-signing event. Not only were Perl luminaries Damian Conway, Lincoln Stein, and Nathan Torkington kept busy signing copies of their books, but our newer bioinformatics authors, Cynthia Gibas and James Tisdall, were also quite popular. And, of course, many people crowded around Tim O’Reilly to get his take on bioinformatics and his signature on a book.

Tomorrow is the last day of the conference, and it promises more good keynotes and presentations. I’m especially looking forward to Lincoln Stein’s presentation on his latest pet project, the Distributed Sequence Annotation System (DAS), and the closing keynote by systems biologist Dr. Leroy Hood. Check back tomorrow for reports on both.

Bruce Stewart

The second day of the O’Reilly Bioinformatics conference got under way with a distinctly open source flavor, as Ewan Birney delivered the morning keynote on Open Source Bioinformatics. Read our report on Ewan’s Keynote for more details. The crowd then broke up into smaller groups to attend a variety of morning sessions.

Two separate sessions on data visualization generated a lot of interest, as did a talk on “High-Performance Proteomic Analysis: Challenges and Solutions.” Uwe Hilgert, a curriculum developer from the Dolan DNA Learning Center at the Cold Spring Harbor Laboratory, delivered a popular session on bioinformatics in education. This afternoon’s panel on “Academic Freedom Collides with the Bayh/Dole Amendment” should be a hot one for those interested in intellectual property issues as they relate to bioinformatics. I’m not going to miss it, and I’ll let you know how it went.

The afternoon keynote promises to be equally interesting, as Lincoln Stein is presenting a widely anticipated speech on “Bioinformatics: Building a Nation from a Land of City States.” Check back tomorrow for a detailed report on Lincoln’s speech. I’m also looking forward to tonight’s BOF meeting of the openinformatics.org folks, whose petition to require publicly funded software to be licensed as open source has already caused quite a stir.

Last night we were treated to some lighter fare, as our own editor and conference co-chair Nathan Torkington emceed a hilarious Bioinformatics Quiz Show. Four brave panels of attendees tried their best to answer a wide range of questions, and learned that it was often a good idea to wait until the entire question had been read, as the questions sometimes took a strange turn.

Bruce Stewart

The hackers are hacking and the tutorials are packed here in Tucson as the O’Reilly Bioinformatics Technology Conference gets under way. The keynotes and conference sessions don’t start until tomorrow, but the energy is already palpable here in the hallways at the Westin La Paloma. The cross-breed of biologists and computer scientists known as bioinformaticists are eating up the day of tutorials, and an elite group of open source bioinformatics hackers is coding up a storm in the first-ever Bioinformatics Hackathon.

Most of the tutorials are completely full as these researchers and scientists soak up the latest info on the techniques and programming concepts that are becoming essential knowledge in this emerging field. O’Reilly has long been known for our strong support of Perl, and Perl has rapidly become the language of choice for bioinformatics research. The affinity this crowd has for Perl is obvious as the Perl-related tutorials are causing some of the most excitement here today. Peter Schattner’s tutorial on Perl and BioPerl: Tools for Automated Analysis of Biological Sequence Data was an early sell-out, and generated so much interest that a second class was added. Perl-guru Damian Conway led another packed tutorial on Parsing Perl for Bioinformatics. And for those just getting started with Perl, O’Reilly author James Tisdall held a popular workshop on Beginning Perl for Bioinformatics.

Ewan Birney, another active Perl coder and one of the key players behind the BioPerl project, is leading the hackathon and will also be giving the opening keynote Tuesday morning on Open Source Bioinformatics. O’Reilly is co-sponsoring the hackathon with South Africa-based Electric Genetics, and this event puts some of the most important bioinformatics developers together face-to-face in a room full of computers (along with copious amounts of pizza and coffee). Good things are bound to happen.

Lincoln Stein, another Perl legend, who wrote the popular CGI.pm module because of his needs in bioinformatics, will be giving a highly anticipated keynote tomorrow afternoon. Lincoln is also presenting a session on the Distributed Sequence Annotation System (DAS), which promises to allow genomic annotations to be shared by researchers all over the world. Check back daily for in-depth coverage of Ewan’s and Lincoln’s keynotes, important sessions, and other happenings here at the O’Reilly Bioinformatics Technology Conference.

Gordon Mohr

Related link: http://news.com.com/2100-1023-819619.html

The Kazaa/FastTrack crew has sold the Kazaa web site, software, and other assets to a private Australian company, Sharman Networks [NEWS.com]. What will this mean for the future of the Kazaa Service and the legal actions proceeding against it?

Some thoughts:

  • Downloads of the Kazaa software, halted a few days ago pending legal action, have been resumed. Is Sharman more confident of their legal defense than the previous owners of Kazaa?
  • The Kazaa website front page asks people to review “new” terms of use, but besides the insertion of “Sharman Networks Limited”, it’s not clear that these are substantially different from the old terms, still available under an unlinked old URL.
  • FastTrack apparently continues as an independent technology house — as the sale is reported to include a license to use the FastTrack core, rather than the core intellectual property itself. However, the FastTrack site has less info than ever before, and even background-info pages at old URLs have been removed.
  • Could perhaps the corporate ownership of the Kazaa network be constantly rotated across jurisdictions, always one step ahead of the sheriff? Can we expect a future news story: “Sharman Networks Limited today said it had sold the KaZaA assets to privately-held Mauritius-based Cottonelle Systems. The move comes just weeks after copyright industries filed suit in Australian courts against Sharman.”

Hey RIAA and MPAA, don’t squeeze the Sharman!

How do you expect this latest development to affect Kazaa and the pending lawsuits of the copyright industries?

Bruce Stewart

Related link: http://www.salon.com/tech/feature/2002/01/04/university_open_source/index.html

This Salon article looks at some of the hurdles university researchers must clear to release their code under open source licenses, and at the ramifications of the Bayh-Dole Act, which allows universities doing publicly funded research to own and sell the intellectual property they produce.

Dr. Steven Brenner, the leader of a computational genomics research group at UC Berkeley, points out that many academic bioinformatics researchers may not realize the legal risk they face by contributing to open source software projects, and that some biological open source software is probably being produced illegally. Brenner is one of the founders of the BioPerl project, and he didn’t want to give this up to take a position with UC Berkeley. Brenner was able to renegotiate his UC employment contract so that he could continue to contribute to open source projects, but the process was long and tedious, and it was apparent to him that he was breaking new ground for the university’s licensing office. So Brenner got together with colleagues at the Open Bioinformatics Foundation, the umbrella organization for all the bio* projects, to help promote awareness of these issues, which he believes academic researchers don’t understand as well as their industry counterparts do.

Here at O’Reilly we’re strong supporters of open source software, and I sat up and took notice when I saw another group of bioinformatics researchers promoting a petition to require all publicly funded software projects to carry open source licenses. I admit my first reaction to this was that it made perfect sense. After all, if we’re paying for the code, we should get to use and see it, right?

As the debate over the OpenInformatics.org petition heated up on the O’Reilly bioinformatics discussion list, I realized that the issue wasn’t that simple. One poster in particular, an open source contributor and Open Bioinformatics Foundation member named Andrew Dalke, made especially convincing and eloquent arguments for why this requirement might not be such a good idea.

If these topics interest you (and they certainly apply to a far wider audience than just the bioinformatics crowd), stay tuned. Next week we’ll have articles on the O’Reilly Network covering both sides of this issue. Jason E. Stewart and Harry Mangalam are coauthors of the petition, and they’ll tell you why they think you should support their efforts to require open source licenses for code generated by public research. Andrew Dalke presents an opposing viewpoint, with lots of interesting reasons for thinking twice about what may initially seem like a “no-brainer.”

Ewan Birney will deliver a keynote on Open Source Bioinformatics at the upcoming O’Reilly Bioinformatics Technology Conference. Andrew Dalke will be presenting on Biopython, Steven Brenner will be participating in a panel on Open Data and Open Source, and Jason Stewart will lead a Birds-of-a-Feather session on the OpenInformatics.org petition as well.

Do you think software generated by publicly funded research should be licensed as open source?

Bruce Stewart

Related link: http://www.economist.com/science/tq/displayStory.cfm?Story_id=885127

This Economist article examines the emerging discipline of computational physiology (I guess “physioinformatics” was just a bit too much of a mouthful), and describes the advances that have been made in developing virtual organs. By applying complex computer models to biological entities, researchers hope to create simulations that will rapidly advance medical science.

One of the challenges of computational physiology is that few of the existing physiological models and their associated databases can currently communicate with one another. Sounds like a problem for XML, right? Well, the Physiome Sciences group in Princeton and the Bioengineering Research Group at the University of Auckland agree, and are developing CellML, an XML-based markup language to store and exchange computer-based biological models.
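To illustrate why a shared XML vocabulary helps here, the sketch below writes a tiny model description to XML and reads it back, which is the round trip two otherwise unrelated tools would need in order to exchange a model. This is not the CellML schema: the element and attribute names, the model name, and the variable values are all hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET


def model_to_xml(name, variables):
    """Serialize a model's variables, given as {name: (value, units)}, to XML.

    The element and attribute names are made up for this sketch and are
    not taken from CellML.
    """
    root = ET.Element("model", {"name": name})
    for var_name, (value, units) in variables.items():
        ET.SubElement(root, "variable", {
            "name": var_name,
            "units": units,
            "initial_value": repr(value),
        })
    return ET.tostring(root, encoding="unicode")


def model_from_xml(xml_text):
    """Rebuild the model name and variable table from the XML text."""
    root = ET.fromstring(xml_text)
    variables = {
        var.get("name"): (float(var.get("initial_value")), var.get("units"))
        for var in root.iter("variable")
    }
    return root.get("name"), variables


# One tool exports its model...
exported = model_to_xml("membrane", {
    "V": (-85.0, "millivolt"),
    "Cm": (1.0, "microfarad_per_cm2"),
})

# ...and a different tool imports it without knowing the exporter's internals.
model_name, model_vars = model_from_xml(exported)
print(model_name, model_vars)
```

The point is that the XML document, not either program’s internal data structures, becomes the contract between the two tools.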