BioPerl List Summary, Dec 2-8 2002
Subscription information: http://www.bioperl.org/MailList.shtml
Period: December 2-8, 2002
Announcements
Harry Mangalam announced O'Reilly's 2003 Bioinformatics Technology Conference, February 3-6, 2003 in San Diego, California.
Code Changes
Progress toward 1.2 (deadline Dec 31) came fast and furious.
Bio::Tools::RemoteBlast moved back into core. There was a long discussion about creating a bioperl-retired for legacy modules such as Bio::Tools::Blast.pm (which isn't the module you should be using to do Blast work).
Jason said that HMMer parsing is "more robustly being tested".
Steve Chervitz had planned to add psiblast parsing functionality to blast.pm so that psiblast.pm could be made obsolete for 1.2. In his words: "I'm having doubts because it is a fairly major undertaking, and there's time pressure to release 1.2." Ewan concurred: "I think this has to happen in the 1.3 to 1.4 series."
Elia will handle bioperl-run regression testing against the 1.2 core as it solidifies.
Ewan committed reorganized documentation for Bio::Seq, Bio::SeqI, Bio::PrimarySeqI and Bio::PrimarySeq to stress that Bio::Seq has the best docs and is the best starting point for users.
Paul Boutros posted a great summary of 'make test' for the current BioPerl on Windows.
Unanswered Questions
Allen Day asked whether anyone was working on a script to load GO terms into chado.
James Wasmuth had a problem with queries and hits that nobody replied to.
Bugs
Jason Stajich posted an update to the MakeMaker bug--no immediate solution, a simple workaround, and a plea for more people to hack on MakeMaker.
Lincoln went back and forth with Paul Boutros trying to track down a Bio::DB::Query problem under Windows. Paul found the problem and Lincoln supplied a fix that made it pass all tests. Joy spread through the world like the smell of pot at a Grateful Dead concert.
Tyler wanted to run ProtDist.pm from two simultaneous scripts with the same working directory. Jason pointed out that phylip doesn't let you change the output filename. He committed a hack^Wfix to CVS.
Philip Lijnzaad reported a bug in BioPerl 1.0.2, which Hilmar reported fixed for 1.2.
Paul Boutros kept trying to track down a bug in the LocusLink IO code. Allen Day opined it was a \r\n (Windows line-ending) problem.
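For readers who haven't been bitten by this class of bug, here is a minimal sketch of how a stray carriage return can break a line-oriented parser. It is written in plain Python, not BioPerl, and it is not the LocusLink code: the `//` record terminator, the `ID:` lines, and the `split_records` function are all made up for illustration, and Allen's diagnosis was only a guess at that point.

```python
def split_records(raw, strip_cr):
    """Split text into records, each ended by a line that is exactly '//'.

    The '//' terminator is a hypothetical format chosen for this sketch.
    """
    records, current = [], []
    for line in raw.split("\n"):
        if strip_cr:
            # Drop the trailing carriage return left behind by \r\n endings.
            line = line.rstrip("\r")
        if line == "//":
            records.append(current)
            current = []
        elif line:
            current.append(line)
    return records


unix_text = "ID: 1\n//\nID: 2\n//\n"
dos_text = unix_text.replace("\n", "\r\n")

print(len(split_records(unix_text, strip_cr=False)))  # 2
print(len(split_records(dos_text, strip_cr=False)))   # 0: '//\r' never equals '//'
print(len(split_records(dos_text, strip_cr=True)))    # 2
```

The same file parses cleanly on one platform and yields no records on another, which is why this kind of bug tends to show up only in cross-platform testing.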
Answered Questions
Scott Cain asked how to create an empty Bio::Seq object once you know its length. Hilmar replied that you can do this with 1.1.1 or later.
Charles Hauser asked how to trick Bio::SearchIO into thinking a tab-delimited blast data file was a blast report, to extract accession and description from the subj_name field. Mathieu suggested that simply lifting the description-header parsing code out of the module might be easier.
James Wasmuth asked how to write an alignment into a string instead of a stream. Ewan pointed him to IO::String, and Jason even supplied code.
Nat Goodman asked what speedbumps people faced with BioPerl. Rob Edwards sent in an excellent list: lack of standardized names, unobvious return values, the ease with which you can call an undefined method while parsing GenBank files, and the overwhelming amount of What's In There. Ewan replied suggesting a Lazy Programmer's Guide to BioPerl.
Jason pointed Mike Pheasant toward Ensembl and Bio::SeqFeature::Gene::GeneStructure objects for help "merging exons in all transcripts of a gene to come up with a new non-overlapping set of start/end coordinates for all sequences that end up in any mRNA for that gene" (whew!).
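Stripped of the object model, Mike's task is the classic interval-merge problem. Here is a minimal sketch in plain Python, not the Ensembl or BioPerl API. It assumes each transcript is just a list of (start, end) exon coordinates, 1-based and inclusive; the `merge_exons` name and the sample gene are made up for illustration.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[int, int]  # (start, end), inclusive, with start <= end


def merge_exons(transcripts: Iterable[Iterable[Interval]]) -> List[Interval]:
    """Collapse the exons of all transcripts into sorted, non-overlapping intervals."""
    # Pool every exon from every transcript and sort by start coordinate.
    exons = sorted(exon for transcript in transcripts for exon in transcript)
    merged: List[Interval] = []
    for start, end in exons:
        if merged and start <= merged[-1][1]:
            # Overlaps the last merged interval: extend it if needed.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            # No overlap: start a new interval.
            merged.append((start, end))
    return merged


# Hypothetical gene with two transcripts sharing some exons.
gene = [
    [(100, 200), (300, 400)],
    [(150, 250), (300, 350), (500, 600)],
]
print(merge_exons(gene))  # [(100, 250), (300, 400), (500, 600)]
```

Note that this version merges only exons that actually overlap. Exons that merely abut, such as (100, 200) and (201, 300), stay separate, and whether that is what you want depends on how you plan to use the coordinates.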
Eicke Felipe had a problem that turned out to be caused by mixing 1.0.2 and a later version.
Yee Man asked whether Ewan would extend his protein Smith-Waterman module to DNA. Ewan replied that it's in the Wise2 package but there's no XS bridge for it yet. Yee volunteered to write said bridge. Yee also threw out this cool URL:
$Bio::Tools::Run::StandAloneBlast::DATADIR = '/localcopy'; as an alternative way to do it.
David Vilanova asked how to add subseqfeatures when reconstructing a GenBank annotation file. Jason explained how the code and concepts have changed (and showed sample code) and ended with a plea: "It would be nice if someone would map the SeqFeature::Gene::GeneStructureI objects into something similar to the below structure for direct output by the SeqIO objects."
Renata Melo asked for information about running Smith-Waterman on DSM with MPI or other parallel tools. Aaron Mackey pointed to ssearch34 from the fasta distribution, which has PVM and MPI implementations.
Nat Torkington is conference planner for the Open Source Convention, OSCON Europe, and other O'Reilly conferences. He was project manager for Perl 6, is on the board of The Perl Foundation, and is a frequent speaker on open source topics. He cowrote the bestselling Perl Cookbook.