News from the Bioconductor Project

Bioconductor Team
Program in Computational Biology
Fred Hutchinson Cancer Research Center

We are pleased to announce Bioconductor 2.4, released on April 21, 2009. Bioconductor 2.4 is compatible with R 2.9.0, and consists of 320 packages. There are 28 new packages, and enhancements to many others. Explore Bioconductor at http://bioconductor.org, and install packages with

> source("http://bioconductor.org/biocLite.R")
> biocLite() # install standard packages...
> biocLite("IRanges") # ...or IRanges

1 New packages

This release includes powerful new packages for diverse areas of high-throughput analysis, including:

Refined differential expression: power analysis, pre-processing and error estimation (SSPA, dyebias, spkTools, Rmagpie, MCRestimate).
Flow cytometry: tools for data import (flowFlowJo) and auto-gating (flowStats).
Advanced clustering and gene selection approaches: Rmagpie, MCRestimate, GeneSelectMMD, tspair, metahdep, betr.
Probabilistic graphical models: for reverse engineering regulatory networks (qpgraph).
Pathway analysis: novel approaches (KEGGgraph, gene2pathway, GOSemSim, SPIA).
Technology-specific packages: AffyTiling, rMAT, crlmm, GeneRegionScan.
Interfaces to databases and other external resources: biocDatasets, PAnnBuilder, DAVIDQuery.

2 Annotation

Bioconductor ‘annotation’ packages contain biological information about microarray probes and the genes they are meant to interrogate, or contain Entrez Gene-based annotations of whole genomes. This release updates existing database content, and lays the groundwork for four new species: Pan troglodytes, Macaca mulatta, Anopheles gambiae and Xenopus laevis. These species will be available in the development branch starting in May. In addition, the ‘yeast’ package now contains NCBI identifiers. A similarly enhanced Arabidopsis package will be in the development branch in May.

3 High-throughput sequencing

The stable of tools for high-throughput sequence analysis has developed considerably during this release cycle, particularly in data structures and methods for conveniently navigating these complex data. Examples include the IRanges and related classes and methods for manipulating ranged (interval-based) data, the Rle class and its rich functionality for run-length encoded data (e.g., genome-scale ‘pileup’ or coverage data), the XDataFrame class allowing data frame-like functionality but with more flexible column types (requiring only that the column object have methods for length and subsetting), and the GenomeData and GenomeDataList objects and methods for manipulating collections of structured (e.g., by chromosome or locus) data. The Biostrings package continues to provide very flexible pattern matching facilities, while ShortRead introduces new I/O functionality and the generation of HTML-based quality assessment reports from diverse data sources.
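
To make the run-length idea concrete, the sketch below computes per-position coverage for three made-up reads and compresses it with base R's rle(). It deliberately uses only base R rather than the IRanges or Rle classes described above, so it illustrates the concept and not those packages' interfaces. The read positions and the names starts, widths, and ref_length are assumptions chosen for illustration.

```r
# Three hypothetical reads on a 20-position reference, given as start/width
# pairs. These values are invented for illustration.
starts <- c(3, 5, 12)
widths <- c(6, 4, 5)
ref_length <- 20

# Expand each read to the positions it covers, then count reads per position
# to obtain the 'pileup' (coverage) vector.
covered <- unlist(mapply(function(s, w) seq(s, length.out = w),
                         starts, widths, SIMPLIFY = FALSE))
depth <- tabulate(covered, nbins = ref_length)

# Run-length encode the coverage: each constant stretch collapses to a
# single (length, value) pair.
depth_rle <- rle(depth)

# The encoding is lossless: decoding recovers the original vector.
stopifnot(identical(inverse.rle(depth_rle), depth))
```

On genome-scale coverage, where long stretches share the same depth, this kind of encoding is what keeps the representation compact.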

4 Other activities

Bioconductor package maintainers and the Bioconductor team invest considerable effort in producing high-quality software. A focus during development of Bioconductor 2.4 has been on more consistent and widespread use of name spaces and package imports. These changes reduce ‘collisions’ between user and package variable names, and make package code more robust. The Bioconductor team continues to ensure quality software through technical and scientific reviews of new packages, and daily builds of released packages on Linux, Windows, and Macintosh platforms. The Bioconductor web site is also evolving. Bioconductor ‘views’ describing software functionality have been re-organized, and package vignettes, reference manuals, and use statistics are readily accessible from package home pages.
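
The effect of name spaces on such collisions can be seen with base R alone. In this minimal sketch, a user-level function named var (a deliberately broken definition, invented for illustration) masks stats::var in the user's workspace, yet sd() from the stats package still returns the correct value, because code inside the stats name space resolves var there first.

```r
# A user-level definition that collides with the name of stats::var.
# This function is made up for illustration only.
var <- function(x, ...) stop("user-defined var() was called")

x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# sd() lives in the stats name space and calls var() internally; the lookup
# finds stats::var before reaching the user's workspace, so this still works.
s <- sd(x)

# An explicit pkg::name reference bypasses the user-level definition too.
v <- stats::var(x)

stopifnot(isTRUE(all.equal(s, sqrt(v))))

# Remove the colliding definition again.
rm(var)
```

Without a name space, the internal call would have reached the user's var and failed, which is the kind of fragility the move to name spaces and explicit imports is meant to remove.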

5 Looking forward

The Bioconductor community will meet on July 27-28 at our annual conference in Seattle for a combination of scientific talks and hands-on tutorials. The active Bioconductor mailing lists (http://bioconductor.org/docs/mailList.html) connect users with each other, with domain experts, and with maintainers eager to ensure that their packages satisfy the needs of leading-edge approaches.

This will be a dynamic release cycle. New contributed packages are already under review, and our build machines have started tracking the latest development versions of R. In addition to development of high-quality algorithms to address microarray data analysis, we anticipate continued efforts to leverage diverse external data sources and to meet the challenges of presenting high volume data in rich graphical contexts.

This article is converted from a Legacy LaTeX article using the texor package. The pdf version is the official version. To report a problem with the html, refer to CONTRIBUTE on the R Journal homepage.
