Saprotrophics: a new natural habitat for bioinformaticians?

Lennart Martens, Ghent University.

A large amount of public data is available, covering various aspects of biology and biomedicine. As a result, it is becoming increasingly interesting to perform meta-analyses on these data, foregoing the traditional need to first obtain novel data from a dedicated analysis or experiment. Indeed, the information encoded across many different experiments often provides access to knowledge that could not have been extracted from a single or a few experiments. At the same time, the signal in these experiments can easily be lost in the noise, as heterogeneity at all levels poses a substantial threat to the re-use or re-interpretation of large, unrelated collections of data. It is therefore paramount to employ purpose-built filters and/or quality control prior to repurposing existing data from the public domain. Such efforts are, however, hampered by the often fragmentary annotation of these various data sets, and it is therefore often necessary to second-guess a lot of the context. Based on the specific example of proteomics data, I will show that such challenges can be met, starting from the creation of a system for data exchange, all the way to the application of public data to extract novel knowledge and insight, at both the technical and the biological level. In the end, such analyses can create a wholly new branch of 'saprotrophic' bioinformatics, in which data that have already served their intended purpose for the original authors are essentially endlessly recycled in novel useful roles from public domain databases.

Dr. Martens has been leading the Computational Omics and Systems Biology (CompOmics) Group at VIB and Ghent University, both in Ghent, Belgium, since October 2009. Prof. Martens obtained his Ph.D. in Sciences: Biotechnology from Ghent University, and afterwards served as PRIDE Group Coordinator at EMBL-EBI before returning to Ghent University and VIB in his current position.

You can find out more about him on his website.

Scale matters! - The importance of scale in the analysis of chromatin landscapes, mutation profiles and protein network architecture

Jeroen de Ridder, Delft University of Technology.

The notion of scale plays an important role in how we observe our environment. For instance, considering a tree from a distance of a light-year or a nanometer is meaningless, but would make sense at a distance of a few meters. Clearly, objects only exist as meaningful entities across a certain range of scales. The concept of scale is also omnipresent in genomics and molecular biology. Some cellular functions may involve a single direct protein interaction (small scale), whereas others require more indirect interactions, such as protein complexes (medium scale) and interactions between large modules of proteins (large scale). Other examples include the various levels at which the genome is organized within the cellular nucleus and the various levels at which the genome is modified by epigenetic marks. In this talk I will demonstrate the importance of incorporating scale in the analysis of large-scale molecular data. I will do this based on three examples. First of all, I will present the most comprehensive characterization of the genetic and epigenetic determinants of target site selection of genome-integrating elements to date. Our scale-aware analyses reveal that, at a genome-wide scale, biases are largely conserved across elements, whereas, at a small scale, the integration biases of different elements are driven by element-specific genomic features. I will also address the scale-aware analysis of cancer-causing retroviral integrations. These analyses demonstrate that, on top of the known clustering of insertions along the linear genome, clustering is also apparent at a different scale, which can only be revealed if the 3D conformation of the genome is taken into account. This knowledge has important implications for target gene determination and, more generally, provides new hypotheses on how (viral) enhancers act on their targets. Finally, I will discuss how to characterize topological structure in protein interaction networks. We developed scale-aware versions of known graph topological measures based on diffusion kernels to characterize the topology of networks across all scales simultaneously, generating a so-called graph topological scale-space. We demonstrate that graph topological scale-spaces capture biologically meaningful features that provide new insights into the link between function and protein network architecture.
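
To make the idea of a graph topological scale-space more concrete, here is a minimal sketch in Python. It is not the speaker's method: it assumes the heat kernel exp(-tL) of the graph Laplacian as the diffusion kernel, and it uses a hypothetical "diffusion degree" (the kernel mass that has left a node at scale t) as the scale-aware measure. The toy network and all function names are illustrative assumptions.

```python
import math

def laplacian(adj):
    """Combinatorial graph Laplacian L = D - A for a symmetric adjacency matrix."""
    n = len(adj)
    return [[(float(sum(adj[i])) if i == j else 0.0) - adj[i][j]
             for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain square matrix product, adequate for small illustrative graphs."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def heat_kernel(lap, t, terms=20):
    """Approximate exp(-t * L) with a Taylor series plus scaling and squaring."""
    n = len(lap)
    norm = max(sum(abs(x) for x in row) for row in lap) * t
    # Halve the exponent often enough that the series converges quickly.
    squarings = (max(0, math.ceil(math.log2(norm))) + 1) if norm > 0 else 0
    step = t / (2 ** squarings)
    m = [[-step * lap[i][j] for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term now holds m^k / k!
        term = [[x / k for x in row] for row in matmul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result

def diffusion_degree(kernel, i):
    """Hypothetical scale-aware degree: kernel mass that has left node i."""
    return sum(v for j, v in enumerate(kernel[i]) if j != i)

def scale_space(adj, scales):
    """Evaluate the diffusion degree of every node at every scale t."""
    lap = laplacian(adj)
    table = {}
    for t in scales:
        kernel = heat_kernel(lap, t)
        table[t] = [diffusion_degree(kernel, i) for i in range(len(adj))]
    return table

# Toy network (an assumption): node 0 is a hub, node 4 hangs off node 3.
ADJ = [
    [0, 1, 1, 1, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]

if __name__ == "__main__":
    for t, values in scale_space(ADJ, [0.01, 0.1, 1.0, 10.0]).items():
        print(t, [round(v, 3) for v in values])
```

In this sketch, small values of t make the measure proportional to the ordinary node degree, while at large t it flattens out across a connected network, so each node gets a profile across scales rather than a single number.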

Dr. de Ridder did his PhD on pathway discovery in insertional mutagenesis data at the Delft Bioinformatics lab (Reinders group, TUD) and at the Netherlands Cancer Institute (Wessels group). His PhD work resulted in a statistical framework for the analysis of retroviral insertional mutagenesis data to identify cancer genes in mice and was used in the analysis of several mutagenesis datasets in collaboration with researchers at the Netherlands Cancer Institute and the Wellcome Trust Sanger Institute (Hinxton, UK). During his PhD, Dr. de Ridder visited the Shmulevich lab at the Institute for Systems Biology (Seattle, USA), where he worked on multi-scale methods for genomic data analysis. In addition, Dr. de Ridder has been an active member of the national and international bioinformatics communities, first as member of the board of the Dutch Regional Student Group, co-initiated by NBIC, and later as the elected chair of the Student Council, an initiative of the International Society for Computational Biology (ISCB). Dr. de Ridder currently holds a position as Assistant Professor in the Delft Bioinformatics Group.

Learn more about him here.

Network biology: Large-scale data and text mining

Lars Juhl Jensen, The Novo Nordisk Foundation Center for Protein Research.

Methodological advances have in recent years given us unprecedented information on the molecular details of living cells. However, it remains a challenge to bring together all the available data on individual genes to facilitate systems-level analyses of, for example, diseases. Networks have proven to be a very useful abstraction for bridging this gap between the single-gene and the systems level.
In my presentation I will describe the STRING database, which scores and integrates evidence from a diverse range of curated databases, raw data repositories, automatic text mining, and computational prediction methods to provide the most comprehensive protein association network possible. I will also introduce a suite of three new web-based resources that use similar techniques to associate the proteins with cellular compartments, tissues, and diseases to enable systems biology studies of diseases, taking into account both the interactions and the spatial localization of the proteins.
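
As a rough illustration of what scoring and integrating evidence from several sources can look like, the Python sketch below combines per-channel confidence scores for one protein pair with a noisy-OR rule. This is an assumption made for illustration only: the abstract does not state how STRING combines its evidence, and the channel names and score values here are hypothetical.

```python
from math import prod

def combine_scores(channel_scores):
    """Noisy-OR combination of per-channel confidence scores.

    Each score is read as the probability that a channel's evidence for an
    association is correct, and channels are assumed to be independent.
    """
    for name, score in channel_scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {name!r} must lie in [0, 1]")
    # Probability that at least one channel is right.
    return 1.0 - prod(1.0 - s for s in channel_scores.values())

# Hypothetical evidence for a single protein pair (values are made up).
EVIDENCE = {
    "curated_databases": 0.60,
    "raw_data_repositories": 0.30,
    "text_mining": 0.45,
    "computational_prediction": 0.20,
}

if __name__ == "__main__":
    print(round(combine_scores(EVIDENCE), 3))
```

Under this rule, several weak but independent lines of evidence can add up to a confident association, and the combined score is never lower than the strongest single channel.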

Lars Juhl Jensen started his research career in Søren Brunak’s group at the Technical University of Denmark (DTU), from which he received his Ph.D. in bioinformatics in 2002 for his work on non-homology based protein function prediction. During this time, he also developed methods for visualization of microbial genomes, pattern recognition in promoter regions, and microarray analysis. From 2003 to 2008, he was at the European Molecular Biology Laboratory (EMBL) where he worked on literature mining, integration of large-scale experimental datasets, and analysis of biological interaction networks. Since the beginning of 2009, he has continued this line of research as a professor at the Novo Nordisk Foundation Center for Protein Research at the Panum Institute in Copenhagen and as a co-founder and scientific advisor of Intomics A/S. He is a co-author of more than 100 scientific publications that have in total received more than 7000 citations. He was awarded the Lundbeck Foundation Talent Prize in 2003, his work on cell-cycle research was named “Break-through of the Year” in 2006 by the magazine Ingeniøren, his work on text mining won the first prize in the “Elsevier Grand Challenge: Knowledge Enhancement in the Life Sciences” in 2009, and he was awarded the Lundbeck Foundation Prize for Young Scientists in 2010.

Learn more about him here.