Project Title
In silico reconstruction of Streptococcus pneumoniae cellular networks and their impact on virulence
Project Type
National / Public
Funding Body
Funding Program
Call for Funding of Research and Development Projects in all Scientific Domains - 2009
  • CEB: 28 908,00
  • Total: 126 588,00
Fundação da Faculdade de Ciências (FFC/FC/UL); Instituto de Medicina Molecular (IMM/FM/UL); Universidade do Minho (UM)

Streptococcus pneumoniae is a major cause of life-threatening infections, such as pneumonia and meningitis, and of less severe infections such as sinusitis and middle-ear infections. Paradoxically, it is also carried asymptomatically in the human nasopharynx, and it is an important model organism within the Streptococcus genus. Molecular typing of S. pneumoniae has highlighted the heterogeneous behaviour of particular genetic lineages within this species (28). Virulence, the relative capacity of a pathogen to inflict damage on a susceptible host, is not equally distributed across the pneumococcal population; rather, it is associated with some branches of its highly diverse genetic space. Among the non-redundant pool of genes present in three sequenced pneumococcal strains (R6, TIGR4 and G54), only 57% are present in all three strains. Given these large differences in gene content, it should be possible to identify virulent lineages through the presence or absence of specific genes. Genomic technologies such as microarray-based comparative genomic hybridization (CGH) enable high-throughput screening for such virulence-determining genes (14, 19).
We propose to strengthen this large-scale search by integrating it with a systems biology approach, using the metabolic and transcriptional networks of S. pneumoniae to search for network motifs associated with virulence. If the function of a gene affects virulence, we expect neighbouring genes in the metabolic or transcriptional networks to have a similar influence, even if indirectly, by affecting the activity of the first gene. The search for network motifs is more powerful than the identification of single-gene associations. First, the cumulative association of a motif can be significant even though none of its constituent genes shows a sufficiently strong association by itself. Second, the location of the motif within the network provides more insight into the mechanism underlying its impact on pathogenesis. Networks can also be used to map the interactions of virulence determinants, which may suggest explanations for conflicting findings between laboratory studies and the distribution of virulence genes in natural populations (2).
A current approach to integrating genomic data with other information sources is gene set enrichment analysis (GSEA) (18, 23). In GSEA, one first identifies a group of genes that are individually associated with virulence and then searches for motifs that contain more nodes from that set than expected by chance. In this project we propose a different method that evaluates the significance of network motifs directly. This can improve performance in detecting associations between data sources.
The project will start with the reconstruction of the metabolic and transcriptional networks of S. pneumoniae from database and literature sources. Both will be major achievements, useful for the study of virulence determinants and for the large community studying the fundamental biology and pathogenesis of S. pneumoniae. Such an effort will also constitute an important model for other streptococcal pathogens. Next, we will develop new methods to integrate network information with CGH and epidemiological data for a collection of pneumococcal strains. These will allow us to identify nodes or motifs significantly associated with virulence. Lastly, topological analysis of the virulence motifs and their insertion points within the networks will answer questions about the typical structural properties of significant motifs. The study of network motifs has been a pivotal tool for understanding the connection between complex network topology and cellular function (13). It is our strong belief that it can also be a fruitful tool for clarifying the molecular mechanisms of virulence, where, to our knowledge, it has never been applied.
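The contrast drawn above between the set-based enrichment approach and direct evaluation of motif significance can be illustrated with a minimal Python sketch. It is not the method proposed in the project, which the abstract does not specify. The enrichment step is modelled here as a hypergeometric over-representation test, and the "cumulative association" of a motif is modelled with Fisher's combined probability test, which assumes the per-gene p-values are independent (an assumption that neighbouring genes in a real network may well violate). The function names, the genome size, the 0.05 threshold and the per-gene p-values are all hypothetical.

```python
from math import comb, exp, log


def enrichment_p(n_genes, n_assoc, motif_size, motif_hits):
    """Hypergeometric upper tail P(X >= motif_hits).

    X is the number of motif genes that fall in the set of n_assoc
    individually associated genes, out of n_genes genes in total.
    """
    upper = min(n_assoc, motif_size)
    tail = sum(
        comb(n_assoc, k) * comb(n_genes - n_assoc, motif_size - k)
        for k in range(motif_hits, upper + 1)
    )
    return tail / comb(n_genes, motif_size)


def fisher_combined_p(p_values):
    """Fisher's method for independent p-values.

    Under the null, -2 * sum(ln p) follows a chi-square distribution
    with 2k degrees of freedom, whose survival function has the closed
    form exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    k = len(p_values)
    half_stat = -sum(log(p) for p in p_values)  # equals x / 2
    term, series = 1.0, 0.0
    for i in range(k):
        series += term
        term *= half_stat / (i + 1)
    return exp(-half_stat) * series


# Hypothetical motif of four genes, each weakly associated with virulence.
motif_p = [0.08, 0.08, 0.08, 0.08]
threshold = 0.05

# Set-based route: no gene passes the threshold, so the motif has zero hits.
hits = sum(p < threshold for p in motif_p)
set_based = enrichment_p(n_genes=2000, n_assoc=100,
                         motif_size=len(motif_p), motif_hits=hits)

# Direct route: combine the per-gene evidence for the motif as a whole.
direct = fisher_combined_p(motif_p)

print(f"hits={hits} set_based_p={set_based:.3f} direct_p={direct:.4f}")
```

With these made-up numbers no gene passes the threshold on its own, so the set-based test has nothing to detect. The combined p-value for the motif, by contrast, falls below 0.05. This is the situation the text describes, where a motif is significant although none of its genes is.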
The project capitalizes on efforts and data from previous projects (“Population based identification of pneumococcal virulence and colonization factors” from NIAID/NIH and “Population and genomic consequences of vaccination against Streptococcus pneumoniae” (PTDC/SAU-ESA/64888/2006)) and builds upon the team members' experience in metabolic network reconstruction (24, 25), gene expression analysis (7, 18, 27), CGH array analysis (19), ontology and semantic similarity analysis (16, 17), pneumococcal molecular epidemiology (1, 2, 28), and non-linear correlation analysis in the integration of heterogeneous data (21, 22). The discovery of new virulence determinants and of hypotheses for virulence factor interactions may shed new light on our understanding of pneumococcal pathogenesis and lead to new therapeutic or vaccination strategies. This project also innovates by introducing a new concept, virulence-associated network motifs, and the methodology to search for them. This methodology can be applied to other pathogens, broadening the scientific impact of the proposed work plan beyond the pneumococcal field.