The CBIG lab is an interdisciplinary research lab established and led by Professor Francesca Buffa with the goal of bringing together multi-omics and computational science to decode cancer complexity.


The CBIG lab aims to understand cancer as a system, where cancer cells evolve and interact within a complex and dynamic microenvironment.

We search for gene networks and ‘signatures’ that enable us to understand how normal cells rewire their molecular circuits to become cancer cells, and how they respond to microenvironment characteristics, such as low oxygen levels (hypoxia), or signalling from non-cancer cells. This helps us to understand cancer evolution, heterogeneity, and to predict therapeutic strategies that are most appropriate.

Key elements of this research are machine learning, mathematical modelling, cutting-edge multi-omics and functional screens.

Selected projects


  • microC: modelling the complexity of the tumour microenvironment

Building on her expertise in mathematical modelling of cancer and gene networks, Prof. Buffa has been awarded a European Research Council programme to model cancer growth as a complex system of interacting cells.

Our newly developed framework combines Agent-Based Modelling, a powerful computational technique, with modelling of gene networks. This combination lends itself naturally to modelling heterogeneous populations of cells acting and evolving in a dynamic microenvironment. These models enable us to study cancer initiation, clonal competition, interactions of cancer cells with the host microenvironment, and response to different drugs and drug combinations.

Ultimately, we want to use these models to identify likely resistance mechanisms, and predict the most appropriate treatment for each patient.
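As a rough illustration of how agent-based modelling can be coupled with a gene network, the toy sketch below gives every cell (agent) a two-node Boolean network that reads the local oxygen level. It is not the microC implementation: the `Cell` class, the node names `hif` and `cycling`, the linear oxygen profile and all parameter values are assumptions made up for this example.

```python
import random
from dataclasses import dataclass


@dataclass
class Cell:
    position: int
    hif: bool = False      # hypothetical hypoxia-response node
    cycling: bool = True   # hypothetical proliferation node

    def update_network(self, oxygen, threshold=0.25):
        # Toy Boolean rules: low oxygen switches HIF on, and HIF blocks cycling.
        self.hif = oxygen < threshold
        self.cycling = not self.hif


def oxygen_at(position, decay=0.1):
    # Assumed oxygen profile: 1.0 at a vessel at position 0, falling linearly.
    return max(0.0, 1.0 - decay * abs(position))


def step(cells, rng):
    """Advance the population by one time step on a 1D lattice."""
    occupied = {c.position for c in cells}
    newborn = []
    for cell in cells:
        # Each agent first updates its internal network from its environment.
        cell.update_network(oxygen_at(cell.position))
        if not cell.cycling:
            continue
        # A cycling cell divides into a free neighbouring site, if there is one.
        free = [p for p in (cell.position - 1, cell.position + 1)
                if p not in occupied]
        if free:
            target = rng.choice(free)
            occupied.add(target)
            newborn.append(Cell(position=target))
    return cells + newborn


def simulate(n_steps=30, seed=0):
    rng = random.Random(seed)
    cells = [Cell(position=0)]
    for _ in range(n_steps):
        cells = step(cells, rng)
    return cells
```

With these assumed parameters the population grows outwards from the vessel until the outermost cells become hypoxic and stop dividing. A realistic model would replace the two Boolean rules with a full gene network and add drug effects, competition between clones and cell death.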

[Figure: microC framework]

The components of microC are shown in the figure above. We are currently using triple negative breast cancer (TNBC) as our first test case because of its generally poor prognosis, the lack of a clear treatment pathway, and the major role played by the tumour microenvironment. We are developing a TNBC model and evaluating its ability to predict combinations of druggable targets that help us tackle therapeutic resistance.

To train such models, cutting-edge omics and phenotypic data are needed. We are integrating data from multi-omic screens in cell lines and cancer cohorts with data from CRISPR screens, produced in our own and collaborating labs.


  • Transcriptomics to reconstruct cancer gene networks

We have been developing and applying gene network analysis approaches to infer cell signalling and transcription factor networks from transcriptomic data in clinical samples from hundreds to thousands of individuals.

This offers a tool to infer gene function ‘in context’, which complements information obtained from in-vitro (‘out of context’) perturbation experiments.

To reduce the number of false associations, we have been exploiting previously acquired knowledge to inform the network derivation. A particularly successful technique has been to ‘seed’ networks by starting from well-validated genes. Further genes are then recruited based on the association of their expression with the expression of the initial seeds, and the network is expanded in this way. Resampling techniques are used to ensure robustness.
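The seeding idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the published method: association is measured here as the Pearson correlation with the mean seed expression, robustness is assessed by bootstrap resampling of samples, and the function `expand_seeds` with its parameters `r_min`, `n_boot` and `keep_frac` is hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def expand_seeds(expr, seeds, r_min=0.6, n_boot=200, keep_frac=0.8, rng=None):
    """Recruit genes whose expression follows the seeds in most resamples.

    expr maps gene name -> list of expression values, one per sample,
    with the same sample order for every gene.
    """
    rng = rng or random.Random(0)
    n = len(next(iter(expr.values())))
    candidates = [g for g in expr if g not in seeds]
    hits = {g: 0 for g in candidates}
    for _ in range(n_boot):
        # Bootstrap: draw samples with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        seed_mean = [sum(expr[s][i] for s in seeds) / len(seeds) for i in idx]
        for g in candidates:
            if pearson([expr[g][i] for i in idx], seed_mean) >= r_min:
                hits[g] += 1
    # Keep only genes recruited in at least keep_frac of the resamples.
    return sorted(g for g in candidates if hits[g] / n_boot >= keep_frac)
```

For example, with `expr = {"S1": ..., "S2": ..., "A": ..., "B": ...}` and `seeds = ["S1", "S2"]`, a gene that tracks the seeds across samples is recruited, while a gene that correlates with them only in occasional resamples is not.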

Derivation of an angiogenesis signature and discovery of ELTD1, a major player in both development and cancer angiogenesis. Masiero M. et al, Cancer Cell 24:229-41 (2013). Featured in: Highlights, American Association for Cancer Research, Cancer Res 73; 5299 (2013).

 We have also previously demonstrated that robust signatures can be extracted from these networks by selecting hub genes. These can be used to estimate the network activity in human clinical samples, and have provided biomarkers that have been validated in prospective studies.
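One simple way to picture this step is sketched below. It assumes that hubs are the genes with the most connections in the network and that network activity in a sample is summarised by the median expression of the hub genes. Both choices, and the function names `hub_genes` and `signature_score`, are illustrative assumptions rather than a description of the published signatures.

```python
from collections import Counter
from statistics import median


def hub_genes(edges, top_n=3):
    """Return the top_n genes by degree; ties are broken alphabetically."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    ranked = sorted(degree, key=lambda g: (-degree[g], g))
    return ranked[:top_n]


def signature_score(expr, genes):
    """Per-sample score: median expression of the signature genes.

    expr maps gene name -> list of expression values, one per sample.
    """
    n = len(next(iter(expr.values())))
    return [median(expr[g][i] for g in genes) for i in range(n)]
```

A median is used in this sketch because it is less sensitive than a mean to a single outlying gene. Other summaries could be substituted without changing the overall structure.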

Derivation of a hypoxia-related network (seeds in yellow, recruited genes in blue) conserved across multiple cancer types. Buffa FM et al, Br J Cancer 102:428-35 (2010) & Winter SC-Buffa FM [joint] et al, Cancer Res 67:3441-9 (2007). The hypoxia signatures extracted from these networks have been used in several clinical studies, including currently in the S:CORT collaborative network (


  • Machine learning to integrate multiple omics and develop robust biomarkers

Multiple factors and interactions underlie cancer progression and response to treatment. We have longstanding expertise in exploiting machine learning techniques to integrate the multiple omic layers measured by us and our collaborators in human cancer samples, and to derive generalizable and robust signatures of specific biological phenotypes and clinical endpoints.


Integration of mutation, amplification, transcriptional, methylation and miRNA sequencing data in 15 cancer types, 7738 cancer/normal clinical samples, to build a pan-cancer network of the association between miRNA (grey nodes) and cancer hallmarks (colour-coded nodes). Dhawan et al, Nature Comm, 9, 5228 (2018).


A validated miRNA prognostic signature in breast cancer estrogen receptor (ER) positive and negative cohorts. The signature was developed using penalized linear regression with nested leave-one-out and cross-validation. The miRNAs identified have subsequently been characterised in a number of follow-up studies. Cancer Res. 71:5635-45 (2011). Featured in: Key Paper Evaluation. Expert Rev Anticancer Ther 2012 Mar;12(3):323-30.


Integrated analysis of mutation, amplification and gene expression data in 10 cancer types, 6538 tumour/normal samples, revealed common amplification and over-expression, but infrequent mutation, of metabolic genes. Top candidate drivers of metabolic disruption are shown across cancer types (see manuscript for abbreviations); the shade of blue indicates an increasing fraction of samples with amplification and over-expression. Haider S et al, Genome Biol. 17(1):140 (2016).


  • Using gene signatures to infer the biological phenotype of clinical samples

To help bridge the gap between cancer cell line models and cancer patients, we have been developing ways of exploiting gene signatures to generate and validate biological hypotheses in human clinical samples. A gene signature consists of a set of genes whose collective expression is associated with a known phenotype or cancer hallmark. For example, we have used hypoxia, metabolism and angiogenesis signatures to study the temporal changes and treatment response of individual cancer samples to drugs targeting metabolism and angiogenesis (e.g. recently Lord SR et al, Cell Metabolism 28:679-688, 2018).

Whilst this task is challenging, the increasing availability of gene signatures is a resource that needs to be evaluated and, if suitable, exploited for this purpose. To facilitate this, we are developing protocols for evaluating the applicability of previously developed gene signatures to newly acquired datasets.


SigQC: a procedural approach for standardising the evaluation of gene signatures. The main elements of the protocol include indicators of gene expression, variability and data structure, and evaluation of different metrics to represent the signature information content (Dhawan A et al. Nature Protocols 2019).
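To give a flavour of what such checks can look like, the sketch below computes three simple indicators for a signature in a new dataset: how many signature genes are measured, how variable each one is, and whether two different scoring metrics (mean and median) agree. It does not use or reproduce the sigQC package; the function `signature_checks` and the choice of indicators are assumptions made for illustration.

```python
import math
from statistics import mean, median, pstdev


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def signature_checks(expr, signature):
    """Basic applicability indicators for a gene signature in a dataset.

    expr maps gene name -> list of expression values, one per sample.
    """
    present = [g for g in signature if g in expr]
    if not present:
        raise ValueError("no signature genes are measured in this dataset")
    n = len(next(iter(expr.values())))
    mean_score = [mean(expr[g][i] for g in present) for i in range(n)]
    median_score = [median(expr[g][i] for g in present) for i in range(n)]
    return {
        # Coverage: fraction of signature genes measured in the dataset.
        "fraction_present": len(present) / len(signature),
        # Variability: per-gene standard deviation across samples.
        "gene_sd": {g: pstdev(expr[g]) for g in present},
        # Metric agreement: do mean- and median-based scores rank samples alike?
        "mean_median_r": pearson(mean_score, median_score),
    }
```

Low coverage, near-constant genes or poor agreement between scoring metrics would each suggest that the signature may not transfer well to the new dataset.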

We are generously funded by Cancer Research UK and the European Research Council.