David Sims, Kate Ridout


Learning Outcomes

By the end of this module the student will be able to:

  1. Appreciate the processing and analysis steps that raw sequencing data go through in order to establish the presence or absence of genetic variants.
  2. Understand the limitations of the analysis, and the potential for results to change as technology, algorithms and databases improve.
  3. Interrogate major public data sources (e.g. of genomic annotations, protein sequences, known variants and sequence conservation) and integrate them with clinical data to assess the pathogenic and clinical significance of genetic variants.
  4. Acquire relevant basic computational skills and understanding of computational methods for handling and analysing sequencing data for application in both diagnostic and research settings.
  5. Gain practical experience of processing sequencing data using bioinformatics pipelines through the Genomics England programme.
  6. Critically evaluate bioinformatics tools and public data sources.

Indicative Content

The main topic in this module is the computational analysis of DNA sequencing data (including whole genome sequencing, whole exome sequencing and targeted sequencing) to identify different types of genetic variants.

  • Introduction to command-line Linux
  • Processing short-read DNA sequencing data including raw sequence quality control, aligning reads to a reference genome and post-alignment quality control
  • Computational tools for identifying single nucleotide variants and short insertions and deletions (indels), e.g. GATK
  • Annotation of variants using established databases and in silico tools for pathogenicity evaluation, and familiarity with relevant analysis programs
  • Variant quality control and appreciation of the factors contributing to variant reliability
  • Filtering strategies to prioritize potentially causative variants, incorporating both clinical data and publicly available control data sets
  • Calling large genomic alterations (structural and copy number variants) and interpreting them in a cancer context
  • Overview of tools and workflows to analyse RNA-seq gene expression data
  • Overview of future strategies and technologies in cancer diagnostics