Enabling whole genome sequencing analysis from FFPE specimens in clinical oncology.
Domenico D., Gundem G., Levine MF., Arango-Ossa JE., Robbe P., Asimomitis G., Cobbs C., Stockfisch E., O'Donohue T., Brierley C., Senz J., Cochrane D., Mohibullah N., Bhanot U., Silber J., Shukla N., Shah SP., Weigelt B., Zivanovic O., McPherson A., Schuh A., Kung AL., Papaemmanuil E.
The adoption of whole genome sequencing (WGS) in clinical oncology is challenged by low data quality and increased artifacts in standard-of-care formalin-fixed paraffin-embedded (FFPE) samples. Analysis of 56 matched fresh frozen (FF) and FFPE pairs demonstrates that FFPE processing results in a median 20-fold enrichment in artifactual calls across mutation classes and impairs detection of clinically relevant biomarkers such as homologous recombination deficiency (HRD). We demonstrate that consensus calling reduces artifactual structural variant (SV) calls by 98% but, relative to FF data, is not sufficient to mitigate artifactual single nucleotide variant (SNV) and indel calls. We develop FFPErase, a machine learning framework that filters SNV/indel artifacts and delivers clinical-grade variant reporting, allowing accurate quantification of clinically relevant biomarkers. Compared with clinical reporting by FDA-approved panel tests, FFPErase WGS calls achieve 99% sensitivity and enable reporting of 24% more clinically relevant findings.

