A genome-wide scan of 10 000 gene-centric variants and colorectal cancer risk.
Webb E., Broderick P., Lubbe S., Chandler I., Tomlinson I., Houlston RS.
Genome scans based on gene-centric single nucleotide polymorphisms (SNPs) have been proposed as an efficient approach to identifying disease-causing variants, complementary to scans based on tagging SNPs. Adopting this approach to identify low-penetrance susceptibility alleles for colorectal cancer (CRC), we analysed genotype data from 9109 gene-centric SNPs, 7014 of which were non-synonymous (nsSNPs), in 2873 cases and 2871 controls using Illumina iSelect arrays. Overall, the distribution of associations was not significantly different from the null. No SNP achieved globally significant association after correction for multiple testing (lowest P value 1.7 × 10⁻⁴, rs727299). We then analysed the dataset incorporating information on the functional consequences of nsSNPs. We used results from the in silico algorithm PolyPhen as prior information to weight the association statistics, with weights estimated from the observed test statistics within predefined groups of SNPs. Incorporating this information did not, however, yield any further evidence of a specific association (lowest P value 2.2 × 10⁻⁴, rs1133950). There was a strong relationship between effect size and SNPs predicted to be damaging (P = 1.63 × 10⁻⁵); however, these variants, which are the most likely to impact on risk, are rare (minor allele frequency, MAF < 5%). Hence, although the rationale for searching for low-penetrance cancer susceptibility alleles by conducting genome-wide scans of coding changes is strong, in practice it is likely that natural selection has rendered such alleles too rare to be detected by association studies of the size employed.
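To make the multiple-testing arithmetic concrete, the following is a minimal sketch, not the authors' analysis. It assumes a Bonferroni correction at a family-wise alpha of 0.05 (the abstract does not name the correction used) and illustrates one common form of p-value weighting, in which weights are rescaled to average 1 and each P value is divided by its weight. The function names, the toy P values and the raw weights are all hypothetical; only the SNP count and the two reported minimum P values come from the abstract.

```python
# Illustrative sketch only: Bonferroni threshold and a simple weighted
# p-value scheme. ALPHA, the toy p-values and the raw weights are
# assumptions made for this example, not values from the study.

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def normalise_weights(raw_weights):
    """Rescale weights so that they average 1 across all tests."""
    mean = sum(raw_weights) / len(raw_weights)
    return [w / mean for w in raw_weights]


def weighted_p_values(p_values, weights):
    """Divide each p-value by its weight, capping the result at 1."""
    return [min(1.0, p / w) for p, w in zip(p_values, weights)]


N_SNPS = 9109   # gene-centric SNPs analysed (from the abstract)
ALPHA = 0.05    # assumed family-wise error rate

threshold = bonferroni_threshold(ALPHA, N_SNPS)

# Reported minimum P values (from the abstract).
lowest_unweighted = 1.7e-4   # rs727299
lowest_weighted = 2.2e-4     # rs1133950

passes_unweighted = lowest_unweighted < threshold
passes_weighted = lowest_weighted < threshold

# Toy illustration of weighting: a hypothetical SNP in a "damaging"
# group receives a larger weight than SNPs in a "benign" group.
toy_p = [1.7e-4, 0.03, 0.5, 0.8]
toy_raw_weights = [2.0, 1.0, 0.5, 0.5]
toy_weights = normalise_weights(toy_raw_weights)
toy_weighted = weighted_p_values(toy_p, toy_weights)

print(f"Bonferroni threshold: {threshold:.2e}")
print(f"Unweighted minimum passes: {passes_unweighted}")
print(f"Weighted minimum passes: {passes_weighted}")
print(f"Toy weighted p-values: {toy_weighted}")
```

Under these assumptions the per-test threshold is 0.05 / 9109, roughly 5.5 × 10⁻⁶, so both reported minimum P values fall well short of it, consistent with the statement that no SNP reached global significance.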