A fast and noise-resilient approach to detect rare-variant associations with deep sequencing data for complex disorders.

Title: A fast and noise-resilient approach to detect rare-variant associations with deep sequencing data for complex disorders.
Publication Type: Journal Article
Year of Publication: 2012
Authors: Cheung, YHim, Wang, G, Leal, SM, Wang, S
Journal: Genet Epidemiol
Volume: 36
Issue: 7
Pagination: 675-85
Date Published: 2012 Nov
ISSN: 1098-2272
Keywords: Angiopoietin-like 4 Protein; Angiopoietin-like Proteins; Angiopoietins; Case-Control Studies; Data Interpretation, Statistical; Gene Frequency; Genetic Association Studies; Genetic Predisposition to Disease; Genetic Variation; High-Throughput Nucleotide Sequencing; Humans; Metabolism; Sequence Analysis, DNA; Texas; Triglycerides
Abstract

Next-generation sequencing technology has enabled a paradigm shift in genetic association studies from the common disease/common variant hypothesis to the common disease/rare variant hypothesis. Analyzing individual rare variants is known to be underpowered; therefore, association methods have been developed that aggregate variants across a genetic region, which for exome sequencing is usually a gene. The foreseeable widespread use of whole-genome sequencing poses new challenges in statistical analysis. It calls for new rare-variant association methods that are statistically powerful, robust against high levels of noise due to the inclusion of noncausal variants, and yet computationally efficient. We propose a simple and powerful statistic that combines the disease-associated P-values of individual variants using a weight that is the inverse of the expected standard deviation of the allele frequencies under the null. This approach, dubbed the Sigma-P method, is extremely robust to the inclusion of a high proportion of noncausal variants and is also powerful when both detrimental and protective variants are present within a genetic region. The performance of the Sigma-P method was tested using simulated data based on realistic population demographic and disease models, and its power was compared to that of several previously published methods. The results demonstrate that this method generally outperforms other rare-variant association methods over a wide range of models. Additionally, sequence data on the ANGPTL family of genes from the Dallas Heart Study were tested for associations with nine metabolic traits, and both known and novel putative associations were uncovered using the Sigma-P method.
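
The abstract describes the statistic only in words, so the following Python sketch is an illustration of the general idea and not the authors' published formula. It assumes (hypothetically) that each variant contributes the negative log of its P-value, and that the "expected standard deviation of the allele frequencies under the null" is the binomial standard deviation of a frequency estimated from 2n chromosomes. The names `allele_freq_sd` and `weighted_pvalue_score` are invented for this example.

```python
import math


def allele_freq_sd(maf, n_individuals):
    """Binomial SD of an allele frequency estimated from 2n chromosomes.

    Assumption for illustration: the abstract does not state the exact
    form of the null standard deviation.
    """
    return math.sqrt(maf * (1.0 - maf) / (2.0 * n_individuals))


def weighted_pvalue_score(p_values, mafs, n_individuals):
    """Combine per-variant P-values with inverse-SD weights.

    Assumption for illustration: each variant contributes -log(p),
    scaled by 1 / SD. The published statistic may differ.
    """
    if len(p_values) != len(mafs):
        raise ValueError("p_values and mafs must have the same length")
    score = 0.0
    for p, maf in zip(p_values, mafs):
        if not 0.0 < p <= 1.0:
            raise ValueError("each P-value must lie in (0, 1]")
        if not 0.0 < maf < 1.0:
            raise ValueError("each allele frequency must lie in (0, 1)")
        weight = 1.0 / allele_freq_sd(maf, n_individuals)
        score += weight * -math.log(p)
    return score


# Hypothetical inputs: three variants in one region, 1000 individuals.
region_score = weighted_pvalue_score(
    p_values=[0.04, 0.60, 0.01],
    mafs=[0.002, 0.010, 0.005],
    n_individuals=1000,
)
```

Under these assumptions, rarer variants receive larger weights because their frequency estimates have smaller standard deviations. A score of this kind has no standard reference distribution, so its significance would typically have to be assessed empirically, for example by permuting case-control labels.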

DOI: 10.1002/gepi.21662
Alternate Journal: Genet. Epidemiol.
PubMed ID: 22865616
PubMed Central ID: PMC6240912
Grant List: 1RC2HL102926 / HL / NHLBI NIH HHS / United States
1RC4MD005964 / MD / NIMHD NIH HHS / United States
RC2 HL102926 / HL / NHLBI NIH HHS / United States
RC4 MD005964 / MD / NIMHD NIH HHS / United States
RR19895 / RR / NCRR NIH HHS / United States
S10 RR019895 / RR / NCRR NIH HHS / United States
UM1 HG006493 / HG / NHGRI NIH HHS / United States