Title | Variant association tools for quality control and analysis of large-scale sequence and genotyping array data. |
Publication Type | Journal Article |
Year of Publication | 2014 |
Authors | Wang, GT, Peng, B, Leal, SM |
Journal | Am J Hum Genet |
Volume | 94 |
Issue | 5 |
Pagination | 770-83 |
Date Published | 2014 May 01 |
ISSN | 1537-6605 |
Keywords | Genetic Association Studies, Genetic Variation, Genotyping Techniques, High-Throughput Nucleotide Sequencing, Humans, Multifactorial Inheritance, Quality Control, Software |
Abstract | Currently there is great interest in detecting associations between complex traits and rare variants. In this report, we describe Variant Association Tools (VAT) and the VAT pipeline, which implements best practices for rare-variant association studies. Highlights of VAT include variant-site and call-level quality control (QC), summary statistics, phenotype- and genotype-based sample selection, variant annotation, selection of variants for association analysis, and a collection of rare-variant association methods for analyzing qualitative and quantitative traits. The association testing framework for VAT is regression based, which readily allows for flexible construction of association models with multiple covariates and weighting themes based on allele frequencies or predicted functionality. Additionally, pathway analyses, conditional analyses, and analyses of gene-gene and gene-environment interactions can be performed. VAT is capable of rapidly scanning through data by using multi-process computation, adaptive permutation, and simultaneously conducting association analysis via multiple methods. Results are available in text or graphic file formats and additionally can be output to relational databases for further annotation and filtering. An interface to R language also facilitates user implementation of novel association methods. The VAT's data QC and association-analysis pipeline can be applied to sequence, imputed, and genotyping array, e.g., "exome chip," data, providing a reliable and reproducible computational environment in which to analyze small- to large-scale studies with data from the latest genotyping and sequencing technologies. Application of the VAT pipeline is demonstrated through analysis of data from the 1000 Genomes project. |
DOI | 10.1016/j.ajhg.2014.04.004 |
Alternate Journal | Am. J. Hum. Genet. |
PubMed ID | 24791902 |
PubMed Central ID | PMC4067555 |
Grant List | U54 HG006493 / HG / NHGRI NIH HHS / United States; UC2 HL102926 / HL / NHLBI NIH HHS / United States; P30 CA016672 / CA / NCI NIH HHS / United States; RC2 HL102926 / HL / NHLBI NIH HHS / United States; RC4 MD005964 / MD / NIMHD NIH HHS / United States; R01 HG005859 / HG / NHGRI NIH HHS / United States; 1R01HG005859 / HG / NHGRI NIH HHS / United States; HG006493 / HG / NHGRI NIH HHS / United States; HL102926 / HL / NHLBI NIH HHS / United States; UM1 HG006493 / HG / NHGRI NIH HHS / United States; MD005964 / MD / NIMHD NIH HHS / United States |