Landscape of multi-nucleotide variants in 125,748 human exomes and 15,708 genomes.

Title: Landscape of multi-nucleotide variants in 125,748 human exomes and 15,708 genomes.
Publication Type: Journal Article
Year of Publication: 2020
Authors: Wang, Q, Pierce-Hoffman, E, Cummings, BB, Alföldi, J, Francioli, LC, Gauthier, LD, Hill, AJ, O'Donnell-Luria, AH, Karczewski, KJ, MacArthur, DG
Corporate Authors: Genome Aggregation Database Production Team, Genome Aggregation Database Consortium
Journal: Nat Commun
Volume: 11
Issue: 1
Pagination: 2539
Date Published: 2020 05 27
ISSN: 2041-1723
Keywords: CpG Islands; Databases, Genetic; DNA Mutational Analysis; Exome; Genetic Variation; Genome, Human; Humans; Mutation
Abstract

Multi-nucleotide variants (MNVs), defined as two or more nearby variants existing on the same haplotype in an individual, are a clinically and biologically important class of genetic variation. However, existing tools typically do not accurately classify MNVs, and understanding of their mutational origins remains limited. Here, we systematically survey MNVs in 125,748 whole exomes and 15,708 whole genomes from the Genome Aggregation Database (gnomAD). We identify 1,792,248 MNVs across the genome with constituent variants falling within 2 bp distance of one another, including 18,756 variants with a novel combined effect on protein sequence. Finally, we estimate the relative impact of known mutational mechanisms - CpG deamination, replication error by polymerase zeta, and polymerase slippage at repeat junctions - on the generation of MNVs. Our results demonstrate the value of haplotype-aware variant annotation, and refine our understanding of genome-wide mutational mechanisms of MNVs.
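
To illustrate why haplotype-aware annotation matters, the following is a minimal sketch, not the authors' pipeline, of how two single-nucleotide variants in the same codon can have a combined protein effect that differs from either variant annotated alone. The function names (`apply_snvs`, `consequence`, `annotate`) and the example codon are hypothetical and chosen only for illustration; the sketch assumes the standard genetic code and that both variants lie on the same haplotype.

```python
from itertools import product

# Standard genetic code, enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def apply_snvs(codon, snvs):
    """Return the codon after applying (offset, alt_base) substitutions."""
    bases = list(codon)
    for offset, alt in snvs:
        bases[offset] = alt
    return "".join(bases)


def consequence(ref_codon, alt_codon):
    """Classify the protein-level effect of changing ref_codon to alt_codon."""
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if alt_aa == ref_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    if ref_aa == "*":
        return "stop_lost"
    return "missense"


def annotate(ref_codon, snvs):
    """Annotate each SNV on its own, then all SNVs together as one MNV."""
    individual = [
        consequence(ref_codon, apply_snvs(ref_codon, [snv])) for snv in snvs
    ]
    combined = consequence(ref_codon, apply_snvs(ref_codon, snvs))
    return {"individual": individual, "combined": combined}


if __name__ == "__main__":
    # Hypothetical example: reference codon CAA (Gln) carrying two SNVs
    # on the same haplotype, C>T at offset 0 and A>C at offset 1.
    print(annotate("CAA", [(0, "T"), (1, "C")]))
```

In this hypothetical case, the first variant alone would be annotated as a stop-gain (CAA to TAA), while the two variants together produce a serine codon (TCA), so a per-variant annotation would overstate the impact.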

DOI: 10.1038/s41467-019-12438-5
Alternate Journal: Nat Commun
PubMed ID: 32461613
PubMed Central ID: PMC7253413
Grant List: UM1 HG008900 / HG / NHGRI NIH HHS / United States
K12 HD052896 / HD / NICHD NIH HHS / United States
CS/14/2/30841 / BH / British Heart Foundation / United Kingdom
MC_UP_1102/20 / MR / Medical Research Council / United Kingdom
U54 DK105566 / DK / NIDDK NIH HHS / United States
R01 GM104371 / GM / NIGMS NIH HHS / United States
F32 GM115208 / GM / NIGMS NIH HHS / United States