Objectives: To assess the frequency and distribution of de novo single nucleotide variants (SNVs) in ASD-affected individuals and their unaffected siblings; to determine whether de novo SNVs carry risk for ASD; and to identify specific disease-associated de novo SNVs.
Methods: Whole-exome sequencing was performed on 872 individuals in 224 families selected from the Simons Simplex Collection (SSC). The cohort comprised 200 quartet families (father, mother, proband with ASD and unaffected sibling) and 24 trio families (father, mother and proband). De novo variants were predicted from the sequencing data and confirmed by PCR and Sanger sequencing.
Results: We found that de novo non-synonymous SNVs are significantly more common in probands than in unaffected siblings (p=0.01; OR=1.88; 95% CI: 1.08-3.28). The difference is more pronounced when only non-synonymous mutations in brain-expressed genes are considered (p=0.006; OR=2.15; CI: 1.10-4.20). In probands, we estimate that at least 19% of all de novo SNVs, 41% of non-synonymous de novo SNVs in brain-expressed genes and 77% of nonsense/splice site mutations in brain-expressed genes carry risk for ASD. Based on the de novo mutation rate observed in unaffected siblings, we demonstrate that observing multiple independent de novo non-synonymous SNVs in the same brain-expressed gene among unrelated probands can reliably differentiate risk alleles from neutral substitutions. In the current study, among a total of 279 identified de novo coding mutations, there is only a single instance in probands, and none in siblings, in which two independent nonsense substitutions disrupt the same gene: SCN2A (Sodium Channel, Voltage-Gated, Type II, Alpha Subunit). This result is unlikely to occur by chance (p=0.01).
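As a rough illustration of how statistics of this form can be computed, the sketch below derives an odds ratio with a Wald-type 95% confidence interval from a 2x2 table, and an excess fraction of proband mutations relative to the sibling rate. The counts and rates are hypothetical placeholders, not data from this study, and the abstract does not state which test or interval method was used, so the Wald interval and the excess-fraction formula are assumptions made only for illustration.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: probands with the event, b: probands without it,
    c: siblings with the event, d: siblings without it.
    z=1.96 corresponds to an approximate 95% interval.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Wald approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def excess_fraction(proband_rate, sibling_rate):
    """Fraction of proband mutations in excess of the sibling (background) rate.

    Assumed formula for illustration: (proband - sibling) / proband.
    """
    return (proband_rate - sibling_rate) / proband_rate


# Hypothetical counts, NOT taken from the study.
HYPOTHETICAL_TABLE = (60, 140, 40, 160)

if __name__ == "__main__":
    or_, lo, hi = odds_ratio_wald_ci(*HYPOTHETICAL_TABLE)
    print(f"OR={or_:.2f}; 95% CI: {lo:.2f}-{hi:.2f}")
    # Hypothetical per-child mutation rates, NOT taken from the study.
    print(f"excess fraction={excess_fraction(0.5, 0.4):.2f}")
```

With small cell counts, an exact method such as Fisher's exact test may be preferable to the Wald approximation.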
Conclusions: In simplex families, de novo SNVs carry risk for ASD. This risk is most readily apparent for non-synonymous variants and in brain-expressed genes. Specific mutations can be associated with ASD when multiple independent observations occur in the same gene in different samples; this approach offers a clear route to identifying multiple ASD risk-associated genes in larger cohorts.