There has been enormous progress in our understanding of the genomic architecture of autism spectrum disorders (ASD), which we now know to include rare variation of major effect, either inherited or arising de novo in the patient, as well as common variation of very weak effect. Given this architecture, joint analysis of thousands of samples is an efficient approach to gene discovery. Moreover, analyzing such large samples with a diverse set of approaches (for example, focusing on de novo or recessive variation) is an important means of rapidly identifying ASD genes.
The Autism Sequencing Consortium (ASC), which includes over 20 groups working on whole exome and whole genome sequencing in ASD, is designed to analyze prospectively shared data for ASD gene discovery.
The ASC developed a Memorandum of Understanding (MOU) for prospective data sharing that protects the contributing sites. In addition, the ASC established Working Groups on (1) Data Management and Processing (DMAP), (2) Statistical Analysis (SA), and (3) Sequencing, along with standing committees on (4) Samples and Phenotypes and (5) Production and Deliverables. DMAP developed the ASC Bioinformatic Hub, where all data resides and is analyzed, to ensure that all analytical approaches are defined. We collect lists of individuals and sequence data from ASC data collection centers. Centers contribute raw sequence data (FASTQ) or aligned read files (BAM). For each sample we provide a FASTQ and a BAM file; for each dataset we provide a PED file and a list of called SNPs and indels in a variant file.
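The per-sample and per-dataset deliverables described above lend themselves to a simple consistency check. The following is a minimal sketch, not the Hub's actual tooling: it assumes the PED file uses the common six-column layout (family ID, individual ID, father ID, mother ID, sex, phenotype, with "0" for an unknown parent) and that files are named `<individual_id>.fastq.gz` and `<individual_id>.bam`. Both the naming scheme and the function names are hypothetical. It lists the children for whom both parents are present in the dataset (the trios needed for de novo analysis) and reports samples lacking a FASTQ or BAM file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PedRecord:
    family_id: str
    individual_id: str
    father_id: str   # "0" means unknown (assumed PED convention)
    mother_id: str   # "0" means unknown (assumed PED convention)
    sex: str
    phenotype: str


def parse_ped(text):
    """Parse whitespace-delimited PED text into records (first six columns)."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split()
        if len(fields) < 6:
            raise ValueError(f"expected at least 6 columns, got: {line!r}")
        records.append(PedRecord(*fields[:6]))
    return records


def complete_trios(records):
    """Return IDs of individuals whose father and mother are both in the dataset."""
    present = {(r.family_id, r.individual_id) for r in records}
    return [
        r.individual_id
        for r in records
        if r.father_id != "0"
        and r.mother_id != "0"
        and (r.family_id, r.father_id) in present
        and (r.family_id, r.mother_id) in present
    ]


def missing_files(records, available):
    """Map each individual to the expected files absent from `available`.

    The file naming scheme here is a hypothetical convention for illustration.
    """
    available = set(available)
    missing = {}
    for r in records:
        needed = {f"{r.individual_id}.fastq.gz", f"{r.individual_id}.bam"}
        absent = sorted(needed - available)
        if absent:
            missing[r.individual_id] = absent
    return missing
```

Given a PED file with one full trio and one child whose mother is unknown, `complete_trios` would return only the first child, and `missing_files` would flag every individual without both expected files.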
The ASC MOU was signed by all members. As of October 2012, 2600 exomes are on the Hub, with another 1200 exomes currently being uploaded. The data consumes 81 terabytes of storage. Based on the exome sequencing studies currently under way, we expect to have over 20,000 exomes available for analysis within 3 years. A Variant Calling Subgroup produces optimal calls of the exomes for single nucleotide variation, indels, and, ultimately, copy number variation (CNV). The SA working group is developing analytical approaches to the whole exome data.
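To give a rough sense of scale, the figures above can be extrapolated. This is a back-of-the-envelope sketch under two assumptions that the abstract does not state: that the 81 terabytes corresponds to the 2600 exomes already on the Hub, and that storage grows linearly with the number of exomes. The function name is illustrative only.

```python
def project_storage_tb(current_tb, current_exomes, target_exomes):
    """Linearly extrapolate storage (in terabytes) to a target exome count."""
    per_exome_tb = current_tb / current_exomes
    return per_exome_tb * target_exomes


# Figures from the abstract; the linear-scaling assumption is ours.
per_exome_gb = 81 / 2600 * 1000                 # about 31 GB per exome (decimal units)
projected_tb = project_storage_tb(81, 2600, 20_000)  # about 623 TB for 20,000 exomes
```

Under these assumptions, 20,000 exomes would require on the order of 600 terabytes. The real figure would depend on sequencing depth, file formats, and compression.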
Prospective data sharing is a means to rapidly identify ASD genes and accelerate research in the pathogenesis and treatment of ASD.