Friday, May 8, 2009
Northwest Hall (Chicago Hilton)
12:00 PM
Background:
Autism is a common neurodevelopmental disorder characterized by deficits in language and reciprocal social interaction and by patterns of rigid, compulsive behaviors. Twin and family studies indicate that autism has a predominantly genetic etiology, and the presence of broader autism phenotype features in first-degree relatives further supports heritability within families. While much progress has been made recently in identifying rare variants associated with autism, identifying and validating common allele effects has been more difficult, presumably because effect sizes are small and the disorder is genetically heterogeneous. Recent advances include the development of dense genome-wide SNP datasets for analysis of association and copy number variation; one such dataset is derived from the AGRE family collection genotyped on the Illumina 550k SNP platform.
Objectives:
The objectives of this study were, first, to conduct a detailed analysis of ancestry in the AGRE families and, second, to use a staged design to assess allelic association in the AGRE 550k Illumina genotype data.
Methods:
To characterize ancestry in the AGRE families, we selected a subset of 5,000 SNPs from the Illumina 550k panel and inferred the ancestry of founders (parents) using STRUCTURE, with samples from the eleven HapMap 3 populations as positive controls. For the association analysis, the AGRE families were randomly split into training and test sets of equal size and stratified by ancestry using the STRUCTURE results. Association was assessed with the transmission disequilibrium test (TDT) in training set Caucasians, and all SNPs with p<0.001 were examined in the test set for replication.
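The TDT statistic used in this kind of analysis can be sketched as follows. This is a minimal illustration of the standard test (a McNemar-type chi-square on transmissions from heterozygous parents), not the software used in the study; the function name `tdt` and the example counts are hypothetical and are not taken from the AGRE data.

```python
import math


def tdt(transmitted: int, untransmitted: int) -> tuple[float, float]:
    """Transmission disequilibrium test for a single biallelic SNP.

    transmitted:   times heterozygous parents passed the allele of
                   interest to an affected child
    untransmitted: times heterozygous parents passed the other allele
    Returns (chi-square statistic with 1 df, p-value).
    """
    n = transmitted + untransmitted
    if n == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (transmitted - untransmitted) ** 2 / n
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts for illustration only (not study data).
chi2, p_value = tdt(transmitted=30, untransmitted=70)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2e}")
# Stage 1 filter described above: carry the SNP forward if p < 0.001.
print("carried to test set:", p_value < 0.001)
```

Under the null hypothesis of no association, heterozygous parents transmit either allele with equal probability, so a large imbalance between the two counts yields a small p-value.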
Results:
Ancestry was determined for 1,228 out of 1,233 parental samples, including 95 individuals whose self-reported ancestry was unknown. Using TDT to test for association in families of European ancestry, we found 863 SNPs with p<0.001 in the training set. Of these, 54 were also significant in test set Caucasians at p<0.05. Limiting the findings to those SNPs for which the direction of association was the same in both sets left 22 SNPs. One of these SNPs, rs13112011, located at 4p15.2, remained statistically significant after a conservative Bonferroni correction (p = 3.94E-5; OR 0.37, CI: 0.25-0.53). The SNP was also associated in families of Latino ancestry, but not in the African American or Asian families in the combined dataset. Statistical association among Caucasians was confirmed using FBAT, and significance was attained in both a broad “Spectrum” ASD sample and narrowly defined “strict” autism families. Furthermore, several other SNPs flanking this SNP, both upstream and downstream, were also statistically significant. Haplotype analysis showed that rs13112011 specifically tags one haplotype that is significantly associated. This marker is located in an intergenic region between two spliced transcripts that have not been annotated.
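As a rough arithmetic check on the Bonferroni statement, the sketch below compares the reported p-value with Bonferroni thresholds for two candidate test counts. The abstract does not state which number of tests the correction was based on, so both counts (the 863 SNPs carried into the test set and the 22 SNPs remaining after filtering) and the family-wise alpha of 0.05 are assumptions made for illustration.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


reported_p = 3.94e-5  # p-value reported for rs13112011

# Assumed candidate test counts; the abstract does not say which was used.
for n_tests in (863, 22):
    threshold = bonferroni_threshold(0.05, n_tests)
    print(f"{n_tests} tests: threshold = {threshold:.2e}, "
          f"significant = {reported_p < threshold}")
```

Under either assumed count the reported p-value falls below the corrected threshold.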
Conclusions:
Using a two-stage design, we have analyzed the publicly available AGRE 550k data and identified markers that are associated with autism in both an initial subsample and a follow-up replication subsample. While multiple markers replicated in the test dataset, we point to one region in particular, at 4p15.2, that may warrant further study.