Genomics Tool Allowing Data Aggregation Across Projects and Repositories In Autism Spectrum Disorder Research

Thursday, May 12, 2011
Elizabeth Ballroom E-F and Lirenta Foyer Level 2 (Manchester Grand Hyatt)
3:00 PM
S. I. Novikova (1), D. Hall (2), L. Tatarov (3), M. McAuliffe (4) and M. F. Huerta (5); (1) National Institute of Mental Health, Rockville, MD, (2) National Institute of Mental Health (NIMH), Rockville, MD, United States, (3) NIH CIT, NIH Center for Information Technology, Bethesda, MD, (4) CIT, NIH Center for Information Technology, Bethesda, MD, (5) The Office of Technology Development and Coordination, National Institute of Mental Health, Bethesda, MD
Background: The National Database for Autism Research (NDAR) was designed to help the autism spectrum disorder (ASD) research community accelerate discoveries by facilitating scientific collaboration, communication and sharing of detailed research data. By combining datasets from multiple existing data resources, NDAR enables researchers to search and aggregate data from multiple projects across multiple data repositories. NDAR supports clinical, phenotypic, imaging and genomics data. The first model of genomics data acquisition was based on the Minimum Information About a Microarray Experiment (MIAME) format. Using this format, the NDAR team piloted submission of raw genomics data from several investigators in July 2010. The pilot showed that the genomics data received were inconsistently defined across projects and took investigators too much time to annotate for submission to NDAR.

Objectives: Based on its experience receiving genomics data, coupled with the need to provide access to genomics data from multiple repositories, the NDAR team implemented a predefined set of parameters intended to guarantee the consistency of raw experimental data while simplifying data definition for submission and aggregation across federated repositories such as the Autism Genetic Resource Exchange, the NIH database of Genotypes and Phenotypes (dbGaP), the NIMH Genetics Repository, and the Simons Foundation Autism Research Initiative, among others.

Methods: After thorough analysis of functional genomics data acquisition and storage standards, including MIAME, MAGE and MINSEQE, and a review of the needs of the ASD research community, the NDAR team developed an interactive tool that defines the relationship between samples and data files clearly and as simply as possible.

Results: The NDAR Genomics Tool standardizes the naming of data processing and analysis protocols, requires investigators to enter sufficient detail, and enforces unambiguous interpretation of the entered information. Scheduled for launch in December 2010, in time for the January NDAR submission cycle, the tool will be used by ASD investigators to define their genomics data, allowing data aggregation through NDAR across projects and data repositories.

Conclusions: At the IMFAR 2011 conference, the NDAR team will present conclusions drawn from using the NDAR Genomics Tool for submission into NDAR. We will also update IMFAR attendees on progress in refining the tool and in using it to establish data federation with the Autism Genetic Resource Exchange and dbGaP.

