International Meeting for Autism Research (May 7 - 9, 2009): Implementation of Ontology Driven Data Integration in the National Database for Autism Research

Implementation of Ontology Driven Data Integration in the National Database for Autism Research

Friday, May 8, 2009
Northwest Hall (Chicago Hilton)
3:30 PM
L. Young, National Database for Autism Research, BIMAS/CBEL/DCB/CIT, National Institutes of Health, Bethesda, MD
S. W. Tu, Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA
L. Tennakoon, Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA
J. McNiece, National Database for Autism Research, BIRSS/ISL/DCB/CIT, National Institutes of Health, Bethesda, MD
D. Vismer, National Database for Autism Research, BIRSS/ISL/DCB/CIT, National Institutes of Health, Bethesda, MD
M. E. Martone, Department of Neurosciences, University of California, San Diego, San Diego, CA
A. K. Das, Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA
M. J. McAuliffe, National Database for Autism Research, BIRSS/ISL/DCB/CIT, National Institutes of Health, Bethesda, MD
Background:   An autism ontology should document all terms relevant to the disorder; all relationships between these terms and with terms in other fields; and various assumptions and constraints used in the definition of autism endophenotypes.  A data integration system could then assign unique identifiers to the ontological terms such that the same set of unique identifiers could also be assigned to terms and data in multiple data sources.
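As a minimal sketch of the identifier idea described above, the following Python fragment re-keys records from two sources by shared term identifiers. The identifiers (`AUT:0001`, `AUT:0002`), source names, and field names are hypothetical and chosen only for illustration; they are not taken from the autism ontology or from NDAR.

```python
# Hypothetical ontology terms, keyed by a made-up unique identifier.
ONTOLOGY_TERMS = {
    "AUT:0001": "age at first words",
    "AUT:0002": "nonverbal IQ",
}

# Each (hypothetical) data source maps its local field names to the shared identifiers.
SOURCE_MAPPINGS = {
    "lab_a": {"first_word_age_mo": "AUT:0001", "nviq": "AUT:0002"},
    "lab_b": {"AgeFirstWords": "AUT:0001", "NonverbalIQ": "AUT:0002"},
}


def annotate(source, record):
    """Re-key a source record by ontology term identifier, dropping unmapped fields."""
    mapping = SOURCE_MAPPINGS[source]
    return {mapping[field]: value for field, value in record.items() if field in mapping}


a = annotate("lab_a", {"first_word_age_mo": 30, "nviq": 85})
b = annotate("lab_b", {"AgeFirstWords": 18, "NonverbalIQ": 102, "site": "B"})
```

Once both records carry the same keys, they can be compared or pooled without knowing how each lab named its columns.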
Objectives:   Searches for autism information can be enhanced using both ontological relationships and reasoning.  Such searches would be linked to research data of many types (such as clinical assessments, genomics, and imaging) from many labs.  The representation of endophenotypes could also be standardized in the ontology.  This would enable endophenotype-driven queries and data integration from an endophenotype catalog.
Methods:   The implementation uses the University of California San Diego (UCSD) Biomedical Informatics Research Network (BIRN) system to integrate data from multiple sources.  This system includes a data integration environment comprising an ontology for semantic integration; a mapping of the ontology to data sources; a means to expose data sources to its grid; and middleware to manage federated queries and data extraction.
A draft of an autism ontology has been composed by a group at Stanford whose approach is to use ontologies and data models for querying and reasoning about phenotype.  Semantic Web standards and technologies are used to encode the ontology.
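The federated query pattern described in the Methods can be illustrated with a small sketch. This is not the BIRN middleware or its API; the `Source` class, `federated_query` function, term identifier, and all data are hypothetical stand-ins showing how a query phrased against an ontology term might be translated into each source's local field name and the results merged.

```python
class Source:
    """A hypothetical data source with its own field names and an ontology mapping."""

    def __init__(self, name, term_to_field, rows):
        self.name = name
        self.term_to_field = term_to_field  # ontology term id -> local field name
        self.rows = rows

    def query(self, term_id, predicate):
        """Return rows whose value for the mapped field satisfies the predicate."""
        field = self.term_to_field.get(term_id)
        if field is None:
            return []  # this source holds no data for the requested term
        return [
            {"source": self.name, "guid": row["guid"], "term": term_id, "value": row[field]}
            for row in self.rows
            if field in row and predicate(row[field])
        ]


def federated_query(sources, term_id, predicate):
    """Send the same ontology-level query to every source and merge the results."""
    results = []
    for source in sources:
        results.extend(source.query(term_id, predicate))
    return results


sources = [
    Source("lab_a", {"AUT:0002": "nviq"},
           [{"guid": "GUID-001", "nviq": 85}, {"guid": "GUID-002", "nviq": 110}]),
    Source("lab_b", {"AUT:0002": "NonverbalIQ"},
           [{"guid": "GUID-003", "NonverbalIQ": 72}]),
]
hits = federated_query(sources, "AUT:0002", lambda value: value < 90)
```

The caller only names the ontology term; the per-source mapping hides the differences in local schemas.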
Results:   A proof of concept is presented here.  The United States National Institutes of Health (NIH) National Database for Autism Research (NDAR) has adopted the BIRN data integration environment and linked it to autism research data.  An NDAR user can log into the system and navigate to a list of endophenotypes.  Clicking on an endophenotype sends a query to the system to return data for all individuals satisfying the rules defining the endophenotype.  The use of globally unique identifiers for the individuals allows the user to also discover additional data such as genomics and imaging.
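The workflow in the Results can be sketched as a rule evaluated over assessment data, followed by a lookup of other data types by the same identifier. Everything below is assumed for illustration: the GUID values, field names, file names, and the 24-month threshold are hypothetical and are neither NDAR data nor a clinical definition of any endophenotype.

```python
# Hypothetical clinical assessment data keyed by a global unique identifier (GUID).
ASSESSMENTS = {
    "GUID-001": {"age_first_words_months": 30, "nonverbal_iq": 95},
    "GUID-002": {"age_first_words_months": 14, "nonverbal_iq": 102},
    "GUID-003": {"age_first_words_months": 36, "nonverbal_iq": 70},
}

# Hypothetical genomics and imaging holdings keyed by the same GUIDs.
GENOMICS = {"GUID-001": ["sample_17.vcf"], "GUID-003": ["sample_42.vcf"]}
IMAGING = {"GUID-001": ["mri_t1_001.nii"]}

# An endophenotype is represented here as a named rule over an assessment record.
# The threshold is illustrative only.
ENDOPHENOTYPES = {
    "delayed_first_words": lambda a: a["age_first_words_months"] > 24,
}


def individuals_with(endophenotype):
    """Return the GUIDs of all individuals whose assessments satisfy the rule."""
    rule = ENDOPHENOTYPES[endophenotype]
    return sorted(guid for guid, record in ASSESSMENTS.items() if rule(record))


def linked_data(guid):
    """Use the shared GUID to find other data types for the same individual."""
    return {"genomics": GENOMICS.get(guid, []), "imaging": IMAGING.get(guid, [])}


matches = individuals_with("delayed_first_words")
extra = {guid: linked_data(guid) for guid in matches}
```

Because every data type is keyed by the same identifier, the step from an endophenotype match to genomics or imaging data is a plain lookup rather than a record-linkage problem.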
Conclusions:   Efforts such as these will lead to an understanding of the processes necessary to increase the size of study populations by combining data from multiple institutions.  This will lead to larger data sets and an increased likelihood of finding correlations between endophenotype and genetic variants or between endophenotype and variations in medical images.  The hope is that strong correlations may be found and subsequently used in the clinic to diagnose susceptibility to autism disorders.  This approach would be a faster, less expensive step in detecting autism susceptibility, leading to earlier intervention.

This research was supported by the Intramural Research Program of the NIH, Center for Information Technology (LY, MJM) and NIMH, NINDS, NICHD, NIEHS, and NIDCD (JM, DV); NLM 1P41LM007885 (SWT, LT, AKD); and NINDS R01NS058296, NCRR RR04050, and RR08605 (MEM).
