International Meeting for Autism Research: SFARI Base: An Adaptable Informatics Infrastructure for the Simons Simplex Collection

Thursday, May 12, 2011
Elizabeth Ballroom E-F and Lirenta Foyer Level 2 (Manchester Grand Hyatt)
1:00 PM
S. B. Johnson (1), L. Rozenblit (2) and D. Voccola (3); (1) Biomedical Informatics, Columbia University, New York, NY; (2) Research Informatics Services, Prometheus Research, LLC, New Haven, CT; (3) Prometheus Research, LLC, New Haven, CT
Background: The goals of the Simons Simplex Collection (SSC) are to acquire the largest sample to date of simplex families with idiopathic autism in a highly compressed timeframe, to maintain the highest data quality standards, and to disseminate data and biospecimens efficiently to the research community. Long-term requirements include support for additional studies and integration with partner systems such as the National Database for Autism Research (NDAR) and the Interactive Autism Network (IAN). The Simons Foundation partnered with a software vendor (Prometheus Research, LLC) to develop a distributed, Web-based information system called SFARI Base.

Objectives: The purpose of SFARI Base is to support management of scientific data and materials generated by studies advancing the Simons Foundation Autism Research Initiative (SFARI). The information system must (1) acquire data via large-scale, multi-site, multi-modal clinical studies, and integrate these with results from laboratory studies of biomaterials; (2) curate data via data quality processes, controlled compilation of data releases, and inventory control for biomaterials; and (3) disseminate study data and materials via data exploration and advanced querying. Additional goals include the ability to add new studies, types of data, and functions at diminishing marginal cost and adapt protocols for ongoing studies at costs proportional to the amount of change. These requirements demand a flexible, extensible infrastructure.

Methods: SFARI Base employs a distributed architecture in which clinical sites use local software (SFARI Outpost) to manage studies, define protocols, screen families, enroll participants and enter data. SFARI Outpost de-identifies the data and transmits them to a central repository. Data-quality consultants at the University of Michigan use validation tools to review submitted families and help sites identify and repair problems. Researchers access curated data and biospecimens through a Web interface (base.sfari.org). The request process involves assigning privileges to laboratory staff, identifying a set of families of interest, describing a research project, requesting data or specimens and providing institutional approvals. Additional functions enable investigators to (1) submit new data generated from analysis of the collection, establishing a growing pool of knowledge, (2) re-contact families to initiate new studies, and (3) link data on participants with other autism collections such as NDAR using global unique identifiers (GUIDs).
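
The de-identification step described above can be illustrated with a minimal sketch. It is not the SFARI Outpost implementation, and it is not the procedure NDAR uses to issue GUIDs. The field names, the `site_secret` key and the helper names `pseudonym` and `deidentify` are all hypothetical. The sketch assumes that a site removes direct identifiers before transmission and replaces its local ID with a keyed hash, so that repeated submissions for the same participant map to the same key without exposing the local ID.

```python
import hashlib
import hmac

# Hypothetical field names; the abstract does not list the actual fields.
DIRECT_IDENTIFIERS = {"name", "date_of_birth", "address"}


def pseudonym(site_secret: bytes, local_id: str) -> str:
    """Derive a stable keyed hash of a site-local ID (assumed scheme)."""
    digest = hmac.new(site_secret, local_id.encode("utf-8"), hashlib.sha256)
    # Truncated to 16 hex characters for readability in this sketch.
    return digest.hexdigest()[:16]


def deidentify(record: dict, site_secret: bytes) -> dict:
    """Drop direct identifiers and swap the local ID for a pseudonym."""
    cleaned = {
        key: value
        for key, value in record.items()
        if key not in DIRECT_IDENTIFIERS and key != "local_id"
    }
    cleaned["participant_key"] = pseudonym(site_secret, record["local_id"])
    return cleaned
```

In this sketch, the same `local_id` and `site_secret` always yield the same `participant_key`, which is what would let a central repository group records for one participant without seeing the local ID.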

Results: By July 2011, the SSC will have accrued over 2,500 families. At present, more than 65 different research groups have made over 250 requests for SSC data or materials, and nearly 120,000 DNA samples have been shipped. Results from whole-genome scans performed using both Illumina and NimbleGen chips are available for nearly 1,000 families, with results from additional families and analysis types expected in 2011.

Conclusions: The SSC imposed ambitious requirements of large volume, high quality and a compressed timeframe, requiring the development of innovative technologies and procedures. SFARI made a substantial investment in infrastructure to deliver a system that is adaptable in the face of rapid change, and established a governance structure to respond to evolving needs. SFARI Base provides support for new studies, new data types, and new functions at relatively low cost and on a rapid timeline.
