20489
Additional Testing Shows High Performance of Machine Learning Classifiers and Supports Potential for Rapid, Mobile Autism Risk Detection

Friday, May 15, 2015: 11:30 AM-1:30 PM
Imperial Ballroom (Grand America Hotel)
D. Wall, Stanford University, Palo Alto, CA
Background:

In 2012, we published two pilot studies that used machine learning methods to construct, and test the performance of, classifiers based on a small number of behaviors for autism risk detection. The outcomes showed promise; however, the studies were limited by their reliance on archival data, relatively small sample sizes, an imbalance in the number of cases versus controls, and insufficient testing in less severe forms of autism. We have recently completed two studies aimed at addressing these limitations.

Objectives:

The objective was to test the accuracy of our original machine learning classifiers for detection of autism risk in a larger collection of archived samples and in prospectively generated clinical data.  

Methods:

Archival data were gathered from the Simons Simplex Collection, Simons VIP, AGRE, and NDAR. Prospective clinical data were gathered at Boston Children’s Hospital during a clinician-run, IRB-approved study. We ran the observation-based classifier (OBC), constructed from our original analysis of ADOS data, on the archival samples and compared the output to the clinical outcome. The caregiver-directed classifier (CDC), constructed from our original analysis of ADI-R data, was adapted for mobile use on iPads and made available to families in advance of the clinical team evaluation. The outcome from this mobile CDC (MCDC) was later compared to the best estimate clinical diagnosis.

Results:

We assembled an independent collection of ADOS data from 2333 spectrum and 283 nonspectrum individuals. We tested OBC outcomes against the outcomes provided by the original and current ADOS algorithms, the best estimate clinical diagnosis, and the Comparison Score severity metric associated with ADOS-2. The OBC output was strongly correlated with the ADOS (r = -0.8143) and ADOS-2 (r = -0.7793) scores and exhibited >97% sensitivity and >77% specificity in comparison to both ADOS algorithm scores. The correspondence to the best estimate clinical diagnosis was also high (accuracy = 97%), with sensitivity of 97% and specificity of 83%. In the prospective study, we recruited 222 participants from Boston Children’s Hospital; 69 (31%) were given a clinical diagnosis of ASD. The sensitivity of the mobile CDC (MCDC) in detecting ASD was 90% and the specificity was 80%.
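As a rough illustration of how the sensitivity and specificity reported for the prospective sample relate to overall accuracy and predictive values, the sketch below applies the standard identities to the rounded figures given above (90% sensitivity, 80% specificity, 69 ASD diagnoses among 222 participants). The function name `implied_metrics` is hypothetical, and the derived values are implied by the rounded percentages rather than reported by the study.

```python
def implied_metrics(sensitivity, specificity, prevalence):
    """Derive accuracy, PPV and NPV from sensitivity, specificity and prevalence.

    All inputs are proportions in [0, 1]. This is a hypothetical helper for
    illustration only; it is not part of the classifiers described here.
    """
    # Expected fraction of the whole sample in each confusion-matrix cell.
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)

    return {
        "accuracy": true_pos + true_neg,
        "ppv": true_pos / (true_pos + false_pos),  # positive predictive value
        "npv": true_neg / (true_neg + false_neg),  # negative predictive value
    }


# Rounded values taken from the prospective results; prevalence is 69 of 222.
prevalence = 69 / 222
metrics = implied_metrics(sensitivity=0.90, specificity=0.80, prevalence=prevalence)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because accuracy weights sensitivity and specificity by the share of cases and controls, the same classifier can show different accuracy in samples with different case-to-control ratios, which is one reason the imbalance in the pilot data mattered.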

Conclusions:

With a larger sample size, more controls, and a higher number of lower-severity cases of autism, the performance of our classifiers dropped, as expected, below the initial levels seen in the pilot studies. However, performance against best estimate clinical outcomes remains high and supports the potential for machine learning tools to streamline the initial detection and triage of autism risk, easing the bottleneck so that more children can receive critical care earlier in development.