22516
Linguistic Markers of Autism Spectrum Disorder: Classification Sensitivity and Specificity of Language Produced during Clinical Evaluations
Objectives: Using a machine-learning approach, determine whether features of natural language produced during the Autism Diagnostic Observation Schedule (ADOS) distinguish children with ASD from typically developing controls (TDC).
Methods: Thirty-two participants aged 6-14 years (18 ASD, 14 TDC; groups matched on sex ratio, age, and IQ) were administered the ADOS during a clinical evaluation. Twenty-minute segments were transcribed, and parent reports of social responsiveness were collected (SRS; Constantino et al., 2003).
Results: First, we applied naïve Bayes classification to word choice (words produced by children during the evaluation) to assess the sensitivity and specificity of diagnostic classification based on this variable. Weighted log-odds calculations with leave-one-out cross-validation correctly classified 14/18 children with ASD and 100% of TDC participants. Receiver Operating Characteristic (ROC) analyses showed high sensitivity and specificity using this classification metric, with area under the curve=92%, CIs 82%-100%, p<.001 (Figure 1a). In addition, participants with ASD spoke significantly more slowly (reduced speaking rate is associated with less co-articulation; Figure 1b), used significantly fewer words per turn, and had significantly longer inter-turn gaps than TDC participants. To assess continuous relationships between linguistic variables and clinical phenotype, we computed Pearson correlations: speaking rate, rate of conversational turn-taking, and overall speaking time correlated with symptom severity as measured by the SRS (rs ranged from -.34 to -.44, all ps<.01) but not with IQ or age. Although preliminary and based on a relatively small sample, these findings are consistent with the literature and suggest real effects.
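As an illustration of the classification step, the following is a minimal sketch of a word-choice naïve Bayes classifier scored by summed log-odds and evaluated with leave-one-out cross-validation. It is not the authors' implementation: the weighting scheme used in the study is not described here, so the sketch assumes plain Laplace smoothing and equal class priors, and the transcripts, token names, and function names are all hypothetical placeholders.

```python
import math
from collections import Counter


def log_odds_score(tokens, asd_counts, tdc_counts, vocab, alpha=1.0):
    """Sum over tokens of log P(w|ASD) - log P(w|TDC), with Laplace smoothing.

    Assumes equal class priors; a positive score favors ASD.
    """
    asd_total = sum(asd_counts.values()) + alpha * len(vocab)
    tdc_total = sum(tdc_counts.values()) + alpha * len(vocab)
    score = 0.0
    for w in tokens:
        if w not in vocab:  # skip words never seen in the training folds
            continue
        p_asd = (asd_counts[w] + alpha) / asd_total
        p_tdc = (tdc_counts[w] + alpha) / tdc_total
        score += math.log(p_asd) - math.log(p_tdc)
    return score


def leave_one_out(transcripts, labels):
    """Classify each transcript using word counts from all other transcripts."""
    preds = []
    for i, held_out in enumerate(transcripts):
        asd_counts, tdc_counts = Counter(), Counter()
        for j, (tokens, label) in enumerate(zip(transcripts, labels)):
            if j == i:
                continue
            (asd_counts if label == "ASD" else tdc_counts).update(tokens)
        vocab = set(asd_counts) | set(tdc_counts)
        score = log_odds_score(held_out, asd_counts, tdc_counts, vocab)
        preds.append("ASD" if score > 0 else "TDC")
    return preds


def sensitivity_specificity(labels, preds):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(labels, preds))
    tp = sum(1 for y, p in pairs if y == "ASD" and p == "ASD")
    fn = sum(1 for y, p in pairs if y == "ASD" and p == "TDC")
    tn = sum(1 for y, p in pairs if y == "TDC" and p == "TDC")
    fp = sum(1 for y, p in pairs if y == "TDC" and p == "ASD")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy transcripts (placeholder tokens, not study data).
transcripts = [
    ["a1", "a1", "a2"], ["a1", "a3", "a2"], ["a3", "a2", "a1"],
    ["b1", "b2", "b3"], ["b2", "b1", "b3"], ["b3", "b1", "b2"],
]
labels = ["ASD", "ASD", "ASD", "TDC", "TDC", "TDC"]

preds = leave_one_out(transcripts, labels)
sens, spec = sensitivity_specificity(labels, preds)

# The counts reported in the abstract correspond to these rates:
reported_sensitivity = 14 / 18  # 14 of 18 ASD participants correctly classified
reported_specificity = 14 / 14  # all 14 TDC participants correctly classified
```

The continuous log-odds score, rather than the thresholded label, is what a ROC analysis would sweep over to obtain an area under the curve.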
Conclusions: Computational linguistics represents a promising new way to parse heterogeneity and aid in diagnostic classification of ASD. New data from more heterogeneous populations are currently being analyzed, with the goal of increasing sample size and classification power.