Objectives: Our goal was to determine whether the test results, independent of the actual diagnosis, could predict one of the four diagnoses, how accurate these predictions would be, and how well the model could distinguish between groups.
Methods: We used Matlab software to create two models. For the naïve Bayesian classifier, we trained the model on 90% of the data (standard test scores and diagnosis) to predict the diagnosis for the remaining 10%. Using 10-fold cross-validation, we compared our predictions with the true diagnoses for all four groups combined as well as between ASD groups. For the multinomial logistic regression model, we transformed the standard test scores into z-scores and used 10 training sets, each containing 75% of the data, to predict a diagnosis for the remaining 25%. We then forced the model to decide on a diagnosis, tabulated the decisions in a confusion matrix, normalized the data, and averaged over all the means.
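The cross-validation procedure for the naïve Bayesian classifier can be sketched as follows. This is a minimal illustration in Python rather than the Matlab code used in the study, and it rests on several assumptions not stated above: a Gaussian likelihood for each score, random unstratified folds, and at least as many cases as folds. The helper `make_synthetic_scores` and its group labels are hypothetical and generate made-up data, not study data.

```python
import math
import random
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate a log prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A small floor on the variance avoids division by zero.
        variances = [
            max(sum((v - m) ** 2 for v in col) / n, 1e-9)
            for col, m in zip(zip(*rows), means)
        ]
        model[label] = (math.log(n / len(X)), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one case."""
    def log_posterior(label):
        log_prior, means, variances = model[label]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
            for x, m, var in zip(row, means, variances)
        )
    return max(model, key=log_posterior)


def cross_validated_accuracy(X, y, k=10, seed=0):
    """Mean held-out accuracy over k folds; each fold holds out about 1/k."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    accuracies = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        model = fit_gaussian_nb([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k


def make_synthetic_scores(n_per_group=50, seed=1):
    """Hypothetical two-score data for two made-up groups (not study data)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, (mu_a, mu_b) in {"group_a": (90, 110), "group_b": (110, 90)}.items():
        for _ in range(n_per_group):
            X.append([rng.gauss(mu_a, 5), rng.gauss(mu_b, 5)])
            y.append(label)
    return X, y
```

Calling `cross_validated_accuracy(*make_synthetic_scores())` returns the mean accuracy across the 10 held-out folds. With `k=10`, each model is trained on roughly 90% of the cases and evaluated on the remaining 10%, which mirrors the split described above.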
Results: The naïve Bayesian classifier showed 48% accuracy for a prediction of diagnosis when data from all four groups were combined, 82% accuracy for a binary prediction of autism or dyslexia, 56% accuracy for a binary prediction of autism or Asperger’s, and 56% accuracy for a binary prediction of autism and PDD-NOS (collapsed) or Asperger’s. The multinomial logistic regression model averaged about 45% correct predictions across all four diagnostic groups.
Conclusions: The naïve Bayesian classifier performed well above chance in predicting a diagnosis from reading and comprehension scores when data from all four diagnostic groups were combined and when data from the two “extreme” groups on the reading vs. comprehension spectrum, autism and dyslexia, were compared. The model, however, hovered around chance when predicting among the ASD subgroups. The more refined multinomial logistic regression model performed well above chance in predicting a diagnosis of autism, Asperger’s, PDD-NOS, or dyslexia, with a more even spread of predictive power across all four groups. Our results indicate that trends in academic performance, in our case reading and comprehension, are measurable and predictive of an ASD diagnosis. Future studies with similar models predicting a diagnosis based on test performance could be especially valuable when examining core ASD deficits, i.e., social cognition, communication, and language skills. Ultimately, results from these models could aid in establishing a clear definition of the phenotype and in systematically examining the distinction between ASD subgroups.
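The comparisons with chance can be made explicit with a small calculation. The sketch below assumes equally sized diagnostic groups, so that uniform guessing among k classes is correct 1/k of the time. Group sizes are not reported here, and with unequal groups the relevant baseline (always guessing the largest group) would be higher. The function names are illustrative only.

```python
def chance_level(n_classes):
    """Accuracy of uniform random guessing, assuming balanced classes."""
    return 1.0 / n_classes


# (comparison, reported accuracy, number of classes), taken from the Results.
REPORTED = [
    ("four-way, naive Bayes", 0.48, 4),
    ("autism vs. dyslexia", 0.82, 2),
    ("autism vs. Asperger's", 0.56, 2),
    ("autism + PDD-NOS vs. Asperger's", 0.56, 2),
    ("four-way, multinomial logistic regression (approx.)", 0.45, 4),
]


def margins_over_chance(results):
    """Reported accuracy minus the balanced-class chance level."""
    return {name: round(acc - chance_level(k), 2) for name, acc, k in results}


if __name__ == "__main__":
    for name, margin in margins_over_chance(REPORTED).items():
        print(f"{name}: {margin:+.2f} over chance")
```

Under the balanced-class assumption, the four-way predictions exceed the 25% chance level by roughly 20 to 23 percentage points, whereas the comparisons among ASD subgroups exceed the 50% chance level by only 6 points.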