25952
A Stratified Analysis of Subtypes in Autism Spectrum Disorders with Unsupervised Machine Learning

Friday, May 12, 2017: 12:00 PM-1:40 PM
Golden Gate Ballroom (Marriott Marquis Hotel)
E. Stevens1, D. Dixon2 and E. Linstead3, (1)Computer Science, Chapman University, Orange, CA, (2)Center for Autism and Related Disorders, Woodland Hills, CA, (3)Chapman University, Lakewood, CA
Background: Increasing rates of diagnosis of Autism Spectrum Disorder (ASD), combined with high variance in the abilities and disabilities of those diagnosed, necessitate empirically based customization of treatment programs. Concurrently, the maturation of ubiquitous computing now allows for the large-scale and rapid collection of longitudinal clinical data, which can be leveraged to train machine learning algorithms capable of detecting subtle patterns in large data repositories. Previously (https://imfar.confex.com/imfar/2016/webprogram/Paper22515.html) we demonstrated the application of unsupervised clustering models to extract behavioral subtypes of ASD based on 8 domains: cognition, executive functioning, language, motor, social, play, adaptive, and academic. Those results showed that distinct skill profiles exist on the spectrum and could serve as a basis for tuning therapy curricula. Here we expand this work to track the stability of cluster profiles across age groups and genders, and to model patient learning in each cluster as a function of treatment intensity.

Objectives: The purpose of the present study was to determine whether behavioral subtypes of ASD identified through machine learning remain stable across age and gender. An additional goal was to capture the relationship between learning outcomes and treatment intensity for each cluster in the age-stratified model.

Methods: Data from the SKILLS database were used to generate a 3500-dimensional vector model of patients with confirmed ASD diagnoses, with each vector element corresponding to the presence or absence of an individual behavior. These data were aggregated across the 8 domains enumerated above, resulting in an 8-dimensional feature space for approximately 2000 patients. This feature space was stratified by age and gender, and each stratum was clustered using Expectation Maximization with mixtures of Gaussians. The Bayesian Information Criterion was used to determine the optimal number of clusters. The centroids of each cluster were then used to visualize cluster profiles for each age and gender group, and to test the significance of similarities and differences in clustering behavior across groups, including responsiveness to treatment intensity.
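The pipeline described in Methods (aggregate binary behavior items into domain scores, stratify, fit Gaussian mixtures by Expectation Maximization, select the number of clusters by BIC) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the data are synthetic, the item count is reduced from 3500 to 80 for speed, the aggregation rule (fraction of items present per domain), the diagonal covariance structure, the candidate range of 1 to 6 clusters, and the helper names `aggregate_by_domain`, `fit_gmm_by_bic`, and `cluster_strata` are all hypothetical choices. It uses scikit-learn's `GaussianMixture`.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

DOMAINS = ["cognition", "executive functioning", "language", "motor",
           "social", "play", "adaptive", "academic"]

def aggregate_by_domain(behaviors, domain_of_item):
    """Collapse binary item vectors (patients x items) into per-domain
    fractions of items present (patients x domains). The fraction rule
    is an assumption; the abstract does not state how items were aggregated."""
    n_domains = int(domain_of_item.max()) + 1
    scores = np.zeros((behaviors.shape[0], n_domains))
    for d in range(n_domains):
        scores[:, d] = behaviors[:, domain_of_item == d].mean(axis=1)
    return scores

def fit_gmm_by_bic(features, max_k=6, seed=0):
    """Fit mixtures with 1..max_k components via EM and keep the one
    with the lowest Bayesian Information Criterion."""
    best_model, best_bic = None, np.inf
    for k in range(1, max_k + 1):
        model = GaussianMixture(n_components=k, covariance_type="diag",
                                random_state=seed).fit(features)
        bic = model.bic(features)
        if bic < best_bic:
            best_model, best_bic = model, bic
    return best_model

def cluster_strata(features, strata, max_k=6):
    """Fit one BIC-selected mixture per stratum (e.g. age group)."""
    return {str(s): fit_gmm_by_bic(features[strata == s], max_k)
            for s in np.unique(strata)}

# Synthetic stand-in for the clinical data (all values are made up).
rng = np.random.default_rng(0)
n_patients, n_items = 400, 80
domain_of_item = np.arange(n_items) % len(DOMAINS)   # 10 items per domain
skill = rng.choice([0.3, 0.7], size=n_patients)      # two latent profiles
behaviors = rng.random((n_patients, n_items)) < skill[:, None]
age_group = rng.choice(["younger", "older"], size=n_patients)

features = aggregate_by_domain(behaviors, domain_of_item)
models = cluster_strata(features, age_group)

for group, model in models.items():
    # model.means_ holds the cluster centroids used to draw skill profiles.
    print(group, "clusters:", model.n_components)
```

Each row of `model.means_` is a cluster centroid in the 8-dimensional domain space, so comparing these rows across strata is one way to inspect whether a profile persists from one age or gender group to another.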

Results: This study further confirmed the existence of subtypes on the autism spectrum and showed that these subtypes can be identified using unsupervised machine learning techniques. In particular, some cluster profiles retain the same general characteristics across age groups and genders, while others appear to change as a function of age. Similar observations were made for responses to treatment intensity. In practice, this provides a heuristic for customizing treatment regimens for individuals, as well as for evolving those regimens in an informed fashion as the individual gets older.

Conclusions: Findings indicate the existence of distinct behavioral subtypes within a large sample of children with ASD, and show how those subtypes vary as a function of age and gender. As noted in the past, the significance of these subtypes may span beyond behavioral treatment. Further exploration of other distinctions between these groups (e.g., genetic, medical, environmental) is warranted and currently underway.