Individuals with Autism Spectrum Conditions (ASC) show nonverbal communication deficits across modalities, including poor vocal emotional expressivity. This may manifest as odd or inappropriate intonation, rhythm, volume, and other prosodic features, and may play a major role in the peer rejection and bullying that children with ASC often experience. In recent years, a growing body of research in computer science has focused on the modeling and generation of affective speech, particularly in Typically Developing (TD) children. Our team has designed such affective speech analyses as part of the ASC-Inclusion project, an internet-based platform that will assist children with ASC in improving their socio-emotional communication skills. If computerized technology could characterize the vocal features hampering the social communication of children with ASC, it could be harnessed for assessment and for tailor-made interventions in affective speech, thus reducing the risk of peer rejection.
Objectives:
To evaluate the ability of computerized analysis algorithms to characterize the vocal features distinguishing the affective speech of children with ASC from that of TD children, and to highlight the vocal features that correlate with likeability and naturality ratings of human judges.
Methods:
In order to prepare stimuli for the computerized analysis, 9 children with ASC and 10 TD children were recorded expressing 9 different emotions and mental states (happy, sad, angry, afraid, surprised, ashamed, proud, calm, and neutral) in speech, as part of the EC-FP7 funded ASC-Inclusion project. Overall, 552 stimuli were recorded, out of which 420 (210 from each group) were selected for analysis.
In order to examine group differences in the expression of affective speech and to identify the factors contributing to this distinction, a set of computerized analyses was conducted. The analyses included binary valence/arousal discrimination, discrimination of each emotion against neutral, and analysis of additional prosodic features relevant for discrimination.
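The abstract does not specify which acoustic features or classifier the project used. Purely as an illustration of the kind of binary arousal discrimination described above, the following sketch extracts two very simplified prosodic descriptors (RMS energy as an arousal proxy, zero-crossing rate as a crude pitch/voicing proxy) from synthetic signals and separates them with a nearest-centroid rule; all signals, feature choices, and the classifier here are assumptions, not the study's actual pipeline.

```python
import numpy as np

def prosodic_features(signal):
    """Two toy prosodic descriptors: RMS energy (loudness/arousal proxy)
    and zero-crossing rate (a crude pitch/voicing proxy)."""
    rms = np.sqrt(np.mean(signal ** 2))
    zcr = np.mean(np.abs(np.diff(np.sign(signal)))) / 2
    return np.array([rms, zcr])

def train_centroids(X, y):
    """Nearest-centroid binary discriminator (a stand-in for whatever
    classifier the study actually employed)."""
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def predict(centroids, x):
    """Assign x to the class whose centroid is nearest."""
    return min(centroids, key=lambda c: np.linalg.norm(x - centroids[c]))

rng = np.random.default_rng(0)
sr = 16000  # assumed sampling rate for the synthetic signals

def synth(amp, f):
    """Synthetic 'utterance': a sine tone plus low-level noise."""
    t = np.arange(sr) / sr
    return amp * np.sin(2 * np.pi * f * t) + 0.05 * rng.standard_normal(sr)

# Hypothetical data: 'high arousal' = louder, higher pitch;
# 'low arousal' = softer, lower pitch.
X = np.array([prosodic_features(synth(0.9, 220)) for _ in range(20)]
             + [prosodic_features(synth(0.3, 120)) for _ in range(20)])
y = np.array([1] * 20 + [0] * 20)

model = train_centroids(X, y)
acc = np.mean([predict(model, x) == c for x, c in zip(X, y)])
```

On such cleanly separated synthetic data the two feature dimensions are enough for near-perfect discrimination; real affective speech would require far richer prosodic feature sets.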
Human judgment of the stimuli was conducted by 15 TD adults and 15 TD children (aged 5-10). Judges were asked to rate the utterances on measures of likeability and naturality.
Group differences for the computerized model and for the human judgments, as well as correlations between the two, were calculated.
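The correlation analysis above can be illustrated with a Pearson coefficient computed directly from its definition. The numbers below are made up solely to show the shape of the calculation (model discriminant scores paired with human likeability ratings); they are not data from the study.

```python
import numpy as np

def pearson_r(a, b):
    """Pearson correlation from its definition: covariance of the
    mean-centered vectors divided by the product of their norms."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical values: higher discriminant scores paired with lower
# likeability ratings yield a negative correlation, as the study reports.
model_scores = [0.9, 0.7, 0.8, 0.2, 0.1, 0.3]
likeability = [2.0, 3.0, 2.5, 4.5, 5.0, 4.0]
r = pearson_r(model_scores, likeability)
```

With real data one would typically also report a p-value (e.g. via `scipy.stats.pearsonr`) rather than the raw coefficient alone.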
Results:
Computerized analysis distinguished between the groups in over 82% of the stimuli, highlighting arousal and valence as meaningful discriminating factors. The computerized function was negatively correlated with human judgments of likeability and naturality.
Conclusions:
Children with ASC vocally express emotions in a way that is perceived as odd and unnatural by their TD peers, which may play a major role in their social exclusion. The computerized analysis was found to be sensitive to the differences between the ASC and TD groups and may be used in the future for vocal analysis and for planning relevant interventions, in order to improve the social communication and peer acceptance of children with ASC. The computerized analysis used in this study will be integrated into ASC-Inclusion, an internet-based platform aimed at teaching children with ASC to express emotions and to recognize them.