International Meeting for Autism Research (May 7 - 9, 2009): Automated Acoustic Analysis of Affective and Pragmatic Prosody in ASD

Friday, May 8, 2009

Boulevard (Chicago Hilton)

E. T. Prud'hommeaux, Center for Spoken Language Understanding, Oregon Health & Science University, Beaverton, OR

J. P. H. van Santen, Center for Spoken Language Understanding, Oregon Health & Science University, Beaverton, OR

L. M. Black, Center for Spoken Language Understanding, Oregon Health & Science University, Beaverton, OR

Background: Autism spectrum disorders (ASD) are associated with deficits in affective and pragmatic prosody. An examiner's real-time evaluation of prosody for a particular affect or social situation is subject to influence from external factors. The examiner is aware of the subject's current mood and has likely noted how the subject spontaneously expressed affect previously. A clinically trained examiner might also entertain a hypothesis about the subject's diagnosis. Such biases could be moderated by including scores from an automated analysis of acoustic features that yields results similar to those produced in a “blind” assessment.

Objectives: The goals of this study are 1) to ascertain the reliability of real-time judgments of prosody expressing affect and pragmatic style; 2) to determine whether our complex automated measures of acoustic features can accurately identify different affects and styles; and 3) to explore the ability of these scores to distinguish typically developing (TD) subjects from subjects with ASD.

Methods: Responses for two tasks testing affective and pragmatic prosody were scored in three ways: by clinicians during examination, by six naïve listeners in a web-based, “blind” perceptual experiment, and with automated objective measures of acoustic features. The two tasks were:

(i) Pragmatic Style (use appropriate prosody talking to an adult or baby; adapted from Paul et al. 2005)
(ii) Affect (repeat a phrase with one of four affects)

During examination, clinicians immediately assessed the correctness of each response, yielding real-time scores.

In the perceptual experiment for Pragmatic Style, six judges listened to recordings of minimal pairs of responses and selected the infant-directed utterance. In the Affect experiment, judges listened to an utterance and selected the perceived affect from a list of four (angry, sad, scared, and happy), along with their confidence in their selection.
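As an illustration of how six judges' selections in the Affect experiment might be reduced to one consensus label per utterance, here is a minimal sketch. The abstract does not say how consensus scores were derived, so the plurality rule, the handling of ties, and the name `consensus_label` are all assumptions made for this example. Confidence ratings are ignored here.

```python
from collections import Counter

# The four affects offered to the judges in the Affect experiment.
AFFECTS = ("angry", "sad", "scared", "happy")


def consensus_label(votes):
    """Return the label chosen by the most judges, or None if there is no clear winner.

    Assumption: consensus is a simple plurality vote. The study may have
    used a different rule (for example, weighting by judge confidence).
    """
    ranked = Counter(votes).most_common()
    if not ranked:
        return None  # no votes at all
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie between the top two labels
    return ranked[0][0]


if __name__ == "__main__":
    # Hypothetical votes from six judges for one utterance.
    votes = ["happy", "happy", "sad", "happy", "scared", "happy"]
    print(consensus_label(votes))
```

Returning `None` on a tie keeps ambiguous utterances visible to later analysis steps instead of silently assigning them to one affect.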

In the automated analysis, quantitative features based on pitch, energy, and spectral balance were computed from recordings of the children's responses and combined using multiple linear regression to create a single complex score for each utterance.
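The combination step described above can be sketched as an ordinary least-squares fit. This is only an illustration: the abstract does not name the regression target or the exact features, so the three-feature layout (pitch, energy, spectral balance), the target values, and the helper names `fit_linear` and `complex_score` are hypothetical.

```python
def fit_linear(features, targets):
    """Fit targets ~ intercept + weights . features by least squares.

    Solves the normal equations (X^T X) b = X^T y with Gauss-Jordan
    elimination. Returns [intercept, w1, w2, ...].
    """
    rows = [[1.0] + [float(x) for x in f] for f in features]
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * y for r, y in zip(rows, targets))]
        for i in range(n)
    ]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("features are collinear; cannot fit")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


def complex_score(coeffs, feature_vector):
    """Combine one utterance's features into a single score."""
    return coeffs[0] + sum(w * x for w, x in zip(coeffs[1:], feature_vector))


if __name__ == "__main__":
    # Hypothetical per-utterance features: (pitch, energy, spectral balance).
    X = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]]
    # Hypothetical target, e.g. a listener-derived score per utterance.
    y = [1.0, 3.0, -2.0, 1.5, 0.5]
    coeffs = fit_linear(X, y)
    print(complex_score(coeffs, [1, 1, 1]))
```

In practice a numerical library would normally replace the hand-written solver. It is written out here only to keep the sketch self-contained.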

Results: Per-utterance and per-speaker analyses of both tasks revealed that the objective digital measures generally correlated with the consensus scores about as well as the individual judges correlated with one another and with the consensus scores. The correlations of the objective measures with the judges' scores were also consistently better than the correlations of the real-time scores with the judges' scores.
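A comparison of this kind can be illustrated with a correlation coefficient between two sets of per-utterance scores. The abstract does not state which correlation statistic was used, so Pearson's r is an assumption here, and the score values below are made up for the example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical per-utterance scores.
    consensus = [0.9, 0.2, 0.7, 0.4, 0.8]
    objective = [0.8, 0.3, 0.6, 0.5, 0.9]
    real_time = [1.0, 1.0, 0.0, 1.0, 1.0]
    print("objective vs consensus:", pearson(objective, consensus))
    print("real-time vs consensus:", pearson(real_time, consensus))
```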

In the Pragmatic Style task, both the consensus scores and the objective scores showed TD subjects significantly outperforming ASD subjects. In the real-time scores, this group difference was not significant.
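One way to check whether two groups' scores differ significantly is an exact permutation test on the difference in group means, sketched below. The abstract does not say which significance test was used, so this choice is an assumption, as are the function name `permutation_p_value` and the per-speaker scores. Exact enumeration is practical only for small groups.

```python
from itertools import combinations


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test on the difference of group means.

    Enumerates every way of splitting the pooled scores into groups of the
    original sizes and returns the fraction of splits whose absolute mean
    difference is at least as large as the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    total = sum(pooled)

    def mean_diff(sum_a):
        return sum_a / n_a - (total - sum_a) / (len(pooled) - n_a)

    observed = abs(mean_diff(sum(group_a)))
    extreme = 0
    count = 0
    for idx in combinations(range(len(pooled)), n_a):
        count += 1
        # Small tolerance guards against floating-point rounding.
        if abs(mean_diff(sum(pooled[i] for i in idx))) >= observed - 1e-12:
            extreme += 1
    return extreme / count


if __name__ == "__main__":
    # Hypothetical per-speaker scores for the two diagnostic groups.
    td_scores = [5, 6, 7]
    asd_scores = [1, 2, 3]
    print(permutation_p_value(td_scores, asd_scores))
```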

In the Affect task, consensus scores distinguished the TD group from the ASD group only for happiness and sadness. Both the objective and the real-time scores distinguished TD from ASD for happiness but not for sadness. Real-time scores showed a between-group difference for anger that was not confirmed by the consensus scores, possibly illustrating bias in real-time scoring.

Conclusions: The objective acoustic measures of affect and pragmatic style expression were comparable in reliability to “blind” consensus subjective scores and superior to real-time clinical judgments in terms of both accuracy and ability to distinguish between the two diagnostic groups.
