Objectives: The goals of this study are 1) to ascertain the reliability of real-time judgments of prosody expressing affect and pragmatic style; 2) to determine whether our complex automated measures of acoustic features can accurately identify different affects and styles; and 3) to explore the ability of these scores to distinguish typically developing (TD) subjects from subjects with autism spectrum disorder (ASD).
Methods: Subjects completed two tasks testing pragmatic and affective prosody:
(i) Pragmatic Style (use appropriate prosody when talking to an adult or a baby; adapted from Paul et al. 2005)
(ii) Affect (repeat a phrase with one of four affects)
Responses were scored in three ways: by clinicians during examination, by six naïve listeners in a web-based, “blind” perceptual experiment, and with automated objective measures of acoustic features.
During examination, clinicians immediately assessed the correctness of each response, yielding real-time scores.
In the perceptual experiment for Pragmatic Style, the six judges listened to recordings of minimal pairs of responses and selected the utterance they judged to be infant-directed. In the Affect experiment, judges listened to an utterance, selected the perceived affect from a list of four (angry, sad, scared, and happy), and rated their confidence in the selection.
In the automated analysis, quantitative features based on pitch, energy, and spectral balance were computed from recordings of the children's responses and combined using multiple linear regression to create a single complex score for each utterance.
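The combination step described above can be illustrated with a minimal sketch. It assumes three hypothetical per-utterance features (standing in for pitch, energy, and spectral balance) and a hypothetical regression target such as a listener-derived score; the abstract does not specify the target, the feature definitions, or the fitting procedure, so the function names and synthetic data here are illustrative only.

```python
import numpy as np

def fit_composite_weights(features, targets):
    """Fit ordinary least-squares weights (intercept first) for a composite score."""
    # Prepend a column of ones so the first weight acts as the intercept.
    X = np.column_stack([np.ones(len(features)), features])
    weights, *_ = np.linalg.lstsq(X, targets, rcond=None)
    return weights

def composite_scores(features, weights):
    """Apply fitted weights to produce one composite score per utterance."""
    X = np.column_stack([np.ones(len(features)), features])
    return X @ weights

# Synthetic, hypothetical data: 20 utterances x 3 acoustic features.
rng = np.random.default_rng(0)
features = rng.normal(size=(20, 3))
# Assumed noiseless target, used only to show that the fit recovers the weights.
targets = 1.0 + features @ np.array([2.0, -1.0, 0.5])

weights = fit_composite_weights(features, targets)
scores = composite_scores(features, weights)
```

With real recordings the targets would be noisy, and the weights should be evaluated on utterances held out from fitting to avoid overstating how well the composite score tracks listener judgments.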
Results: Per-utterance and per-speaker analyses of both tasks revealed that the objective digital measures generally correlated with the consensus scores about as well as the judges correlated with one another and with the consensus scores. The correlations of the objective measures with the judges' scores were also consistently higher than those of the real-time scores.
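As a rough illustration of a per-utterance versus per-speaker comparison, the sketch below computes a Pearson correlation at both levels. The abstract does not state which correlation statistic was used, so Pearson's r is an assumption, and the scores and speaker labels are made-up values, not study data.

```python
import numpy as np

def pearson_r(a, b):
    """Pearson correlation between two equal-length score sequences."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.corrcoef(a, b)[0, 1])

def per_speaker_means(scores, speaker_ids):
    """Average utterance-level scores within each speaker (speakers sorted by id)."""
    speakers = sorted(set(speaker_ids))
    return np.array([
        np.mean([s for s, spk in zip(scores, speaker_ids) if spk == sp])
        for sp in speakers
    ])

# Hypothetical utterance-level scores for three speakers, two utterances each.
speaker_ids = ["a", "a", "b", "b", "c", "c"]
consensus = [0.2, 0.4, 0.6, 0.8, 0.5, 0.9]
objective = [0.25, 0.35, 0.65, 0.75, 0.5, 0.95]

# Per-utterance agreement between the two score types.
r_utterance = pearson_r(consensus, objective)

# Per-speaker agreement, computed on speaker-level means.
r_speaker = pearson_r(
    per_speaker_means(consensus, speaker_ids),
    per_speaker_means(objective, speaker_ids),
)
```

The same two functions could be applied to any pair of score sources (judge versus judge, real-time versus consensus) to place the comparisons on a common footing.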
In the Pragmatic Style task, both the consensus scores and the objective scores showed TD subjects significantly outperforming ASD subjects. The corresponding difference in the real-time scores was not significant.
In the Affect task, consensus scores distinguished the TD group from the ASD group only for happiness and sadness. Both the objective and the real-time scores distinguished TD from ASD for happiness but not for sadness. Real-time scores also showed a between-group difference for anger that was not confirmed by the consensus scores, possibly reflecting bias in real-time scoring.
Conclusions: The objective acoustic measures of affect and pragmatic style expression were comparable in reliability to “blind” consensus subjective scores and superior to real-time clinical judgments in terms of both accuracy and ability to distinguish between the two diagnostic groups.