Objectives: To develop a suite of computational tools that enable automated, quantitative measurement of key social communicative behaviors and of engagement in young children during dyadic social interactions.
Methods: We are collecting rich sensor data (high-quality video and audio recordings, and on-body sensing of electrodermal activity and movement) from toddlers aged 15-30 months engaged in a semi-structured play interaction with an adult examiner. The interaction consists of a series of presses designed to elicit specific social communicative behaviors of interest. Data have been collected from 74 toddlers, 24 of whom were assessed a second time approximately two months after their initial visit. We use the sensor data to develop computer algorithms that automatically detect and quantify individual social communicative behaviors (e.g., attention to people and objects, smiling, gestures, vocalizations) and predict ratings of the child's engagement in the interaction. We compare the performance of the automated measurement tools against human coding of these behaviors from video.
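One way such a comparison against human coding can be carried out is at the frame level: automated and human-coded event intervals are converted to per-frame labels and a chance-corrected agreement statistic is computed. The sketch below is purely illustrative; the event times, frame rate, and choice of Cohen's kappa are assumptions, not the study's actual evaluation protocol.

```python
# Hypothetical sketch: frame-level agreement between automated detections and
# human coding of one behavior (e.g., smiling). Event times, the 30 fps rate,
# and the use of Cohen's kappa are illustrative assumptions.

def events_to_frames(events, n_frames, fps=30):
    """Convert (start_sec, end_sec) event intervals into a per-frame 0/1 label vector."""
    labels = [0] * n_frames
    for start, end in events:
        for f in range(int(start * fps), min(int(end * fps) + 1, n_frames)):
            labels[f] = 1
    return labels

def cohens_kappa(a, b):
    """Chance-corrected agreement between two binary label sequences."""
    n = len(a)
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    pa1, pb1 = sum(a) / n, sum(b) / n
    expected = pa1 * pb1 + (1 - pa1) * (1 - pb1)
    return (observed - expected) / (1 - expected)

# Illustrative comparison of automated vs. human-coded smile events (seconds).
auto_smiles = [(2.0, 3.5), (10.2, 11.0)]
human_smiles = [(2.1, 3.4), (10.0, 11.2), (20.5, 21.0)]
n_frames = 30 * 30  # 30 s of video at 30 fps (assumed)

kappa = cohens_kappa(events_to_frames(auto_smiles, n_frames),
                     events_to_frames(human_smiles, n_frames))
print(f"Frame-level Cohen's kappa: {kappa:.2f}")
```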
Results: Using commercially available software and hardware, as well as research prototypes, we have developed tools to automatically parse an interaction into its constituent parts and to detect whether the child has made eye contact, smiled, or vocalized within a given period of time. Using overhead cameras, we have developed algorithms to track the child's and the adult's heads and the objects involved in the interaction, and to determine when a child directs attention to objects or to the examiner (or shifts gaze between the two) during the interaction. Using a camera worn by the examiner, we have developed algorithms to detect when a child makes direct eye contact with the examiner. Finally, we developed algorithms that use the child's speech/vocalization data to predict the examiner's ratings of the child's engagement in the interaction.
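To make the overhead-camera attention step concrete, the sketch below shows one plausible way per-frame attention targets could be assigned once head positions, a head-orientation estimate, and object/examiner locations are available from tracking, with gaze shifts counted as changes of target across frames. The angular threshold, field names, and toy data are assumptions for illustration and do not describe the actual algorithms.

```python
# Hypothetical sketch of assigning per-frame attention targets from overhead
# tracking output. Threshold, interface, and example values are assumptions.
import math

ATTENTION_CONE_DEG = 30.0  # assumed tolerance around the child's facing direction

def attention_target(child_pos, child_heading, targets):
    """Return the label of the tracked target (object or examiner) closest to the
    child's facing direction, or 'elsewhere' if none falls inside the cone."""
    best_label, best_angle = "elsewhere", ATTENTION_CONE_DEG
    for label, (tx, ty) in targets.items():
        to_target = math.atan2(ty - child_pos[1], tx - child_pos[0])
        diff = math.degrees(abs((to_target - child_heading + math.pi) % (2 * math.pi) - math.pi))
        if diff < best_angle:
            best_label, best_angle = label, diff
    return best_label

def gaze_shifts(per_frame_targets):
    """Count frames where the attended target changes (e.g., toy -> examiner)."""
    return sum(1 for a, b in zip(per_frame_targets, per_frame_targets[1:]) if a != b)

# Toy example: child at the origin, toy ahead, examiner off to the side.
targets = {"toy": (1.0, 0.1), "examiner": (0.2, 1.5)}
frames = [attention_target((0.0, 0.0), heading, targets)
          for heading in (0.0, 0.1, 1.4, 1.5)]  # head headings in radians
print(frames, "shifts:", gaze_shifts(frames))
```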
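For the final step, predicting the examiner's engagement ratings from the child's vocalization data, one simple formulation is to summarize each session's vocalizations into a small feature vector and fit a regression to the ratings. The features, synthetic numbers, and use of ordinary least squares below are illustrative assumptions, not the study's model.

```python
# Hypothetical sketch: per-session vocalization features -> engagement rating.
import numpy as np

def session_features(vocalizations, session_len_s):
    """Summarize (start, end) vocalization events in seconds into
    [vocalization rate per minute, mean vocalization duration]."""
    durations = [end - start for start, end in vocalizations]
    rate = 60.0 * len(vocalizations) / session_len_s
    mean_dur = float(np.mean(durations)) if durations else 0.0
    return [rate, mean_dur]

# Synthetic training data: vocalization events per session and a 1-5 rating.
sessions = [
    ([(1.0, 1.6), (5.0, 5.4), (9.0, 10.1)], 120.0),
    ([(3.0, 3.2)], 120.0),
    ([(0.5, 1.5), (4.0, 5.2), (7.0, 8.3), (11.0, 12.0)], 120.0),
]
ratings = np.array([3.0, 1.0, 5.0])

X = np.array([session_features(v, t) for v, t in sessions])
X = np.hstack([X, np.ones((len(X), 1))])            # add intercept column
coef, *_ = np.linalg.lstsq(X, ratings, rcond=None)  # ordinary least squares fit

new_session = session_features([(2.0, 2.8), (6.0, 7.0)], 120.0)
predicted = np.array(new_session + [1.0]) @ coef
print(f"Predicted engagement rating: {predicted:.1f}")
```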
Conclusions: Our preliminary results suggest that it is possible to detect the building blocks of social engagement (visual attention, joint attention, affect, and vocalization) in an automated way using video and audio recordings. We believe these tools, if further developed, will enable researchers and clinicians to gather more objective, repeatable measures of the treatment progress of children enrolled in early interventions, and to do so more efficiently, with greater sampling density and over longer time scales, than is possible with current human-based observation and measurement.