International Meeting for Autism Research: Automatic Classification of Parent-Infant Social Games From Videos

Friday, May 21, 2010
Franklin Hall B Level 4 (Philadelphia Marriott Downtown)
11:00 AM
P. Wang, School of Interactive Computing, Georgia Institute of Technology, Atlanta, GA
T. L. Westeyn, GVU, School of Interactive Computing, College of Computing, Georgia Institute of Technology, Atlanta, GA
G. D. Abowd, School of Interactive Computing, Georgia Institute of Technology, Atlanta, GA
J. Rehg, School of Interactive Computing, College of Computing, Georgia Tech, Atlanta, GA
Background: Parent-infant social games, such as peek-a-boo and patty-cake, play an important role in the early detection of autism spectrum disorders. When studying an infant’s social ability, psychologists assess the infant’s social behaviors through in situ observation or through behavior analysis of home movies. Current approaches to video-based behavior assessment rely on manually searching for relevant behaviors and manually scoring them, for example by the frequency of playing peek-a-boo or the diversity of social interactive gestures a child can perform [Colgan et al. 2006]. This procedure is time-consuming and labor-intensive. We aim to develop computer vision techniques that automate video filtering and behavior coding.

Objectives: Develop algorithms that automatically classify social games in unstructured videos. Such algorithms have the potential to automatically count the types of games a child can play and how often each is played, and to summarize a child’s behavioral trajectory over the course of an intervention program.


Methods: Social games are characterized as repetitions of dyadic interactions, with a range of permissible variations. In previous work on automatic retrieval of social games from unstructured videos, social games were modeled as quasi-periodic events, so that detecting a quasi-periodic pattern indicates the presence of a social game. In this work, we use the extracted patterns as training examples and apply a Support Vector Machine (SVM) to recognize different types of games. The ith occurrence of a pattern is represented by the histogram of the visual words that belong to it. This method of collecting examples has two advantages. First, the various sequential stages of a game are collected automatically, without any human annotation. Second, one video of a social game yields many training examples with class labels, which enables fast collection of a large corpus of training and testing data, an essential element of supervised learning methods.
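The representation and classifier described above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the abstract: the three-word vocabulary, the toy word-ID sequences, the L1 normalisation of the histograms, the linear kernel, the one-vs-rest scheme, the value of C, and the choice of dual coordinate descent as the training procedure. It is not the authors' implementation.

```python
from collections import Counter

def word_histogram(word_ids, vocab_size):
    """Histogram of visual-word IDs for one pattern occurrence.
    L1 normalisation is an assumption, not stated in the abstract."""
    counts = Counter(word_ids)
    total = max(len(word_ids), 1)  # avoid division by zero on empty input
    return [counts[w] / total for w in range(vocab_size)]

def train_binary_svm(X, y, C=100.0, epochs=200):
    """Linear hinge-loss SVM trained by dual coordinate descent.
    y holds +1/-1; the bias is folded in as a constant extra feature."""
    Xa = [x + [1.0] for x in X]
    w = [0.0] * len(Xa[0])
    alpha = [0.0] * len(Xa)
    for _ in range(epochs):
        for i, (xi, yi) in enumerate(zip(Xa, y)):
            q = sum(v * v for v in xi)
            grad = yi * sum(wj * xj for wj, xj in zip(w, xi)) - 1.0
            # Newton step on one dual variable, clipped to [0, C]
            new_alpha = min(max(alpha[i] - grad / q, 0.0), C)
            step = (new_alpha - alpha[i]) * yi
            w = [wj + step * xj for wj, xj in zip(w, xi)]
            alpha[i] = new_alpha
    return w

def train_one_vs_rest(X, labels, **kw):
    """One binary SVM per game type (hypothetical multiclass scheme)."""
    return {c: train_binary_svm(X, [1 if l == c else -1 for l in labels], **kw)
            for c in sorted(set(labels))}

def predict(models, x):
    """Return the game whose binary SVM gives the highest score."""
    xa = x + [1.0]
    return max(models, key=lambda c: sum(wj * xj for wj, xj in zip(models[c], xa)))

# Toy data: each occurrence is a list of visual-word IDs (made up for illustration).
occurrences = [
    ([0, 0, 0, 0, 0, 0, 0, 0, 1, 2], "patty-cake"),
    ([0, 0, 0, 1, 2], "patty-cake"),
    ([1, 1, 1, 1, 1, 1, 1, 1, 0, 2], "toss-the-ball"),
    ([1, 1, 1, 0, 2], "toss-the-ball"),
    ([2, 2, 2, 2, 2, 2, 2, 2, 0, 1], "roll-the-ball"),
    ([2, 2, 2, 0, 1], "roll-the-ball"),
]
X = [word_histogram(ids, 3) for ids, _ in occurrences]
labels = [name for _, name in occurrences]
models = train_one_vs_rest(X, labels)
print(predict(models, word_histogram([1, 1, 1, 1, 1, 0, 2], 3)))
```

Because every occurrence of a detected pattern inherits the label of its source video, the `occurrences` list in such a setup would grow without manual annotation, which is the advantage the paragraph above points to.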


Results: We have collected two video datasets: 1) about 40 minutes of 5 adult dyads playing toss-the-ball, roll-the-ball and patty-cake in different environments; 2) 85 minutes of 3 parent-child dyads playing freely in a laboratory setting, where games other than these three are also played. Two thirds of dataset 1 is used for training, and the remaining third is used for testing. Our classifier achieves recognition rates of 94.44% on 18 patty-cake sequences, 81.25% on 16 toss-the-ball sequences, and 92.31% on 13 roll-the-ball sequences. We then apply the learned SVM classifier to dataset 2, where it achieves an average recognition rate of 61.41% on the parent-child play examples. Our future work includes expanding our game collection and refining the pattern representation to increase its discriminative power against other games.
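As a quick check on the arithmetic, the per-game rates for dataset 1 can be converted back into counts of correctly recognized sequences. The counts and the pooled rate computed below are derived from the rounded percentages and are not stated in the abstract; the function name `implied_correct` is hypothetical.

```python
# Reported figures for dataset 1: game -> (number of sequences, recognition rate in %)
reported = {
    "patty-cake": (18, 94.44),
    "toss-the-ball": (16, 81.25),
    "roll-the-ball": (13, 92.31),
}

def implied_correct(n_sequences, rate_percent):
    """Nearest whole number of correct sequences consistent with a rounded rate."""
    return round(n_sequences * rate_percent / 100)

correct = {game: implied_correct(n, rate) for game, (n, rate) in reported.items()}

# Each inferred count should reproduce the reported percentage to two decimals.
for game, (n, rate) in reported.items():
    assert round(100 * correct[game] / n, 2) == rate

total = sum(n for n, _ in reported.values())
pooled = 100 * sum(correct.values()) / total  # derived figure, not from the abstract
print(correct, total, round(pooled, 2))
```

The pooled rate weights each game by its number of sequences, so it need not equal the plain average of the three percentages.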


Conclusions: We have presented a method for describing the quasi-periodic patterns that are automatically extracted from social games, and for building a game classifier from these representations. Social games, as the earliest form of social interaction in infancy, constitute a rich source of behavioral data that is not only useful to psychologists but also amenable to computer vision analysis. Our work continues to demonstrate the potential impact that computer vision techniques may have on behavioral science.