01688nas a2200385 4500008004100000020002200041245004000063210004000103260002800143520060700171653001400778653002400792653001900816653002100835653002800856653001300884653001100897653003900908653002100947653001900968653001500987653004401002653002601046653001701072653001101089653002501100653002001125653001601145100001101161700001401172700001501186700001701201700001701218856006701235 2009 eng d a978-1-4244-4117-400aLearning to Make Facial Expressions0 aLearning to Make Facial Expressions aShanghaibIEEEc06/20093 a
This paper explores the process of self-guided learning of realistic facial expression production by a robotic head with 31 degrees of freedom. Facial motor parameters were learned using feedback from real-time facial expression recognition applied to video. The experiments show that the mapping of servos to expressions was learned in under one hour of training time. We discuss how our work may help illuminate the computational study of how infants learn to make facial expressions.
10aActuators10aEmotion recognition10aface detection10aface recognition10afacial motor parameters10aFeedback10aHumans10alearning (artificial intelligence)10aMachine Learning10aMagnetic heads10aPediatrics10areal-time facial expression recognition10aRobot sensing systems10arobotic head10aRobots10aself-guided learning10aServomechanisms10aServomotors1 aWu, T.1 aButko, N.1 aRuvolo, P.1 aBartlett, M.1 aMovellan, J. uhttps://rubi.ucsd.edu/content/learning-make-facial-expressions01620nas a2200349 4500008004100000020002200041245006600063210006600129260003200195520052200227653001900749653001900768653002800787653003500815653001300850653003900863653001400902653002300916653002400939653001700963653001100980653003900991653002101030653000901051653002501060653001501085653001501100653003001115100001501145700001701160856009301177 2008 eng d a978-1-4244-2661-400aAutomatic cry detection in early childhood education settings0 aAutomatic cry detection in early childhood education settings aMonterey, CAbIEEEc08/20083 aWe present results on applying a novel machine learning approach for learning auditory moods in natural environments [1] to the problem of detecting crying episodes in preschool classrooms. The resulting system achieved levels of performance approaching that of human coders and also significantly outperformed previous approaches to this problem [2].
10aAcoustic noise10aauditory moods10aautomatic cry detection10abehavioural sciences computing10aDeafness10aearly childhood education settings10aeducation10aEducational robots10aEmotion recognition10ahuman coders10aHumans10alearning (artificial intelligence)10aMachine Learning10aMood10apreschool classrooms10aPrototypes10aRobustness10aWorking environment noise1 aRuvolo, P.1 aMovellan, J. uhttps://rubi.ucsd.edu/content/automatic-cry-detection-early-childhood-education-settings02225nas a2200373 4500008004100000020002200041245006700063210006300130260002900193520104300222653001901265653003001284653003801314653002801352653001901380653002101399653002201420653003801442653001101480653001901491653002001510653001701530653001301547653001901560653003001579653002001609653001501629653003601644653001901680653002001699100001801719700002501737856008901762 2008 eng d a978-1-4244-2153-400aA discriminative approach to frame-by-frame head pose tracking0 adiscriminative approach to framebyframe head pose tracking aAmsterdambIEEEc09/20083 aWe present a discriminative approach to frame-by-frame head pose tracking that is robust to a wide range of illuminations and facial appearances and that is inherently immune to accuracy drift. Most previous research on head pose tracking has been validated on test datasets spanning only a small number (< 20) of subjects under controlled illumination conditions on continuous video sequences. In contrast, the system presented in this paper was both trained and tested on a much larger database, GENKI, spanning tens of thousands of different subjects, illuminations, and geographical locations from images on the Web. Our pose estimator achieves root-mean-square (RMS) errors of 5.82°, 5.65°, and 2.96° for yaw, pitch, and roll, respectively. A set of 4000 images from this dataset, labeled for pose, was collected and released for use by the research community.
10aaccuracy drift10acontinuous video sequence10acontrolled illumination condition10adiscriminative approach10aface detection10aface recognition10afacial appearance10aframe-by-frame head pose tracking10aHumans10aImage analysis10aImage databases10aLaboratories10aLighting10aMagnetic heads10amean square error methods10apose estimation10aRobustness10aroot-mean-square error tracking10aSystem testing10aVideo sequences1 aWhitehill, J.1 aMovellan, Javier R. uhttps://rubi.ucsd.edu/content/discriminative-approach-frame-frame-head-pose-tracking02735nas a2200409 4500008004100000020002200041245004400063210004400107260003200151520155400183653002501737653002501762653001801787653002101805653001901826653001901845653001201864653002801876653002901904653002701933653001501960653002301975653002701998653001102025653002202036653001802058653001702076653002502093653002402118653002502142653002602167100001402193700001402207700001702221700001702238856007002255 2008 eng d a978-1-4244-1646-200aVisual saliency model for robot cameras0 aVisual saliency model for robot cameras aPasadena, CAbIEEEc05/20083 aRecent years have seen an explosion of research on the computational modeling of human visual attention in task-free conditions, i.e., given an image, predict where humans are likely to look. This area of research could potentially provide general-purpose mechanisms for robots to orient their cameras. One difficulty is that most current models of visual saliency are computationally very expensive and not suited to the real-time implementations needed for robotic applications. Here we propose a fast approximation to a Bayesian model of visual saliency recently proposed in the literature. The approximation can run in real time on current computers at very little computational cost, leaving plenty of CPU cycles for other tasks. We empirically evaluate the saliency model in the domain of controlling saccades of a camera in social robotics situations. The goal was to orient a camera as quickly as possible toward human faces. We found that this simple general-purpose saliency model doubled the success rate of the camera: it captured images of people 70% of the time, compared to a 35% success rate when the camera was controlled using an open-loop scheme. After 3 saccades (camera movements), the robot was 96% likely to capture at least one person. The results suggest that visual saliency models may provide a useful front end for camera control in robotics applications.
10aApplication software10aapproximation theory10aBayes methods10aBayesian methods10aBayesian model10acamera control10aCameras10aCentral Processing Unit10aComputational efficiency10aComputational modeling10aExplosions10afast approximation10ahuman visual attention10aHumans10aOpen loop systems10arobot cameras10arobot vision10aRobot vision systems10arobotic application10atask free conditions10avisual saliency model1 aButko, N.1 aZhang, L.1 aCottrell, G.1 aMovellan, J. uhttps://rubi.ucsd.edu/content/visual-saliency-model-robot-cameras