Article

Audio-visual emotion recognition in adult attachment interview

Published: 02 November 2006

Abstract

Automatic multimodal recognition of spontaneous affective expressions is a largely unexplored and challenging problem. In this paper, we explore audio-visual emotion recognition in a realistic human conversation setting, the Adult Attachment Interview (AAI). Based on the assumption that facial expression and vocal expression convey the same coarse affective state, positive and negative emotion sequences are labeled according to Facial Action Coding System Emotion Codes. Facial texture in the visual channel and prosody in the audio channel are integrated in the framework of an Adaboost multi-stream hidden Markov model (AMHMM), in which an Adaboost learning scheme is used to build the component HMM fusion. Our approach is evaluated in preliminary AAI spontaneous emotion recognition experiments.
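To make the fusion scheme concrete, the sketch below shows one way the component HMMs and the Adaboost combination could fit together: one HMM per emotion class is trained for each stream (facial texture, prosody), each stream votes via a log-likelihood ratio, and Adaboost learns how heavily to weight each stream's vote. This is a minimal illustration assuming the hmmlearn Python library; the function names, feature shapes, and two-class setup are illustrative assumptions, not the paper's exact method.

```python
# A minimal sketch of Adaboost-weighted fusion of per-stream HMM classifiers,
# assuming the hmmlearn library. This illustrates the general idea, not the
# authors' implementation; feature extraction (facial texture, prosody) and
# all function names here are hypothetical.
import numpy as np
from hmmlearn.hmm import GaussianHMM

def train_stream_hmms(pos_seqs, neg_seqs, n_states=3):
    """Train one HMM per class (positive/negative emotion) for one stream.
    Each element of pos_seqs/neg_seqs is an (n_frames, n_features) array."""
    def fit(seqs):
        X = np.vstack(seqs)
        lengths = [len(s) for s in seqs]
        hmm = GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=20)
        hmm.fit(X, lengths)
        return hmm
    return fit(pos_seqs), fit(neg_seqs)

def stream_predict(hmm_pos, hmm_neg, seq):
    """Classify one sequence by log-likelihood ratio: +1 positive, -1 negative."""
    return 1 if hmm_pos.score(seq) > hmm_neg.score(seq) else -1

def adaboost_stream_weights(stream_preds, labels, n_rounds=10):
    """Learn Adaboost weights over the component (per-stream) HMM classifiers.
    stream_preds: (n_streams, n_samples) array of {-1, +1} predictions on a
    validation set; labels: (n_samples,) array of {-1, +1} ground truth."""
    n = len(labels)
    w = np.full(n, 1.0 / n)                 # sample weights
    alphas = np.zeros(len(stream_preds))    # accumulated per-stream weights
    for _ in range(n_rounds):
        # choose the stream classifier with the lowest weighted error
        errs = np.array([(w * (p != labels)).sum() for p in stream_preds])
        k = int(np.argmin(errs))
        eps = np.clip(errs[k], 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - eps) / eps)
        alphas[k] += alpha
        # re-weight samples: up-weight those the chosen stream got wrong
        w *= np.exp(-alpha * labels * stream_preds[k])
        w /= w.sum()
    return alphas

def fuse(alphas, preds):
    """Weighted vote of the audio and visual stream decisions for one sample."""
    return 1 if alphas @ preds >= 0 else -1
```

Under these assumptions, the stream weights would be learned on held-out labeled sequences and then applied at test time by calling fuse() on each new sample's per-stream decisions.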



Published In

ICMI '06: Proceedings of the 8th international conference on Multimodal interfaces
November 2006
404 pages
ISBN: 159593541X
DOI: 10.1145/1180995


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. affect recognition
  2. affective computing
  3. emotion recognition
  4. multimodal human-computer interaction

Qualifiers

  • Article

Conference

ICMI06

Acceptance Rates

Overall Acceptance Rate 453 of 1,080 submissions, 42%


Cited By

  • (2024) Attach-SwiNet: Multimodal Attachment Style Classification Model Based on Non-Verbal Signals. IEEE Access, 12, 79151-79165. DOI: 10.1109/ACCESS.2024.3397608
  • (2021) Sentiment analysis of pets using deep learning technologies in artificial intelligence of things system. Soft Computing, online 5-Aug-2021. DOI: 10.1007/s00500-021-06038-z
  • (2018) Literature Survey and Datasets. Multimodal Sentiment Analysis, 37-78. DOI: 10.1007/978-3-319-95020-4_3
  • (2015) A Review and Meta-Analysis of Multimodal Affect Detection Systems. ACM Computing Surveys, 47(3), 1-36. DOI: 10.1145/2682899
  • (2014) Emotion Detection via Discriminant Laplacian Embedding. Universal Access in the Information Society, 13(1), 23-31. DOI: 10.1007/s10209-013-0312-5
  • (2012) Consistent but modest. Proceedings of the 14th ACM international conference on Multimodal interaction, 31-38. DOI: 10.1145/2388676.2388686
  • (2012) Human emotion and cognition recognition from body language of the head using soft computing techniques. Journal of Ambient Intelligence and Humanized Computing, 4(1), 121-140. DOI: 10.1007/s12652-012-0107-1
  • (2011) Audio visual emotion recognition based on triple-stream dynamic bayesian network models. Proceedings of the 4th international conference on Affective computing and intelligent interaction - Volume Part I, 609-618. DOI: 10.5555/2062780.2062847
  • (2011) Fusion of audio- and visual cues for real-life emotional human robot interaction. Proceedings of the 33rd international conference on Pattern recognition, 346-355. DOI: 10.5555/2039976.2040018
  • (2011) Toward region- and action-aware second life clients. Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, 1-6. DOI: 10.1109/ICME.2011.6012038
