Abstract
We propose a model for generating head nods from utterance text considering personality traits. We have been investigating the automatic generation of body motion, such as nodding, from utterance text in dialog agent systems. Human body motion varies greatly depending on personality, so it is important to generate body motion appropriate to the personality of the dialog agent. To construct our model, we first compiled a Japanese corpus of 24 dialogues including utterance text, nod information, and the personality traits (Big Five) of the participants. Our nod-generation model estimates the presence, frequency, and depth of nods during each phrase by using various types of linguistic information extracted from the utterance text together with personality traits. We evaluated how well the model can generate and estimate nods based on individual personality traits. The results indicate that our model using both linguistic information and personality traits outperformed a model using only linguistic information.
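To make the modeling idea concrete, the following is a minimal sketch of how per-phrase linguistic features extracted from utterance text could be combined with a speaker's Big Five scores to predict nod presence, frequency, and depth, as described in the abstract. The feature names, label sets, and choice of a decision-tree classifier are illustrative assumptions, not the authors' implementation.

```python
# Sketch only: predicting per-phrase nod presence, frequency, and depth from
# linguistic features plus Big Five personality scores. All feature names,
# label sets, and the classifier choice are assumptions for illustration.
from dataclasses import dataclass
from typing import List, Optional, Tuple
from sklearn.tree import DecisionTreeClassifier

@dataclass
class Phrase:
    # Hypothetical per-phrase linguistic features (e.g. from a morphological analyzer).
    n_morphemes: int
    is_clause_final: bool
    pos_of_head: int      # part-of-speech id of the head word
    dialogue_act: int     # dialogue-act id of the utterance

@dataclass
class BigFive:
    openness: float
    conscientiousness: float
    extraversion: float
    agreeableness: float
    neuroticism: float

def features(phrase: Phrase, traits: BigFive) -> List[float]:
    """Concatenate linguistic features with personality traits into one vector."""
    return [
        phrase.n_morphemes,
        float(phrase.is_clause_final),
        phrase.pos_of_head,
        phrase.dialogue_act,
        traits.openness,
        traits.conscientiousness,
        traits.extraversion,
        traits.agreeableness,
        traits.neuroticism,
    ]

# Separate classifiers for each nod attribute.
presence_clf = DecisionTreeClassifier()    # labels assumed 0 (no nod) / 1 (nod)
frequency_clf = DecisionTreeClassifier()   # e.g. classes: 1 nod, 2 nods, 3+ nods
depth_clf = DecisionTreeClassifier()       # e.g. classes: shallow, deep

def train(X, y_presence, y_frequency, y_depth) -> None:
    """Fit all three classifiers on vectors built by features()."""
    presence_clf.fit(X, y_presence)
    frequency_clf.fit(X, y_frequency)
    depth_clf.fit(X, y_depth)

def generate_nod(phrase: Phrase, traits: BigFive) -> Optional[Tuple[int, int]]:
    """Return (frequency class, depth class) if a nod is predicted, else None."""
    x = [features(phrase, traits)]
    if presence_clf.predict(x)[0]:
        return frequency_clf.predict(x)[0], depth_clf.predict(x)[0]
    return None  # no nod on this phrase
```

In this sketch, dropping the five trait fields from the feature vector yields the language-only baseline against which the personality-aware model is compared.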
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Ishii, R., Katayama, T., Higashinaka, R., Tomita, J. (2021). Automatic Head-Nod Generation Using Utterance Text Considering Personality Traits. In: Marchi, E., Siniscalchi, S.M., Cumani, S., Salerno, V.M., Li, H. (eds) Increasing Naturalness and Flexibility in Spoken Dialogue Interaction. Lecture Notes in Electrical Engineering, vol 714. Springer, Singapore. https://doi.org/10.1007/978-981-15-9323-9_26
DOI: https://doi.org/10.1007/978-981-15-9323-9_26
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-9322-2
Online ISBN: 978-981-15-9323-9
eBook Packages: Engineering, Engineering (R0)