
KinVoices: Using Voices of Friends and Family in Voice Interfaces

Published: 18 October 2021


With voice user interfaces (VUIs) becoming ubiquitous and speech synthesis technology maturing, it is now possible to synthesise voices that resemble our friends and relatives (whom we collectively call 'kin') and use them in VUIs. However, how to design such interfaces and how the familiarity of kin voices affects user perceptions remain under-explored. Our surveys and interviews with 25 users revealed that VUIs using kin voices were perceived as more engaging, more persuasive, and safer, yet eerier, than VUIs using common virtual assistant voices. We then developed a technology probe, KinVoice, an Alexa-based VUI that was deployed in three households over two weeks. Users set reminders using KinVoice, which in turn delivered the reminders in synthesised kin voices. The deployment aimed to explore users' needs, uncover the challenges involved, and inspire new applications. We discuss design guidelines for integrating familiar kin voices into VUIs, applications that benefit from their use, and implications for balancing voice realism and usability with security and diversification.

Supplementary Material

ZIP File (
Supplementary Materials for "KinVoices: Using Voices of Friends and Family in Voice Interfaces"
MP4 File (v5cscw446vf.mp4)
Supplemental video








Published In

Proceedings of the ACM on Human-Computer Interaction, Volume 5, Issue CSCW2
October 2021
5376 pages
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].


Association for Computing Machinery

New York, NY, United States




Author Tags

  1. Amazon Alexa
  2. Amazon Echo
  3. conversational agent
  4. google assistant
  5. intelligent personal assistant
  6. smart speaker
  7. speech interface
  8. virtual assistant
  9. voice cloning
  10. voice design
  11. voice interface
  12. voice notification
  13. voice reminder
  14. voice synthesis
  15. voice user interface


  • Research-article

Funding Sources

  • TEC, New Zealand




Article Metrics

  • Downloads (last 12 months): 154
  • Downloads (last 6 weeks): 27
Reflects downloads up to 22 Oct 2024.



Cited By

  • (2024) In Whose Voice?: Examining AI Agent Representation of People in Social Interaction through Generative Speech. Proceedings of the 2024 ACM Designing Interactive Systems Conference, 224-245. DOI: 10.1145/3643834.3661555. Online publication date: 1-Jul-2024
  • (2024) Opportunities in Mental Health Support for Informal Dementia Caregivers Suffering from Verbal Agitation. Proceedings of the ACM on Human-Computer Interaction 8:CSCW1, 1-26. DOI: 10.1145/3637381. Online publication date: 26-Apr-2024
  • (2024) Memoro: Using Large Language Models to Realize a Concise Interface for Real-Time Memory Augmentation. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-18. DOI: 10.1145/3613904.3642450. Online publication date: 11-May-2024
  • (2024) My Voice as a Daily Reminder: Self-Voice Alarm for Daily Goal Achievement. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-16. DOI: 10.1145/3613904.3641932. Online publication date: 11-May-2024
  • (2023) The Bot on Speaking Terms: The Effects of Conversation Architecture on Perceptions of Conversational Agents. Proceedings of the 5th International Conference on Conversational User Interfaces, 1-16. DOI: 10.1145/3571884.3597139. Online publication date: 19-Jul-2023
  • (2023) Comparison of Two Methods for Altering the Appearance of Interviewers: Analysis of Multiple Biosignals. Engineering Psychology and Cognitive Ergonomics, 53-64. DOI: 10.1007/978-3-031-35392-5_4. Online publication date: 23-Jul-2023
  • (2022) An extensive overview on Human-Computer Interaction (HCI) application. i-manager’s Journal on Software Engineering 17:1, 24. DOI: 10.26634/jse.17.1.19075. Online publication date: 2022
