“Rewind to the Jiggling Meat Part”: Understanding Voice Control of Instructional Videos in Everyday Tasks

Published: 28 April 2022

Abstract

Voice interaction has long been envisioned as a way to make physical interaction hands-free, for example by allowing fine-grained control of instructional videos without physically disengaging from the task at hand. While significant engineering advances have brought us closer to this ideal, we do not yet fully understand which user requirements voice interaction should support in such contexts. This paper presents an ecologically valid wizard-of-oz elicitation study exploring realistic user requirements for ideal instructional video playback control while cooking. Through an analysis of the commands issued and actions performed during this non-linear and complex task, we identify (1) patterns of command formulation, (2) challenges for design, and (3) how the task and voice-based commands are interwoven in real life. We discuss implications for the design and research of voice interaction for navigating instructional videos while performing complex tasks.
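For readers unfamiliar with content-based voice navigation of video, the sketch below illustrates the general idea behind commands like the one in the title. It is not the system studied in the paper: it assumes a hypothetical time-stamped transcript and uses naive keyword overlap purely for illustration; a real system would combine speech recognition, richer language understanding, and visual content analysis.

```python
# Illustrative sketch only (not the paper's system): resolving a content-based
# voice command such as "rewind to the jiggling meat part" against a
# hypothetical time-stamped transcript of an instructional video.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Segment:
    start: float  # segment start time, in seconds
    text: str     # transcript text spoken during the segment

def resolve_command(command: str, segments: List[Segment]) -> Optional[float]:
    """Return the start time of the segment whose transcript shares the most
    words with the spoken command (naive keyword overlap), or None."""
    words = set(command.lower().split())
    best_time, best_score = None, 0
    for seg in segments:
        score = len(words & set(seg.text.lower().split()))
        if score > best_score:
            best_time, best_score = seg.start, score
    return best_time

# Hypothetical transcript and command:
transcript = [
    Segment(12.0, "season the pork shoulder generously on both sides"),
    Segment(95.0, "the meat should jiggle slightly when you shake the pan"),
]
print(resolve_command("rewind to the jiggling meat part", transcript))  # -> 95.0
```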

Supplementary Material

MP4 File (3491102.3502036-video-preview.mp4)
Video Preview

    Published In

    CHI '22: Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems
    April 2022
    10459 pages
    ISBN:9781450391573
    DOI:10.1145/3491102

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 28 April 2022

    Author Tags

    1. Conversational Interaction
    2. Non-Linear Instructional Video
    3. Voice-Based Navigation
    4. Wizard-of-Oz

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    CHI '22
    CHI '22: CHI Conference on Human Factors in Computing Systems
    April 29 - May 5, 2022
    New Orleans, LA, USA

    Acceptance Rates

    Overall Acceptance Rate 6,199 of 26,314 submissions, 24%

    Article Metrics

    • Downloads (last 12 months): 125
    • Downloads (last 6 weeks): 6
    Reflects downloads up to 24 Oct 2024

    Cited By
    • (2024) Improving Video Navigation for Spatial Task Tutorials by Spatially Segmenting and Situating How-To Videos. Proceedings of the 2024 ACM Symposium on Spatial User Interaction, 1-13. https://doi.org/10.1145/3677386.3682103. Online publication date: 7-Oct-2024.
    • (2024) SkillsInterpreter: A Case Study of Automatic Annotation of Flowcharts to Support Browsing Instructional Videos in Modern Martial Arts using Large Language Models. Proceedings of the Augmented Humans International Conference 2024, 217-225. https://doi.org/10.1145/3652920.3652942. Online publication date: 4-Apr-2024.
    • (2024) AQuA: Automated Question-Answering in Software Tutorial Videos with Visual Anchors. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-19. https://doi.org/10.1145/3613904.3642752. Online publication date: 11-May-2024.
    • (2024) Cooking With Agents: Designing Context-aware Voice Interaction. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-13. https://doi.org/10.1145/3613904.3642183. Online publication date: 11-May-2024.
    • (2023) Human-Centered Deferred Inference: Measuring User Interactions and Setting Deferral Criteria for Human-AI Teams. Proceedings of the 28th International Conference on Intelligent User Interfaces, 681-694. https://doi.org/10.1145/3581641.3584092. Online publication date: 27-Mar-2023.
    • (2023) Exploring Audio Icons for Content-Based Navigation in Voice User Interfaces. Proceedings of the 5th International Conference on Conversational User Interfaces, 1-9. https://doi.org/10.1145/3571884.3604302. Online publication date: 19-Jul-2023.
    • (2023) Rewriting the Script: Adapting Text Instructions for Voice Interaction. Proceedings of the 2023 ACM Designing Interactive Systems Conference, 2233-2248. https://doi.org/10.1145/3563657.3596059. Online publication date: 10-Jul-2023.
