poster

Humans and smart environments: a novel multimodal interaction approach

Published: 14 November 2011

Abstract

In this paper, we describe a multimodal approach to human-smart environment interaction. Input is based on three modalities: deictic gestures, symbolic gestures, and isolated words. The deictic gesture is interpreted with the PTAMM (Parallel Tracking and Multiple Mapping) method, using a camera held in the hand or worn on the user's arm. The PTAMM algorithm tracks the position and orientation of the hand in the environment in real time. This information is used to point at real or virtual objects, previously registered in the environment, along the camera's optical axis. Symbolic hand gestures and isolated voice commands are recognized and used to interact with the pointed target. Haptic and acoustic feedback is provided to the user to improve the quality of the interaction. A complete prototype has been realized, and a first usability evaluation conducted with 10 users has shown positive results.
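
To make the pointing mechanism concrete, the sketch below illustrates how a target could be selected along the camera's optical axis from a PTAMM-style pose estimate, and how a recognized gesture or voice command could then be dispatched to it. This is not code from the paper: the target coordinates, the angular threshold, and the function names are hypothetical, and the actual prototype obtains the camera pose from the PTAMM tracker rather than from hand-written values.

```python
import numpy as np

# Hypothetical targets registered in the environment (world coordinates, metres).
TARGETS = {
    "lamp":  np.array([0.1, 0.0, 2.0]),
    "tv":    np.array([3.5, 1.0, 0.9]),
    "radio": np.array([1.0, 2.0, 1.1]),
}

ANGLE_THRESHOLD_DEG = 10.0  # maximum angular distance from the optical axis (assumed value)


def pointed_target(camera_position, camera_rotation):
    """Return the registered object closest to the camera's optical axis, or None.

    camera_position: (3,) world position of the hand-mounted camera.
    camera_rotation: (3, 3) rotation matrix mapping camera axes to world axes,
                     as a PTAMM-style tracker would report them.
    """
    optical_axis = camera_rotation @ np.array([0.0, 0.0, 1.0])  # camera z-axis in world frame
    best_name, best_angle = None, float("inf")
    for name, position in TARGETS.items():
        direction = position - camera_position
        direction = direction / np.linalg.norm(direction)
        angle = np.degrees(np.arccos(np.clip(optical_axis @ direction, -1.0, 1.0)))
        if angle < best_angle:
            best_name, best_angle = name, angle
    return best_name if best_angle <= ANGLE_THRESHOLD_DEG else None


def dispatch(target, command):
    """Apply a recognized symbolic gesture or voice command to the pointed target."""
    if target is None:
        return "no target on the pointing axis"  # acoustic feedback could signal this case
    return f"{command} -> {target}"


if __name__ == "__main__":
    # Example pose: camera at the origin, looking down the world z-axis (towards the lamp).
    print(dispatch(pointed_target(np.zeros(3), np.eye(3)), "switch_on"))
    # -> "switch_on -> lamp"
```

In the described system the command would come from the symbolic-gesture or speech recognizer and the selection would be confirmed by haptic and acoustic feedback; the angular-threshold selection shown here is only one plausible way to realize pointing along the optical axis.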

Published In

ICMI '11: Proceedings of the 13th international conference on multimodal interfaces
November 2011
432 pages
ISBN: 9781450306416
DOI: 10.1145/2070481

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 14 November 2011

Author Tags

  1. ambient intelligence
  2. gesture recognition
  3. multimodal interaction
  4. pointing
  5. wearable and pervasive computing

Qualifiers

  • Poster

Conference

ICMI'11

Acceptance Rates

Overall Acceptance Rate: 453 of 1,080 submissions, 42%

Article Metrics

  • Downloads (last 12 months): 13
  • Downloads (last 6 weeks): 0
Reflects downloads up to 22 Oct 2024

Cited By

  • (2021) Method for Eye-Controlled Interaction for Digital Interface Function Icons. Advances in Ergonomics in Design, pp. 752-760. DOI: 10.1007/978-3-030-79760-7_90. Online publication date: 29-Jun-2021
  • (2016) Multimodal human attention detection for reading from facial expression, eye gaze, and mouse dynamics. ACM SIGAPP Applied Computing Review, 16(3), pp. 37-49. DOI: 10.1145/3015297.3015301. Online publication date: 4-Nov-2016
  • (2016) Towards addressee recognition in smart robotic environments. Proceedings of the 1st Workshop on Embodied Interaction with Smart Environments, pp. 1-6. DOI: 10.1145/3008028.3008030. Online publication date: 16-Nov-2016
  • (2016) The Integration Method of Multimodal Human-Computer Interaction Framework. 2016 8th International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC), pp. 545-550. DOI: 10.1109/IHMSC.2016.245. Online publication date: Aug-2016
  • (2015) Where, what, why and how - 3W1H. Proceedings of the 14th Brazilian Symposium on Human Factors in Computing Systems, pp. 1-10. DOI: 10.1145/3148456.3148472. Online publication date: 3-Nov-2015
  • (2015) The Research of Human-Computer Interaction Model Based on the Morhpable Model Based 3D Face Synthesis in the Speech Rehabilitation for Deaf Children. Proceedings of the 2015 IEEE Fifth International Conference on Big Data and Cloud Computing, pp. 191-194. DOI: 10.1109/BDCloud.2015.17. Online publication date: 26-Aug-2015
  • (2015) Gesture recognition corpora and tools. Computer Vision and Image Understanding, 131(C), pp. 72-87. DOI: 10.1016/j.cviu.2014.07.004. Online publication date: 1-Feb-2015
  • (2014) EventBreak. ACM SIGPLAN Notices, 49(10), pp. 33-47. DOI: 10.1145/2714064.2660233. Online publication date: 15-Oct-2014
  • (2014) Using web corpus statistics for program analysis. ACM SIGPLAN Notices, 49(10), pp. 49-65. DOI: 10.1145/2714064.2660226. Online publication date: 15-Oct-2014
  • (2014) Determinacy in static analysis for jQuery. ACM SIGPLAN Notices, 49(10), pp. 17-31. DOI: 10.1145/2714064.2660214. Online publication date: 15-Oct-2014
