FollowUpAR: enabling follow-up effects in mobile AR applications

Published: 24 June 2021

Abstract

Existing smartphone-based Augmented Reality (AR) systems can render virtual effects on static anchors. However, today's solutions cannot render follow-up effects attached to moving anchors because they fail to track the anchors' 6-degrees-of-freedom (6-DoF) poses. We find an opportunity to accomplish this task by leveraging smartphone sensors capable of generating sparse point clouds and fusing their output with vision-based technologies. However, realizing this vision is non-trivial due to challenges in modeling radar error distributions and fusing heterogeneous sensor data. This study proposes FollowUpAR, a framework that integrates vision and sparse measurements to track object 6-DoF pose on smartphones. We derive a physical-level theoretical model of the radar error distribution, based on an in-depth understanding of the radar's hardware-level working principles, and design a novel factor graph capable of fusing heterogeneous data. By doing so, FollowUpAR enables mobile devices to track an anchor's pose accurately. We implement FollowUpAR on commodity smartphones and validate its performance with 800,000 frames over a total duration of 15 hours. The results show that FollowUpAR achieves a rotation tracking accuracy of 2.3° and a translation accuracy of 2.9 mm, outperforming most existing tracking systems and performing comparably to state-of-the-art learning-based solutions. FollowUpAR can be integrated into ARCore and enables smartphones to render follow-up AR effects on moving objects.
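
The abstract reports tracking accuracy as a rotation error in degrees and a translation error in millimetres, but this page does not say how those errors are computed. The sketch below is therefore an assumption rather than the authors' evaluation code: it uses the commonly used geodesic angle between two rotation matrices and the Euclidean distance between two translation vectors. The function names (`rotation_error_deg`, `translation_error_mm`, `rot_z`) are hypothetical and chosen only for illustration.

```python
import math


def mat_mul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(m):
    """Transpose a 3x3 matrix given as nested lists."""
    return [[m[j][i] for j in range(3)] for i in range(3)]


def rot_z(angle_deg):
    """Rotation matrix for a rotation of angle_deg about the z axis."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rotation_error_deg(r_est, r_gt):
    """Geodesic angle (degrees) between estimated and ground-truth rotations.

    Assumed metric: angle of the relative rotation R_gt^T * R_est,
    recovered from its trace as arccos((trace - 1) / 2).
    """
    rel = mat_mul(transpose(r_gt), r_est)
    trace = rel[0][0] + rel[1][1] + rel[2][2]
    # Clamp to [-1, 1] so rounding noise cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))


def translation_error_mm(t_est, t_gt):
    """Euclidean distance between two translations, both given in mm."""
    return math.dist(t_est, t_gt)


if __name__ == "__main__":
    # Illustrative poses only; these are not data from the paper.
    print(rotation_error_deg(rot_z(12.3), rot_z(10.0)))
    print(translation_error_mm((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)))
```

Under this assumed metric, a per-frame accuracy figure would be an aggregate (for example, a mean or median) of these two errors over all evaluated frames.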

    Published In

    MobiSys '21: Proceedings of the 19th Annual International Conference on Mobile Systems, Applications, and Services
    June 2021
    528 pages
    ISBN: 9781450384438
    DOI: 10.1145/3458864

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. 6-DoF pose tracking
    2. augmented reality
    3. computer vision
    4. mmWave radar

    Qualifiers

    • Research-article

    Funding Sources

    • NSFC
    • National Key Research Plan

    Conference

    MobiSys '21

    Acceptance Rates

    MobiSys '21 Paper Acceptance Rate 36 of 166 submissions, 22%;
    Overall Acceptance Rate 274 of 1,679 submissions, 16%
