short-paper

MLModelCI: An Automatic Cloud Platform for Efficient MLaaS

Published: 12 October 2020

Abstract

MLModelCI provides multimedia researchers and developers with a one-stop platform for efficient machine learning (ML) services. The system leverages DevOps techniques to optimize, test, and manage models, and it containerizes and deploys the optimized and validated models as cloud services (MLaaS). In essence, MLModelCI serves as a housekeeper that helps users publish models. Models are first automatically converted to optimized formats for production and then profiled under different settings (e.g., batch size and hardware). The profiling information can guide the trade-off between performance and cost of MLaaS. Finally, the system dockerizes the models for ease of deployment to cloud environments. A key feature of MLModelCI is its controller, which enables elastic evaluation that uses only idle workers while maintaining online service quality. Our system bridges the gap between current ML training and serving systems and thus frees developers from the manual and tedious work often associated with service deployment. We release the platform as an open-source project on GitHub under the Apache 2.0 license, with the aim that it will facilitate and streamline more large-scale ML applications and research projects.
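
To make the profiling step concrete, the following is a minimal Python sketch of timing a model at several batch sizes and then choosing a setting that fits a latency budget. It is an illustration only and does not reproduce MLModelCI's actual API: the names `ProfileResult`, `profile`, and `pick_batch_size`, the dummy list inputs, and the latency-budget selection rule are all assumptions introduced here.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class ProfileResult:
    batch_size: int
    latency_s: float   # mean wall-clock seconds per batch
    throughput: float  # items processed per second


def profile(model: Callable[[list], list],
            batch_sizes: Sequence[int],
            repeats: int = 5) -> List[ProfileResult]:
    """Time a callable model at each batch size (hypothetical helper)."""
    results = []
    for bs in batch_sizes:
        batch = [0.0] * bs   # dummy input; a real profiler would feed tensors
        model(batch)         # warm-up call, excluded from the timing
        start = time.perf_counter()
        for _ in range(repeats):
            model(batch)
        latency = (time.perf_counter() - start) / repeats
        throughput = bs / latency if latency > 0 else float("inf")
        results.append(ProfileResult(bs, latency, throughput))
    return results


def pick_batch_size(results: Sequence[ProfileResult],
                    latency_budget_s: float) -> Optional[ProfileResult]:
    """Return the highest-throughput setting whose latency fits the budget."""
    feasible = [r for r in results if r.latency_s <= latency_budget_s]
    return max(feasible, key=lambda r: r.throughput, default=None)
```

Under these assumptions, a caller would run `profile(model, [1, 8, 32])` once per hardware target and pass the results to `pick_batch_size` together with the service's latency budget. Hardware variation and cost modelling, which the abstract also mentions, are left out of this sketch.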

Supplementary Material

MP4 File (3394171.3414535.mp4)
The video introduces the paper titled "MLModelCI: An Automatic Cloud Platform for Efficient MLaaS".

    Published In

    MM '20: Proceedings of the 28th ACM International Conference on Multimedia
    October 2020
    4889 pages
    ISBN:9781450379885
    DOI:10.1145/3394171
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. conversion
    2. inference serving
    3. model deployment
    4. profiling

    Qualifiers

    • Short-paper

    Funding Sources

    • Energy Market Authority of Singapore
    • Nanyang Technological University

    Conference

    MM '20
    Acceptance Rates

    Overall Acceptance Rate 995 of 4,171 submissions, 24%

    Bibliometrics

    Article Metrics

    • Downloads (last 12 months): 51
    • Downloads (last 6 weeks): 5
    Reflects downloads up to 19 Oct 2024.

    Cited By

    • (2023) Data-Centric and Model-Centric AI: Twin Drivers of Compact and Robust Industry 4.0 Solutions. Applied Sciences 13:5 (2753). DOI: 10.3390/app13052753. Online publication date: 21-Feb-2023
    • (2023) Machine Unlearning: A Survey. ACM Computing Surveys 56:1 (1-36). DOI: 10.1145/3603620. Online publication date: 28-Aug-2023
    • (2023) The pipeline for the continuous development of artificial intelligence models—Current state of research and practice. Journal of Systems and Software 199:C. DOI: 10.1016/j.jss.2023.111615. Online publication date: 22-Mar-2023
    • (2022) EasyFL: A Low-Code Federated Learning Platform for Dummies. IEEE Internet of Things Journal 9:15 (13740-13754). DOI: 10.1109/JIOT.2022.3143842. Online publication date: 1-Aug-2022
    • (2022) RCM: Residue-aware Consolidation for Heterogeneous MLaaS Cluster. 2022 IEEE International Performance, Computing, and Communications Conference (IPCCC) (24-31). DOI: 10.1109/IPCCC55026.2022.9894301. Online publication date: 11-Nov-2022
    • (2022) Auto-ML Cyber Security Data Analysis Using Google, Azure and IBM Cloud Platforms. 2022 International Conference on Electrical, Computer and Energy Technologies (ICECET) (1-10). DOI: 10.1109/ICECET55527.2022.9872782. Online publication date: 20-Jul-2022
