Computer Science > Machine Learning

arXiv:2312.09148 (cs)

[Submitted on 14 Dec 2023 (v1), last revised 27 May 2024 (this version, v2)]

Title:Split-Ensemble: Efficient OOD-aware Ensemble via Task and Model Splitting

Authors:Anthony Chen, Huanrui Yang, Yulu Gan, Denis A Gudovskiy, Zhen Dong, Haofan Wang, Tomoyuki Okuno, Yohei Nakata, Kurt Keutzer, Shanghang Zhang

View PDF HTML (experimental)

Abstract:Uncertainty estimation is crucial for machine learning models to detect out-of-distribution (OOD) inputs. However, the conventional discriminative deep learning classifiers produce uncalibrated closed-set predictions for OOD data. A more robust classifiers with the uncertainty estimation typically require a potentially unavailable OOD dataset for outlier exposure training, or a considerable amount of additional memory and compute to build ensemble models. In this work, we improve on uncertainty estimation without extra OOD data or additional inference costs using an alternative Split-Ensemble method. Specifically, we propose a novel subtask-splitting ensemble training objective, where a common multiclass classification task is split into several complementary subtasks. Then, each subtask's training data can be considered as OOD to the other subtasks. Diverse submodels can therefore be trained on each subtask with OOD-aware objectives. The subtask-splitting objective enables us to share low-level features across submodels to avoid parameter and computational overheads. In particular, we build a tree-like Split-Ensemble architecture by performing iterative splitting and pruning from a shared backbone model, where each branch serves as a submodel corresponding to a subtask. This leads to improved accuracy and uncertainty estimation across submodels under a fixed ensemble computation budget. Empirical study with ResNet-18 backbone shows Split-Ensemble, without additional computation cost, improves accuracy over a single model by 0.8%, 1.8%, and 25.5% on CIFAR-10, CIFAR-100, and Tiny-ImageNet, respectively. OOD detection for the same backbone and in-distribution datasets surpasses a single model baseline by, correspondingly, 2.2%, 8.1%, and 29.6% mean AUROC.

Comments:	ICML2024. Project website is available at this https URL
Subjects:	Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2312.09148 [cs.LG]
	(or arXiv:2312.09148v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2312.09148

Submission history

From: Anthony Chen [view email]
[v1] Thu, 14 Dec 2023 17:18:44 UTC (1,134 KB)
[v2] Mon, 27 May 2024 05:59:06 UTC (1,384 KB)

Computer Science > Machine Learning

Title:Split-Ensemble: Efficient OOD-aware Ensemble via Task and Model Splitting

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Machine Learning

Title:Split-Ensemble: Efficient OOD-aware Ensemble via Task and Model Splitting

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators