Causal Feature Selection Method for Contextual Multi-Armed Bandits in Recommender System

Z Zhao, Y Jiang - arXiv preprint arXiv:2409.13888, 2024 - arxiv.org

Features (a.k.a. context) are critical for the performance of contextual multi-armed bandits (MAB). In large-scale online systems, it is important to select and implement the right features for the model: missing important features can lead to sub-optimal reward outcomes, and including irrelevant features can cause overfitting, poor model interpretability, and implementation cost. However, feature selection methods for conventional machine learning models fall short for contextual MAB use cases, as conventional methods select features correlated with the outcome variable, but not necessarily those causing heterogeneous treatment effects among arms, which are what truly matter for contextual MAB. In this paper, we introduce model-free feature selection methods designed for the contextual MAB problem, based on the heterogeneous causal effect contributed by the feature to the reward distribution. Empirical evaluation is conducted on synthetic data as well as real data from an online experiment for optimizing content cover images in a recommender system. The results show that this feature selection method effectively selects the important features, which lead to higher contextual MAB reward than unimportant features. Compared with model-embedded methods, this model-free method has the advantages of fast computation, ease of implementation, and freedom from model mis-specification issues.
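
The distinction the abstract draws, between a feature that is merely correlated with reward and one that changes which arm is best, can be illustrated with a small synthetic sketch. This is not the paper's method: the score `heterogeneity_gain`, the simulator `simulate`, and the feature names `x_shift` and `x_hte` are all hypothetical, and the sketch assumes binary features, two arms, and logged data in which arms were assigned uniformly at random.

```python
import random
from collections import defaultdict


def simulate(n=20000, seed=0):
    """Synthetic log with uniformly randomized arms (an assumption of this sketch)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x_shift = rng.randint(0, 1)  # raises reward equally for every arm
        x_hte = rng.randint(0, 1)    # decides which arm is better
        arm = rng.randint(0, 1)      # arm assigned at random
        p = 0.2 + 0.3 * x_shift + (0.2 if arm == x_hte else 0.0)
        reward = 1.0 if rng.random() < p else 0.0
        rows.append({"x_shift": x_shift, "x_hte": x_hte, "arm": arm, "reward": reward})
    return rows


def marginal_effect(rows, feature):
    """Difference in mean reward between feature == 1 and feature == 0."""
    mean = lambda xs: sum(xs) / len(xs)
    hi = [r["reward"] for r in rows if r[feature] == 1]
    lo = [r["reward"] for r in rows if r[feature] == 0]
    return mean(hi) - mean(lo)


def heterogeneity_gain(rows, feature):
    """Hypothetical score: mean reward from picking the best arm per feature
    value, minus mean reward from picking one best arm for everyone."""
    sums = defaultdict(lambda: [0.0, 0])  # (feature value, arm) -> [reward sum, count]
    for r in rows:
        cell = sums[(r[feature], r["arm"])]
        cell[0] += r["reward"]
        cell[1] += 1
    means = {key: total / count for key, (total, count) in sums.items()}
    values = sorted({v for v, _ in means})
    arms = sorted({a for _, a in means})
    # Share of traffic at each feature value.
    weight = {v: sum(1 for r in rows if r[feature] == v) / len(rows) for v in values}
    one_arm_for_all = max(sum(weight[v] * means[(v, a)] for v in values) for a in arms)
    best_arm_per_value = sum(weight[v] * max(means[(v, a)] for a in arms) for v in values)
    return best_arm_per_value - one_arm_for_all


rows = simulate()
for name in ("x_shift", "x_hte"):
    print(name, round(marginal_effect(rows, name), 3), round(heterogeneity_gain(rows, name), 3))
```

In this simulation `x_shift` moves the reward strongly but identically for both arms, so a correlation-based selector would favor it even though knowing it does not change the arm choice. `x_hte` has almost no marginal association with reward, yet it determines the better arm, which is the kind of feature a contextual MAB can exploit.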
