Dec 14, 2021: We consider the task of building strong but human-like policies in multi-agent decision-making problems, given examples of human behavior.
We show that regularized search algorithms that penalize KL divergence from an imitation-learned policy yield higher prediction accuracy of strong humans.
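To make the idea of "penalizing KL divergence from an imitation-learned policy" concrete, here is a minimal single-decision sketch. It is not the paper's search algorithm. It assumes we already have an action-value estimate for each move and an anchor policy from imitation learning, and it uses the standard closed-form maximizer of expected value minus lam times KL(pi || anchor), which is pi(a) proportional to anchor(a) * exp(q(a) / lam). The names `kl_regularized_policy`, `q_values`, `anchor_policy`, and `lam` are hypothetical and chosen for illustration only.

```python
import math


def kl_regularized_policy(q_values, anchor_policy, lam):
    """Maximize sum_a pi(a) * q(a) - lam * KL(pi || anchor) over distributions pi.

    Assumes anchor_policy is a valid probability distribution with at least
    one positive entry, and lam > 0. Illustrative sketch only.
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    if len(q_values) != len(anchor_policy):
        raise ValueError("q_values and anchor_policy must have equal length")

    # Closed form: pi(a) is proportional to anchor(a) * exp(q(a) / lam).
    # Work in log space; actions the anchor never plays keep probability 0.
    logits = [
        math.log(p) + q / lam if p > 0 else -math.inf
        for p, q in zip(anchor_policy, q_values)
    ]
    top = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    q = [1.0, 0.0, 0.5]        # hypothetical value estimates for three moves
    human = [0.2, 0.7, 0.1]    # hypothetical imitation-learned move probabilities
    for lam in (1000.0, 1.0, 0.01):
        print(lam, kl_regularized_policy(q, human, lam))
```

A large `lam` keeps the result close to the imitation-learned policy, while a small `lam` concentrates probability on the highest-value move, which is the trade-off between human-likeness and strength that the snippet describes.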
In this paper, we study the problem of producing policies that are both strong and human-like in games with complex strategic planning like chess, Go, Hanabi, ...
Apr 25, 2022: We show in chess and Go that regularizing search based on the KL divergence from an imitation-learned policy results in higher human prediction accuracy and ...
Modeling Strong and Human-Like Gameplay with KL-Regularized Search. Athul ... In chess and Go, we show that regularized search algorithms that penalize KL ...
Modeling Strong and Human-Like Gameplay with KL-Regularized Search. International Conference on Machine Learning 2022. Poster details: HALL E #816. 6:30PM - 8:...
Modeling strong and human-like gameplay with KL-regularized search. AP Jacob, DJ Wu, G Farina, A Lerer, H Hu, A Bakhtin, J Andreas, N Brown. International ...