User profiles for Konrad Zolna
Konrad Żołna, Research Scientist, DeepMind. Verified email at google.com. Cited by 2501.
A generalist agent
Inspired by progress in large-scale language modeling, we apply a similar approach towards
building a single generalist agent beyond the realm of text outputs. The agent, which we …
Critic regularized regression
Offline reinforcement learning (RL), also known as batch RL, offers the prospect of policy
optimization from large pre-recorded datasets without online environment interaction. It …
Hyperparameter selection for offline reinforcement learning
Offline reinforcement learning (RL purely from logged data) is an important avenue for
deploying RL techniques in real-world scenarios. However, existing hyperparameter selection …
RL Unplugged: A suite of benchmarks for offline reinforcement learning
Offline methods for reinforcement learning have a potential to help bridge the gap between
reinforcement learning research and real-world applications. They make it possible to learn …
RoboCat: A self-improving foundation agent for robotic manipulation
The ability to leverage heterogeneous robotic experience from different robots and tasks to
quickly master novel skills and embodiments has the potential to transform robot learning. …
Offline learning from demonstrations and unlabeled experience
Behavior cloning (BC) is often practical for robot learning because it allows a policy to be
trained offline without rewards, by supervised learning on expert demonstrations. However, BC …
Scaling data-driven robotics with reward sketching and batch reinforcement learning
We present a framework for data-driven robotics that makes use of a large dataset of recorded
robot experience and scales to several tasks using learned reward functions. We show …
Genie: Generative interactive environments
We introduce Genie, the first *generative interactive environment* trained in an unsupervised
manner from unlabelled Internet videos. The model can be prompted to generate an …
Task-relevant adversarial imitation learning
We show that a critical vulnerability in adversarial imitation is the tendency of discriminator
networks to learn spurious associations between visual features and expert labels. When the …
Fraternal dropout
Recurrent neural networks (RNNs) are an important class of architectures among neural networks
useful for language modeling and sequential prediction. However, optimizing RNNs is …