Ian Osband

Cited by

	All	Since 2019
Citations	7641	6808
h-index	25	25
i10-index	29	28

Citations per year

Year	Citations
2015	27
2016	74
2017	222
2018	467
2019	751
2020	1153
2021	1361
2022	1470
2023	1543
2024	528

Co-authors

Benjamin Van Roy	Stanford University	Verified email at stanford.edu
Zheng Wen	Google DeepMind	Verified email at google.com
Vikranth Dwaracherla	DeepMind	Verified email at google.com
Xiuyuan Lu	Google DeepMind	Verified email at google.com
Daniel Russo	Columbia University	Verified email at gsb.columbia.edu
Morteza Ibrahimi	Stanford University	Verified email at stanford.edu
Brendan O'Donoghue	Stanford University, Google DeepMind	Verified email at alumni.stanford.edu
Mohammad Gheshlaghi Azar	Cohere AI	Verified email at google.com
Todd Hester	Waymo	Verified email at waymo.com
Bilal Piot	Google DeepMind	Verified email at google.com
Olivier Pietquin	Cohere | ex Google DeepMind (On leave - Professor at University of Lille)	Verified email at univ-lille.fr
Tom Schaul	Senior Staff Scientist, DeepMind	Verified email at nyu.edu
Rémi Munos	DeepMind	Verified email at inria.fr
Alexander Pritzel	DeepMind	Verified email at google.com
Marc Lanctot	Research Scientist, Google DeepMind	Verified email at google.com

OpenAI

Verified email at openai.com

Reinforcement Learning


Title	Authors	Venue	Cited by	Year
Deep exploration via bootstrapped DQN	I Osband, C Blundell, A Pritzel, B Van Roy	Advances in neural information processing systems 29, 2016	1399	2016
Deep q-learning from demonstrations	T Hester, M Vecerik, O Pietquin, M Lanctot, T Schaul, B Piot, D Horgan, ...	Proceedings of the AAAI conference on artificial intelligence 32 (1), 2018	1164	2018
A tutorial on thompson sampling	DJ Russo, B Van Roy, A Kazerouni, I Osband, Z Wen	Foundations and Trends® in Machine Learning 11 (1), 1-96, 2018	1050	2018
Minimax regret bounds for reinforcement learning	MG Azar, I Osband, R Munos	International conference on machine learning, 263-272, 2017	778	2017
Randomized prior functions for deep reinforcement learning	I Osband, J Aslanides, A Cassirer	Advances in Neural Information Processing Systems 31, 2018	395	2018
Deep Exploration via Randomized Value Functions	I Osband	https://searchworks.stanford.edu/view/11891201, 2016	320	2016
Generalization and exploration via randomized value functions	I Osband, B Van Roy, Z Wen	International Conference on Machine Learning, 2377-2386, 2016	319	2016
Why is posterior sampling better than optimism for reinforcement learning?	I Osband, B Van Roy	International conference on machine learning, 2701-2710, 2017	255	2017
The uncertainty bellman equation and exploration	B O’Donoghue, I Osband, R Munos, V Mnih	International conference on machine learning, 3836-3845, 2018	207	2018
Model-based reinforcement learning and the eluder dimension	I Osband, B Van Roy	Advances in Neural Information Processing Systems 27, 2014	182	2014
Learning from demonstrations for real world reinforcement learning	T Hester, M Vecerik, O Pietquin, M Lanctot, T Schaul, B Piot, A Sendonaris, ...	arXiv preprint arXiv:1704.03732, 2017	175	2017
Behaviour suite for reinforcement learning	I Osband, Y Doron, M Hessel, J Aslanides, E Sezener, A Saraiva, ...	arXiv preprint arXiv:1908.03568, 2019	174	2019
Risk versus Uncertainty in Deep Learning: Bayes, Bootstrap and the Dangers of Dropout	I Osband	http://bayesiandeeplearning.org/papers/BDL_4.pdf	163*	
Deep learning for time series modeling	E Busseti, I Osband, S Wong	Technical report, Stanford University, 1-5, 2012	136	2012
Near-optimal reinforcement learning in factored mdps	I Osband, B Van Roy	Advances in Neural Information Processing Systems 27, 2014	122	2014
On lower bounds for regret in reinforcement learning	I Osband, B Van Roy	arXiv preprint arXiv:1608.02732, 2016	107	2016
Bootstrapped thompson sampling and deep exploration	I Osband, B Van Roy	arXiv preprint arXiv:1507.00300, 2015	99	2015
(More) efficient reinforcement learning via posterior sampling	I Osband, D Russo, B Van Roy	Advances in Neural Information Processing Systems 26, 2013	94	2013
Meta-learning of sequential strategies	PA Ortega, JX Wang, M Rowland, T Genewein, Z Kurth-Nelson, ...	arXiv preprint arXiv:1905.03030, 2019	82	2019
Epistemic neural networks	I Osband, Z Wen, SM Asghari, V Dwaracherla, M Ibrahimi, X Lu, ...	Advances in Neural Information Processing Systems 36, 2024	79	2024
