Mohammad Gheshlaghi Azar
Research Scientist at DeepMind
Verified email at google.com
Title
Cited by
Year
Rainbow: Combining improvements in deep reinforcement learning
M Hessel, J Modayil, H Van Hasselt, T Schaul, G Ostrovski, W Dabney, ...
Thirty-Second AAAI Conference on Artificial Intelligence, 2018
Cited by: 625; Year: 2018
Noisy networks for exploration
M Fortunato, MG Azar, B Piot, J Menick, I Osband, A Graves, V Mnih, ...
arXiv preprint arXiv:1706.10295, 2017
Cited by: 312; Year: 2017
Minimax regret bounds for reinforcement learning
MG Azar, I Osband, R Munos
arXiv preprint arXiv:1703.05449, 2017
Cited by: 168; Year: 2017
Speedy Q-Learning
MG Azar, M Ghavamzadeh, HJ Kappen, R Munos
Advances in Neural Information Processing Systems, 2411-2419, 2011
Cited by: 101*; Year: 2011
Dynamic Policy Programming
M Gheshlaghi Azar, V Gomez, HJ Kappen
Journal of Machine Learning Research 13, 3207-3245, 2012
Cited by: 87; Year: 2012
The reactor: A fast and sample-efficient actor-critic agent for reinforcement learning
A Gruslys, W Dabney, MG Azar, B Piot, M Bellemare, R Munos
arXiv preprint arXiv:1704.04651, 2017
Cited by: 78*; Year: 2017
Minimax PAC bounds on the sample complexity of reinforcement learning with a generative model
MG Azar, R Munos, HJ Kappen
Machine learning 91 (3), 325-349, 2013
Cited by: 64; Year: 2013
Sequential transfer in multi-armed bandit with finite set of models
MG Azar, A Lazaric, E Brunskill
Advances in Neural Information Processing Systems, 2220-2228, 2013
Cited by: 51; Year: 2013
Stochastic optimization of a locally smooth function under correlated bandit feedback
MG Azar, A Lazaric, E Brunskill
31st International Conference on Machine Learning (ICML), 2014
Cited by: 47*; Year: 2014
On the sample complexity of reinforcement learning with a generative model
MG Azar, R Munos, B Kappen
arXiv preprint arXiv:1206.6461, 2012
Cited by: 45; Year: 2012
Observe and look further: Achieving consistent performance on atari
T Pohlen, B Piot, T Hester, MG Azar, D Horgan, D Budden, G Barth-Maron, ...
arXiv preprint arXiv:1805.11593, 2018
Cited by: 40; Year: 2018
Dynamic policy programming with function approximation
MG Azar, V Gómez, B Kappen
Proceedings of the Fourteenth International Conference on Artificial …, 2011
Cited by: 32; Year: 2011
Regret bounds for reinforcement learning with policy advice
MG Azar, A Lazaric, E Brunskill
Joint European Conference on Machine Learning and Knowledge Discovery in …, 2013
Cited by: 29; Year: 2013
Mel Vecerík, et al. Observe and look further: Achieving consistent performance on atari
T Pohlen, B Piot, T Hester, MG Azar, D Horgan, D Budden, G Barth-Maron, ...
arXiv preprint arXiv:1805.11593, 2018
Cited by: 18; Year: 2018
Reinforcement learning with a near optimal rate of convergence
MG Azar, R Munos, M Ghavamzadeh, H Kappen
Cited by: 17; Year: 2011
Neural predictive belief representations
ZD Guo, MG Azar, B Piot, BA Pires, R Munos
arXiv preprint arXiv:1811.06407, 2018
Cited by: 15; Year: 2018
Meta-learning of sequential strategies
PA Ortega, JX Wang, M Rowland, T Genewein, Z Kurth-Nelson, ...
arXiv preprint arXiv:1905.03030, 2019
Cited by: 13; Year: 2019
A cryptography-based approach for movement decoding
EL Dyer, MG Azar, MG Perich, HL Fernandes, S Naufel, LE Miller, ...
Nature Biomedical Engineering 1 (12), 967-976, 2017
Cited by: 12; Year: 2017
On the theory of reinforcement learning: methods, convergence analysis and sample complexity
MG Azar
[S.l.: s.n.], 2012
Cited by: 10; Year: 2012
World discovery models
MG Azar, B Piot, BA Pires, JB Grill, F Altché, R Munos
arXiv preprint arXiv:1902.07685, 2019
Cited by: 9; Year: 2019