Peter Sunehag

Cited by

	All	Since 2019
Citations	3058	2639
h-index	16	13
i10-index	28	15

Citations per year

Year	Citations
2011	20
2012	26
2013	45
2014	37
2015	64
2016	45
2017	51
2018	97
2019	165
2020	325
2021	456
2022	624
2023	803
2024	265

Public access

12 articles

1 article

available

not available

Based on funding mandates

Co-authors

Marcus Hutter	Researcher@DeepMind & Professor at ANU	Verified email at anu.edu.au
Hado van Hasselt	Research Scientist, DeepMind; Honorary Professor, UCL	Verified email at google.com
Mayank Daswani	Google	Verified email at google.com
Tor Lattimore	DeepMind	Verified email at google.com
Alex Smola	Boson AI	Verified email at smola.org
Gideon Dror	Professor of Computer Science, Academic College of Tel Aviv	Verified email at mta.ac.il
Jochen Trumpf	Australian National University	Verified email at anu.edu.au
S V N Vishwanathan	Associate Professor of Statistics and Computer Science, Purdue University	Verified email at stat.purdue.edu
Scott Sanner	University of Toronto	Verified email at mie.utoronto.ca
Bhaskara Marthi		Verified email at csail.mit.edu
Joel Veness	Google DeepMind	Verified email at google.com

Peter Sunehag

Google - DeepMind

Verified email at google.com

Machine Learning, Reinforcement Learning, Deep Learning


Title	Authors	Venue	Cited by	Year
Value-decomposition networks for cooperative multi-agent learning	P Sunehag, G Lever, A Gruslys, WM Czarnecki, V Zambaldi, M Jaderberg, ...	arXiv preprint arXiv:1706.05296, 2017	1584	2017
Deep reinforcement learning in large discrete action spaces	G Dulac-Arnold, R Evans, H van Hasselt, P Sunehag, T Lillicrap, J Hunt, ...	arXiv preprint arXiv:1512.07679, 2015	669	2015
Scalable evaluation of multi-agent reinforcement learning with melting pot	JZ Leibo, EA Dueñez-Guzman, A Vezhnevets, JP Agapiou, P Sunehag, ...	International conference on machine learning, 6187-6199, 2021	72	2021
The sample-complexity of general reinforcement learning	T Lattimore, M Hutter, P Sunehag	International Conference on Machine Learning, 28-36, 2013	69	2013
Learning to incentivize other learning agents	J Yang, A Li, M Farajtabar, P Sunehag, E Hughes, H Zha	Advances in Neural Information Processing Systems 33, 15208-15219, 2020	58	2020
Deep reinforcement learning with attention for slate markov decision processes with high-dimensional states and actions	P Sunehag, R Evans, G Dulac-Arnold, Y Zwols, D Visentin, B Coppin	arXiv preprint arXiv:1512.01124, 2015	53	2015
Malthusian reinforcement learning	JZ Leibo, J Perolat, E Hughes, S Wheelwright, AH Marblestone, ...	arXiv preprint arXiv:1812.07019, 2018	46	2018
Wearable sensor activity analysis using semi-Markov models with a grammar	O Thomas, P Sunehag, G Dror, S Yun, S Kim, M Robards, A Smola, ...	Pervasive and Mobile Computing 6 (3), 342-350, 2010	46	2010
Variable metric stochastic approximation theory	P Sunehag, J Trumpf, SVN Vishwanathan, N Schraudolph	Artificial Intelligence and Statistics, 560-566, 2009	44	2009
Reinforcement learning agents acquire flocking and symbiotic behaviour in simulated ecosystems	P Sunehag, G Lever, S Liu, J Merel, N Heess, JZ Leibo, E Hughes, ...	Artificial life conference proceedings, 103-110, 2019	30	2019
Value-decomposition networks for cooperative multi-agent learning. arXiv 2017	P Sunehag, G Lever, A Gruslys, WM Czarnecki, V Zambaldi, M Jaderberg, ...	arXiv preprint arXiv:1706.05296, 2017	29	2017
Q-learning for history-based reinforcement learning	M Daswani, P Sunehag, M Hutter	Asian Conference on Machine Learning, 213-228, 2013	23	2013
Semi-markov kmeans clustering and activity recognition from body-worn sensors	MW Robards, P Sunehag	2009 Ninth IEEE International Conference on Data Mining, 438-446, 2009	19	2009
Rationality, optimism and guarantees in general reinforcement learning	P Sunehag, M Hutter	The Journal of Machine Learning Research 16 (1), 1345-1390, 2015	18	2015
Feature reinforcement learning: state of the art	M Daswani, P Sunehag, M Hutter	Workshops at the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014	16	2014
Adaptive context tree weighting	A O'Neill, M Hutter, W Shao, P Sunehag	2012 Data Compression Conference, 317-326, 2012	16	2012
Melting Pot 2.0	JP Agapiou, AS Vezhnevets, EA Duéñez-Guzmán, J Matyas, Y Mao, ...	arXiv preprint arXiv:2211.13746, 2022	15	2022
(Non-) equivalence of universal priors	I Wood, P Sunehag, M Hutter	Algorithmic Probability and Friends. Bayesian Prediction and Artificial …, 2013	15	2013
Optimistic agents are asymptotically optimal	P Sunehag, M Hutter	AI 2012: Advances in Artificial Intelligence: 25th Australasian Joint …, 2012	15	2012
Consistency of feature Markov processes	P Sunehag, M Hutter	Algorithmic Learning Theory, 360-374, 2010	15	2010
