Philip Thomas

Cited by

	All	Since 2019
Citations	4503	3467
h-index	32	28
i10-index	55	48

Citations per year

Year	2011	2012	2013	2014	2015	2016	2017	2018	2019	2020	2021	2022	2023	2024
Citations	16	27	28	41	68	137	176	258	413	569	680	725	812	265

Public access

26 articles available
0 articles not available

Based on funding mandates

Co-authors

Emma Brunskill, Associate Professor of Computer Science, Stanford University (verified email at cs.stanford.edu)
Georgios Theocharous, Adobe Research (verified email at adobe.com)
Bruno Castro da Silva, University of Massachusetts (verified email at cs.umass.edu)
Scott M. Jordan, Postdoctoral Fellow, University of Alberta (verified email at ualberta.ca)
Scott Niekum, Associate Professor, University of Massachusetts Amherst (verified email at cs.umass.edu)
Stephen Giguere, University of Massachusetts (verified email at cs.umass.edu)
Antonie J. (Ton) van den Bogert, Professor of Mechanical Engineering, Cleveland State University (verified email at csuohio.edu)
Yuriy Brun, Manning College of Information and Computer Sciences, University of Massachusetts Amherst (verified email at cs.umass.edu)
Chris Nota, University of Massachusetts, Amherst (verified email at cs.umass.edu)
George Konidaris, Brown (verified email at cs.brown.edu)
Michael Branicky, Professor of Electrical Engineering & Computer Science, University of Kansas (verified email at ku.edu)
Erik Learned-Miller, Professor of Computer Science, University of Massachusetts Amherst (verified email at cs.umass.edu)
Sridhar Mahadevan, Director, Data Science Lab, Adobe Research & Professor, University of Massachusetts, Amherst (verified email at cs.umass.edu)
Blossom Metevier, University of Massachusetts Amherst (verified email at umass.edu)
Sarah Osentoski, Vinci4d (verified email at vinci4d.ai)
Will Dabney, DeepMind (verified email at google.com)
Robert Kirsch, Professor and Chair of Biomedical Engineering, Case Western Reserve University (verified email at case.edu)
Francisco M. Garcia, University of Massachusetts - Amherst (verified email at cs.umass.edu)
Arthur Guez, Google DeepMind (verified email at google.com)
Rémi Munos, DeepMind (verified email at inria.fr)


Philip Thomas

University of Massachusetts Amherst

Verified email at cs.umass.edu

Artificial Intelligence, Reinforcement Learning, AI Safety


Title. Authors. Venue	Cited by	Year
Data-efficient off-policy policy evaluation for reinforcement learning. P Thomas, E Brunskill. International Conference on Machine Learning, 2139-2148, 2016	733	2016
Value function approximation in reinforcement learning using the Fourier basis. G Konidaris, S Osentoski, P Thomas. Proceedings of the AAAI Conference on Artificial Intelligence 25 (1), 380-385, 2011	564	2011
High-confidence off-policy evaluation. P Thomas, G Theocharous, M Ghavamzadeh. Proceedings of the AAAI Conference on Artificial Intelligence 29 (1), 2015	303	2015
High confidence policy improvement. P Thomas, G Theocharous, M Ghavamzadeh. International Conference on Machine Learning, 2380-2388, 2015	213	2015
Ad recommendation systems for life-time value optimization. G Theocharous, PS Thomas, M Ghavamzadeh. Proceedings of the 24th International Conference on World Wide Web, 1305-1310, 2015	190	2015
Preventing undesirable behavior of intelligent machines. P Thomas, B Castro da Silva, A Barto, S Giguere, Y Brun, E Brunskill. Science 366 (6468), 999-1004, 2019	188	2019
Learning action representations for reinforcement learning. Y Chandak, G Theocharous, J Kostas, S Jordan, P Thomas. International Conference on Machine Learning, 941-950, 2019	180	2019
Increasing the action gap: New operators for reinforcement learning. MG Bellemare, G Ostrovski, A Guez, P Thomas, R Munos. Proceedings of the AAAI Conference on Artificial Intelligence 30 (1), 2016	168	2016
Bias in natural actor-critic algorithms. P Thomas. International Conference on Machine Learning, 441-448, 2014	157	2014
Safe reinforcement learning. PS Thomas	115	2015
Is the policy gradient a gradient? C Nota, PS Thomas. arXiv preprint arXiv:1906.07073, 2019	67	2019
Training an actor-critic reinforcement learning controller for arm movement using human-generated rewards. KM Jagodnik, PS Thomas, AJ van den Bogert, MS Branicky, RF Kirsch. IEEE Transactions on Neural Systems and Rehabilitation Engineering 25 (10 …, 2017	66	2017
Proximal reinforcement learning: A new theory of sequential decision making in primal-dual spaces. S Mahadevan, B Liu, P Thomas, W Dabney, S Giguere, N Jacek, I Gemp, ... arXiv preprint arXiv:1405.6757, 2014	66	2014
Predictive off-policy policy evaluation for nonstationary decision problems, with applications to digital marketing. P Thomas, G Theocharous, M Ghavamzadeh, I Durugkar, E Brunskill. Proceedings of the AAAI Conference on Artificial Intelligence 31 (2), 4740-4745, 2017	63	2017
Optimizing for the future in non-stationary MDPs. Y Chandak, G Theocharous, S Shankar, M White, S Mahadevan, ... International Conference on Machine Learning, 1414-1425, 2020	62	2020
Policy gradient methods for reinforcement learning with function approximation and action-dependent baselines. PS Thomas, E Brunskill. arXiv preprint arXiv:1706.06643, 2017	61	2017
Evaluating the performance of reinforcement learning algorithms. S Jordan, Y Chandak, D Cohen, M Zhang, P Thomas. International Conference on Machine Learning, 4962-4973, 2020	60	2020
Risk Quantification for Policy Deployment. PS Thomas, G Theocharous, M Ghavamzadeh. US Patent App. 14/552,047, 2016	53	2016
Importance Sampling for Fair Policy Selection. S Doroudi, PS Thomas, E Brunskill. Grantee Submission, 2017	51	2017
Offline contextual bandits with high probability fairness guarantees. B Metevier, S Giguere, S Brockman, A Kobren, Y Brun, E Brunskill, ... Advances in Neural Information Processing Systems 32, 2019	50	2019


Articles 1–20
