Trust region policy optimization J Schulman, S Levine, P Abbeel, M Jordan, P Moritz International conference on machine learning, 1889-1897, 2015 | 3112 | 2015 |
Model-agnostic meta-learning for fast adaptation of deep networks C Finn, P Abbeel, S Levine arXiv preprint arXiv:1703.03400, 2017 | 2846 | 2017 |
InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets X Chen, Y Duan, R Houthooft, J Schulman, I Sutskever, P Abbeel Advances in neural information processing systems 29, 2172-2180, 2016 | 2510 | 2016 |
Apprenticeship learning via inverse reinforcement learning P Abbeel, AY Ng Proceedings of the twenty-first international conference on Machine learning, 1, 2004 | 2475 | 2004 |
End-to-end training of deep visuomotor policies S Levine, C Finn, T Darrell, P Abbeel The Journal of Machine Learning Research 17 (1), 1334-1373, 2016 | 2165 | 2016 |
Introduction to statistical relational learning D Koller, N Friedman, S Džeroski, C Sutton, A McCallum, A Pfeffer, ... MIT press, 2007 | 1714 | 2007 |
Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor T Haarnoja, A Zhou, P Abbeel, S Levine arXiv preprint arXiv:1801.01290, 2018 | 1210 | 2018 |
High-dimensional continuous control using generalized advantage estimation J Schulman, P Moritz, S Levine, M Jordan, P Abbeel arXiv preprint arXiv:1506.02438, 2015 | 1157 | 2015 |
Benchmarking deep reinforcement learning for continuous control Y Duan, X Chen, R Houthooft, J Schulman, P Abbeel International Conference on Machine Learning, 1329-1338, 2016 | 1053 | 2016 |
Multi-agent actor-critic for mixed cooperative-competitive environments R Lowe, Y Wu, A Tamar, J Harb, P Abbeel, I Mordatch Advances in neural information processing systems, 6379-6390, 2017 | 1049 | 2017 |
Domain randomization for transferring deep neural networks from simulation to the real world J Tobin, R Fong, A Ray, J Schneider, W Zaremba, P Abbeel 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems …, 2017 | 966 | 2017 |
Discriminative probabilistic models for relational data B Taskar, P Abbeel, D Koller arXiv preprint arXiv:1301.0604, 2012 | 883 | 2012 |
Hindsight experience replay M Andrychowicz, F Wolski, A Ray, J Schneider, R Fong, P Welinder, ... Advances in neural information processing systems, 5048-5058, 2017 | 835 | 2017 |
Guided policy search S Levine, V Koltun International Conference on Machine Learning, 1-9, 2013 | 790 | 2013 |
An application of reinforcement learning to aerobatic helicopter flight P Abbeel, A Coates, M Quigley, AY Ng Advances in neural information processing systems, 1-8, 2007 | 688 | 2007 |
A survey of research on cloud robotics and automation B Kehoe, S Patil, P Abbeel, K Goldberg IEEE Transactions on automation science and engineering 12 (2), 398-409, 2015 | 672 | 2015 |
Link prediction in relational data B Taskar, MF Wong, P Abbeel, D Koller Advances in neural information processing systems, 659-666, 2004 | 573 | 2004 |
Autonomous helicopter aerobatics through apprenticeship learning P Abbeel, A Coates, AY Ng The International Journal of Robotics Research 29 (13), 1608-1639, 2010 | 544 | 2010 |
A simple neural attentive meta-learner N Mishra, M Rohaninejad, X Chen, P Abbeel arXiv preprint arXiv:1707.03141, 2017 | 516 | 2017 |
Guided cost learning: Deep inverse optimal control via policy optimization C Finn, S Levine, P Abbeel International conference on machine learning, 49-58, 2016 | 501 | 2016 |