MinAtar: An Atari-inspired testbed for thorough and reproducible reinforcement learning experiments K Young, T Tian arXiv preprint arXiv:1903.03176, 2019 | 125 | 2019 |
NeuroHex: A deep Q-learning Hex agent K Young, G Vasan, R Hayward Workshop on Computer Games, 3-18, 2016 | 32 | 2016 |
MinAtar: An Atari-inspired testbed for more efficient reinforcement learning experiments K Young, T Tian arXiv preprint arXiv:1903.03176, 2019 | 29 | 2019 |
The benefits of model-based generalization in reinforcement learning K Young, A Ramesh, L Kirsch, J Schmidhuber arXiv preprint arXiv:2211.02222, 2022 | 23 | 2022 |
Comparing Direct and Indirect Temporal-Difference Methods for Estimating the Variance of the Return. C Sherstan, DR Ashley, B Bennett, K Young, A White, M White, RS Sutton UAI, 63-72, 2018 | 21 | 2018 |
Metatrace: Online step-size tuning by meta-gradient descent for reinforcement learning control K Young, B Wang, ME Taylor arXiv preprint arXiv:1805.04514, 2018 | 18 | 2018 |
Metatrace actor-critic: Online step-size tuning by meta-gradient descent for reinforcement learning control K Young, B Wang, ME Taylor arXiv preprint arXiv:1805.04514, 2018 | 17 | 2018 |
Directly estimating the variance of the λ-return using temporal-difference methods C Sherstan, B Bennett, K Young, DR Ashley, A White, M White, RS Sutton arXiv preprint arXiv:1801.08287, 2018 | 16 | 2018 |
Integrating episodic memory into a reinforcement learning agent using reservoir sampling KJ Young, RS Sutton, S Yang arXiv preprint arXiv:1806.00540, 2018 | 9 | 2018 |
Understanding the pathologies of approximate policy evaluation when combined with greedification in reinforcement learning K Young, RS Sutton arXiv preprint arXiv:2010.15268, 2020 | 8 | 2020 |
MinAtar: An Atari-inspired testbed for more efficient reinforcement learning experiments (2019) K Young, T Tian arXiv preprint arXiv:1903.03176, 2019 | 7 | 2019 |
Variance Reduced Advantage Estimation with Hindsight Credit Assignment K Young arXiv preprint arXiv:1911.08362, 2019 | 5 | 2019 |
Hindsight network credit assignment: Efficient credit assignment in networks of discrete stochastic units K Young Proceedings of the AAAI Conference on Artificial Intelligence 36 (8), 8919-8926, 2022 | 4 | 2022 |
A reverse Hex solver K Young, RB Hayward International Conference on Computers and Games, 137-148, 2016 | 4 | 2016 |
Doubly-asynchronous value iteration: Making value iteration asynchronous in actions T Tian, K Young, RS Sutton Advances in Neural Information Processing Systems 35, 5575-5585, 2022 | 2 | 2022 |
Sequence compression speeds up credit assignment in reinforcement learning AA Ramesh, K Young, L Kirsch, J Schmidhuber arXiv preprint arXiv:2405.03878, 2024 | 1 | 2024 |
Iterative Option Discovery for Planning, by Planning K Young, RS Sutton arXiv preprint arXiv:2310.01569, 2023 | 1 | 2023 |
Hindsight Network Credit Assignment K Young arXiv preprint arXiv:2011.12351, 2020 | | 2020 |
MoHex wins 2016 Hex 11x11 and 13x13 tournaments R Hayward, N Weninger, K Young, K Takada, T Zhang | | |
Learning What to Remember with Online Policy Gradient Over a Reservoir K Young, RS Sutton | | |