Daniel Russo
Verified email at gsb.columbia.edu
Title
Cited by
Year
Learning to optimize via posterior sampling
D Russo, B Van Roy
Mathematics of Operations Research 39 (4), 1221-1243, 2014
335 2014
A tutorial on Thompson sampling
D Russo, B Van Roy, A Kazerouni, I Osband, Z Wen
Foundations and Trends in Machine Learning 11 (1), 1–96, 2018
297 2018
An information-theoretic analysis of Thompson sampling
D Russo, B Van Roy
The Journal of Machine Learning Research 17 (1), 2442-2471, 2016
192 2016
Learning to optimize via information-directed sampling
D Russo, B Van Roy
Operations Research 66 (1), 230-252, 2018
124* 2018
How much does your data exploration overfit? Controlling bias via information usage
D Russo, J Zou
IEEE Transactions on Information Theory, 2019
111* 2019
Deep Exploration via Randomized Value Functions
I Osband, B Van Roy, DJ Russo, Z Wen
Journal of Machine Learning Research 20 (124), 1-62, 2019
107 2019
A finite time analysis of temporal difference learning with linear function approximation
J Bhandari, D Russo, R Singal
arXiv preprint arXiv:1806.02450, 2018
94 2018
Simple Bayesian algorithms for best arm identification
D Russo
Conference on Learning Theory, 1417-1418, 2016
92 2016
Eluder dimension and the sample complexity of optimistic exploration
D Russo, B Van Roy
Advances in Neural Information Processing Systems, 2256-2264, 2013
51 2013
Global optimality guarantees for policy gradient methods
J Bhandari, D Russo
arXiv preprint arXiv:1906.01786, 2019
41 2019
Improving the expected improvement algorithm
C Qin, D Klabjan, D Russo
Advances in Neural Information Processing Systems, 5381-5391, 2017
35 2017
(More) efficient reinforcement learning via posterior sampling
I Osband, D Russo, B Van Roy
Advances in Neural Information Processing Systems, 3003-3011, 2013
32 2013
Worst-case regret bounds for exploration via randomized value functions
D Russo
Advances in Neural Information Processing Systems, 14433-14443, 2019
19 2019
Satisficing in time-sensitive bandit learning
D Russo, B Van Roy
arXiv preprint arXiv:1803.02855, 2018
19* 2018
A note on the linear convergence of policy gradient methods
J Bhandari, D Russo
arXiv preprint arXiv:2007.11120, 2020
6 2020
A note on the equivalence of upper confidence bounds and Gittins indices for patient agents
D Russo
arXiv preprint arXiv:1904.04732, 2019
3 2019
Approximation benefits of policy gradient methods with aggregated states
D Russo
arXiv preprint arXiv:2007.11684, 2020
2 2020
Policy gradient optimization of Thompson sampling policies
S Min, CC Moallemi, DJ Russo
arXiv preprint arXiv:2006.16507, 2020
1 2020
On the Futility of Dynamics in Robust Mechanism Design
S Balseiro, A Kim, DJ Russo
Available at SSRN, 2019
1 2019
Global Optimality Guarantees for Policy Gradient Methods
D Russo
2020