Jalaj Bhandari
Columbia University, Meta AI Research
Verified email at columbia.edu - Homepage
Title
Cited by
Year
A finite time analysis of temporal difference learning with linear function approximation
J Bhandari, D Russo, R Singal
Conference on Learning Theory, 1691-1692, 2018
Cited by 386 (2018)
Global optimality guarantees for policy gradient methods
J Bhandari, D Russo
Operations Research, 2024
Cited by 252 (2024)
On the linear convergence of policy gradient methods for finite MDPs
J Bhandari, D Russo
International Conference on Artificial Intelligence and Statistics, 2386-2394, 2021
Cited by 71 (2021)
A note on the linear convergence of policy gradient methods
J Bhandari, D Russo
arXiv preprint arXiv:2007.11120, 79, 2020
Cited by 25 (2020)
On the tightness of an LP relaxation for rational optimization and its applications
V Avadhanula, J Bhandari, V Goyal, A Zeevi
Operations Research Letters 44 (5), 612-617, 2016
Cited by 14 (2016)
Elliptical Slice Sampling with Expectation Propagation
F Fagan, J Bhandari, JP Cunningham
UAI, 2016
Cited by 11 (2016)
Optimizing long-term value for auction-based recommender systems via on-policy reinforcement learning
R Xu, J Bhandari, D Korenkevych, F Liu, Y He, A Nikulkov, Z Zhu
Proceedings of the 17th ACM Conference on Recommender Systems, 955-962, 2023
Cited by 6 (2023)
Optimization foundations of reinforcement learning
J Bhandari
Columbia University, 2020
Cited by 6 (2020)
Pearl: A Production-ready Reinforcement Learning Agent
Z Zhu, RS Braz, J Bhandari, D Jiang, Y Wan, Y Efroni, L Wang, R Xu, ...
arXiv preprint arXiv:2312.03814, 2023
Cited by 2 (2023)
Multi-Objective Customer Journey Optimization
J Bhandari, W Dai, Jun He, T Xu, Z Yan, Lei Zhang
US Patent 20,210,217,047, 2021
2021
Annular Augmentation Sampling
F Fagan, J Bhandari, J Cunningham
Artificial Intelligence and Statistics, 139-147, 2017
2017
User Scheduling in Cognitive Radio Networks
J Bhandari, N Bolia
Journal of Computations & Modelling 3 (3), 177-193, 2013
2013