Miljan Martic
DeepMind
Verified email at google.com
Title
Cited by
Year
Deep reinforcement learning from human preferences
PF Christiano, J Leike, T Brown, M Martic, S Legg, D Amodei
Advances in Neural Information Processing Systems, 4299-4307, 2017
Cited by: 346; Year: 2017
AI safety gridworlds
J Leike, M Martic, V Krakovna, PA Ortega, T Everitt, A Lefrancq, L Orseau, ...
arXiv preprint arXiv:1711.09883, 2017
Cited by: 148; Year: 2017
Scalable agent alignment via reward modeling: a research direction
J Leike, D Krueger, T Everitt, M Martic, V Maini, S Legg
arXiv preprint arXiv:1811.07871, 2018
Cited by: 43; Year: 2018
Penalizing side effects using stepwise relative reachability
V Krakovna, L Orseau, R Kumar, M Martic, S Legg
arXiv preprint arXiv:1806.01186, 2018
Cited by: 13; Year: 2018
Measuring and avoiding side effects using relative reachability
V Krakovna, L Orseau, M Martic, S Legg
arXiv preprint arXiv:1806.01186, 2018
Cited by: 13; Year: 2018
Deep reinforcement learning from human preferences, 2017
P Christiano, J Leike, TB Brown, M Martic, S Legg, D Amodei
URL https://arxiv.org/abs/1706.03741
Cited by: 7
Scaling shared model governance via model splitting
M Martic, J Leike, A Trask, M Hessel, S Legg, P Kohli
arXiv preprint arXiv:1812.05979, 2018
Cited by: 2; Year: 2018
Avoiding Side Effects By Considering Future Tasks
V Krakovna, L Orseau, R Ngo, M Martic, S Legg
arXiv preprint arXiv:2010.07877, 2020
Year: 2020