Jan Leike

Cited by

	All	Since 2019
Citations	14156	13708
h-index	26	22
i10-index	31	26

Citations per year

Year	Citations
2015	46
2016	60
2017	87
2018	190
2019	295
2020	365
2021	500
2022	1129
2023	6930
2024	4442

Public access

10 articles available
0 articles not available

Based on funding mandates

Co-authors

Jeffrey Wu	OpenAI	Verified email at openai.com
Paul Christiano	National Institute of Standards and Technology	Verified email at nist.gov
John Schulman	Research Scientist, OpenAI	Verified email at openai.com
Ryan Lowe	OpenAI	Verified email at openai.com
Marcus Hutter	Researcher@DeepMind & Professor at ANU	Verified email at anu.edu.au
Dario Amodei	CEO and Co-Founder at Anthropic	Verified email at anthropic.com
David Scott Krueger	University Assistant Professor, University of Cambridge	Verified email at cam.ac.uk
Matthias Heizmann	University of Freiburg, Germany	Verified email at heizmann.name
Tom Everitt	Staff Research Scientist at Google DeepMind	Verified email at google.com
Ilya Sutskever	Co-Founder and Chief Scientist of OpenAI	Verified email at openai.com
Pushmeet Kohli	DeepMind	Verified email at google.com
Andreas Podelski	Professor of Computer Science, Freiburg University	Verified email at informatik.uni-freiburg.de
Geoffrey Irving	UK AI Safety Institute (AISI)	Verified email at naml.us
Tegan Maharaj	Assistant Professor at University of Toronto	Verified email at polymtl.ca
William Saunders	OpenAI	Verified email at cs.toronto.edu
Adam Gleave	CEO at FAR AI	Verified email at far.ai
Collin Burns	Researcher, OpenAI	Verified email at openai.com
Andrew Trask	University of Oxford and OpenMined	Verified email at openmined.org

Jan Leike

OpenAI

Verified email at openai.com

Research interests: reinforcement learning, deep learning, agent alignment


Title	Authors	Venue	Cited by	Year
Training language models to follow instructions with human feedback	L Ouyang, J Wu, X Jiang, D Almeida, C Wainwright, P Mishkin, C Zhang, ...	Advances in Neural Information Processing Systems 35, 27730-27744, 2022	6140	2022
Deep reinforcement learning from human preferences	PF Christiano, J Leike, T Brown, M Martic, S Legg, D Amodei	Advances in Neural Information Processing Systems 30, 4299-4307, 2017	2093	2017
Evaluating large language models trained on code	M Chen, J Tworek, H Jun, Q Yuan, HPO Pinto, J Kaplan, H Edwards, ...	arXiv preprint arXiv:2107.03374, 2021	1982	2021
GPT-4 technical report	OpenAI	arXiv, 2023	1391*	2023
Reward learning from human preferences and demonstrations in Atari	B Ibarz, J Leike, T Pohlen, G Irving, S Legg, D Amodei	Advances in Neural Information Processing Systems, 8011-8023, 2018	331	2018
AI Safety Gridworlds	J Leike, M Martic, V Krakovna, PA Ortega, T Everitt, A Lefrancq, L Orseau, ...	arXiv preprint arXiv:1711.09883, 2017	322	2017
Scalable agent alignment via reward modeling: a research direction	J Leike, D Krueger, T Everitt, M Martic, V Maini, S Legg	arXiv preprint arXiv:1811.07871, 2018	243	2018
Learning to Understand Goal Specifications by Modelling Reward	D Bahdanau, F Hill, J Leike, E Hughes, P Kohli, E Grefenstette	arXiv preprint arXiv:1806.01946, 2018	192*	2018
Let's Verify Step by Step	H Lightman, V Kosaraju, Y Burda, H Edwards, B Baker, T Lee, J Leike, ...	arXiv preprint arXiv:2305.20050, 2023	191	2023
Recursively summarizing books with human feedback	J Wu, L Ouyang, DM Ziegler, N Stiennon, R Lowe, J Leike, P Christiano	arXiv preprint arXiv:2109.10862, 2021	186	2021
Self-critiquing models for assisting human evaluators	W Saunders, C Yeh, J Wu, S Bills, L Ouyang, J Ward, J Leike	arXiv preprint arXiv:2206.05802, 2022	118	2022
Language models can explain neurons in language models	S Bills, N Cammarata, D Mossing, H Tillman, L Gao, G Goh, I Sutskever, ...	URL https://openaipublic.blob.core.windows.net/neuron-explainer/paper …, 2023	104	2023
Ranking Templates for Linear Loops	J Leike, M Heizmann	Logical Methods in Computer Science, 2015	95	2015
Learning human objectives by evaluating hypothetical behavior	S Reddy, A Dragan, S Levine, S Legg, J Leike	International Conference on Machine Learning, 8020-8029, 2020	78	2020
Linear ranking for linear lasso programs	M Heizmann, J Hoenicke, J Leike, A Podelski	Automated Technology for Verification and Analysis, 365-380, 2013	60	2013
Institutionalizing ethics in AI through broader impact requirements	CEA Prunkl, C Ashurst, M Anderljung, H Webb, J Leike, A Dafoe	Nature Machine Intelligence 3 (2), 104-110, 2021	54	2021
Geometric nontermination arguments	J Leike, M Heizmann	International Conference on Tools and Algorithms for the Construction and …, 2018	54*	2018
Hidden Incentives for Auto-Induced Distributional Shift	D Krueger, T Maharaj, J Leike	arXiv preprint arXiv:2009.09153, 2020	52*	2020
Quantifying Differences in Reward Functions	A Gleave, M Dennis, S Legg, S Russell, J Leike	arXiv preprint arXiv:2006.13900, 2020	52	2020
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision	C Burns, P Izmailov, JH Kirchner, B Baker, L Gao, L Aschenbrenner, ...	arXiv preprint arXiv:2312.09390, 2023	46	2023

Articles 1–20
