Vladimir Mikulik
DeepMind
Verified email at google.com
Title
Cited by
Year
Inferring the effectiveness of government interventions against COVID-19
JM Brauner, S Mindermann, M Sharma, D Johnston, J Salvatier, ...
Science 371 (6531), eabd9338, 2021
Cited by: 1004; Year: 2021
Scaling language models: Methods, analysis & insights from training Gopher
JW Rae, S Borgeaud, T Cai, K Millican, J Hoffmann, F Song, J Aslanides, ...
arXiv preprint arXiv:2112.11446, 2021
Cited by: 745; Year: 2021
Gemini: a family of highly capable multimodal models
G Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ...
arXiv preprint arXiv:2312.11805, 2023
Cited by: 419; Year: 2023
Teaching language models to support answers with verified quotes
J Menick, M Trebacz, V Mikulik, J Aslanides, F Song, M Chadwick, ...
arXiv preprint arXiv:2203.11147, 2022
Cited by: 137; Year: 2022
Alignment of language agents
Z Kenton, T Everitt, L Weidinger, I Gabriel, V Mikulik, G Irving
arXiv preprint arXiv:2103.14659, 2021
Cited by: 111; Year: 2021
Risks from learned optimization in advanced machine learning systems
E Hubinger, C van Merwijk, V Mikulik, J Skalse, S Garrabrant
arXiv preprint arXiv:1906.01820, 2019
Cited by: 97; Year: 2019
Specification gaming: the flip side of AI ingenuity
V Krakovna, J Uesato, V Mikulik, M Rahtz, T Everitt, R Kumar, Z Kenton, ...
DeepMind Blog 3, 2020
Cited by: 87; Year: 2020
Meta-trained agents implement Bayes-optimal agents
V Mikulik, G Delétang, T McGrath, T Genewein, M Martic, S Legg, ...
Advances in Neural Information Processing Systems 33, 2020
Cited by: 36; Year: 2020
The effectiveness and perceived burden of nonpharmaceutical interventions against COVID-19 transmission: a modelling study with 41 countries
JM Brauner, S Mindermann, M Sharma, AB Stephenson, T Gavenčiak, ...
medRxiv, 2020.05.28.20116129, 2020
Cited by: 33; Year: 2020
Tracr: Compiled transformers as a laboratory for interpretability
D Lindner, J Kramár, S Farquhar, M Rahtz, T McGrath, V Mikulik
Advances in Neural Information Processing Systems 36, 2024
Cited by: 29; Year: 2024
Neural networks are a priori biased towards Boolean functions with low entropy
C Mingard, J Skalse, G Valle-Pérez, D Martínez-Rubio, V Mikulik, ...
arXiv preprint arXiv:1909.11522, 2019
Cited by: 24; Year: 2019
Does circuit analysis interpretability scale? Evidence from multiple choice capabilities in Chinchilla
T Lieberum, M Rahtz, J Kramár, G Irving, R Shah, V Mikulik
arXiv preprint arXiv:2307.09458, 2023
Cited by: 23; Year: 2023
Scaling Language Models: Methods, Analysis & Insights from Training Gopher
JW Rae, S Borgeaud, T Cai, K Millican, J Hoffmann, HF Song, J Aslanides, ...
arXiv, 2021
Cited by: 18; Year: 2021
The hydra effect: Emergent self-repair in language model computations
T McGrath, M Rahtz, J Kramár, V Mikulik, S Legg
arXiv preprint arXiv:2307.15771, 2023
Cited by: 15; Year: 2023
Causal analysis of agent behavior for AI safety
G Delétang, J Grau-Moya, M Martic, T Genewein, T McGrath, V Mikulik, ...
arXiv preprint arXiv:2103.03938, 2021
Cited by: 10; Year: 2021
Challenges with unsupervised LLM knowledge discovery
S Farquhar, V Varma, Z Kenton, J Gasteiger, V Mikulik, R Shah
arXiv preprint arXiv:2312.10029, 2023
Cited by: 2; Year: 2023