Imanol Schlag
Verified email at idsia.ch
Title
Cited by
Year
Solving quantitative reasoning problems with language models
A Lewkowycz, A Andreassen, D Dohan, E Dyer, H Michalewski, ...
Advances in Neural Information Processing Systems 35, 3843-3857, 2022
Cited by: 397; Year: 2022
Linear Transformers are Secretly Fast Weight Programmers
I Schlag*, K Irie*, J Schmidhuber
International Conference on Machine Learning, 9355-9366, 2021
Cited by: 180*; Year: 2021
Block-Recurrent Transformers
DL Hutchins*, I Schlag*, Y Wu, E Dyer, B Neyshabur
arXiv preprint arXiv:2203.07852, 2022
Cited by: 86; Year: 2022
Learning to reason with third order tensor products
I Schlag, J Schmidhuber
Advances in neural information processing systems 31, 9981-9993, 2018
Cited by: 78; Year: 2018
Enhancing the transformer with explicit relational encoding for math problem solving
I Schlag, P Smolensky, R Fernandez, N Jojic, J Schmidhuber, J Gao
arXiv preprint arXiv:1910.06611, 2019
Cited by: 67; Year: 2019
Going beyond linear transformers with recurrent fast weight programmers
K Irie*, I Schlag*, R Csordás, J Schmidhuber
Advances in Neural Information Processing Systems 34, 2021
Cited by: 57; Year: 2021
Learning Associative Inference Using Fast Weight Memory
I Schlag, T Munkhdalai, J Schmidhuber
International Conference on Learning Representations, 2021
Cited by: 40; Year: 2021
Ancient Roman coin recognition in the wild using deep learning based recognition of artistically depicted face profiles
I Schlag, O Arandjelovic
Proceedings of the IEEE International Conference on Computer Vision …, 2017
Cited by: 38; Year: 2017
Mindstorms in Natural Language-Based Societies of Mind
M Zhuge, H Liu, F Faccio, DR Ashley, R Csordás, A Gopalakrishnan, ...
arXiv preprint arXiv:2305.17066, 2023
Cited by: 32; Year: 2023
Gated fast weights for on-the-fly neural program generation
I Schlag, J Schmidhuber
NIPS Metalearning Workshop, 2017
Cited by: 31; Year: 2017
A Modern Self-Referential Weight Matrix That Learns to Modify Itself
K Irie, I Schlag, R Csordás, J Schmidhuber
Deep RL Workshop NeurIPS 2021, 2021
Cited by: 30; Year: 2021
Solving quantitative reasoning problems with language models, 2022
A Lewkowycz, A Andreassen, D Dohan, E Dyer, H Michalewski, ...
URL https://arxiv.org/abs/2206.14858
Cited by: 29*
Large Language Model Programs
I Schlag, S Sukhbaatar, A Celikyilmaz, W Yih, J Weston, J Schmidhuber, ...
arXiv preprint arXiv:2305.05364, 2023
Cited by: 12; Year: 2023
Block-recurrent transformers (2022)
DL Hutchins, I Schlag, Y Wu, E Dyer, B Neyshabur
URL https://arxiv.org/abs/2203.07852
Cited by: 3
Navigating Scaling Laws: Accelerating Vision Transformer's Training via Adaptive Strategies
S Anagnostidis, G Bachmann, I Schlag, T Hofmann
arXiv preprint arXiv:2311.03233, 2024
Cited by: 2; Year: 2024
The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute
A Stanić, D Ashley, O Serikov, L Kirsch, F Faccio, J Schmidhuber, ...
arXiv preprint arXiv:2309.11197, 2023
Cited by: 2; Year: 2023
Improving Baselines in the Wild
K Irie, I Schlag, R Csordás, J Schmidhuber
NeurIPS 2021 Workshop on Distribution Shifts: Connecting Methods and …, 2021
Cited by: 2; Year: 2021
Augmenting Classic Algorithms with Neural Components for Strong Generalisation on Ambiguous and High-Dimensional Data
I Schlag, J Schmidhuber
Advances in Programming Languages and Neurosymbolic Systems Workshop, 2021
Cited by: 1; Year: 2021
Language Imbalance Can Boost Cross-lingual Generalisation
A Schäfer, S Ravfogel, T Hofmann, T Pimentel, I Schlag
arXiv preprint arXiv:2404.07982, 2024
Year: 2024
On the Effect of (Near) Duplicate Subwords in Language Modelling
A Schäfer, T Hofmann, I Schlag, T Pimentel
arXiv preprint arXiv:2404.06508, 2024
Year: 2024