Tim Dettmers
Allen Institute for AI; Carnegie Mellon University
Verified email at allenai.org
Title · Cited by · Year
Convolutional 2D knowledge graph embeddings
T Dettmers, P Minervini, P Stenetorp, S Riedel
AAAI 2018, 2018
Cited by 3073 · 2018
QLoRA: Efficient finetuning of quantized LLMs
T Dettmers, A Pagnoni, A Holtzman, L Zettlemoyer
NeurIPS 2023 (Oral), 2023
Cited by 1967 · 2023
BLOOM: A 176B-parameter open-access multilingual language model
T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ...
Cited by 1615 · 2023
LLM.int8(): 8-bit matrix multiplication for transformers at scale
T Dettmers, M Lewis, Y Belkada, L Zettlemoyer
NeurIPS 2022, 2022
Cited by 844* · 2022
Sparse networks from scratch: Faster training without losing performance
T Dettmers, L Zettlemoyer
arXiv preprint arXiv:1907.04840, 2019
Cited by 379 · 2019
BASE layers: Simplifying training of large, sparse models
M Lewis, S Bhosale, T Dettmers, N Goyal, L Zettlemoyer
ICML 2021, 2021
Cited by 234 · 2021
8-bit Approximations for Parallelism in Deep Learning
T Dettmers
ICLR 2016, 2016
Cited by 226 · 2016
8-bit Optimizers via Block-wise Quantization
T Dettmers, M Lewis, S Shleifer, L Zettlemoyer
ICLR 2022 (Spotlight), 2022
Cited by 215 · 2022
The case for 4-bit precision: k-bit inference scaling laws
T Dettmers, L Zettlemoyer
ICML 2023, 2023
Cited by 166 · 2023
SpQR: A sparse-quantized representation for near-lossless LLM weight compression
T Dettmers, R Svirschevski, V Egiazarian, D Kuznedelev, E Frantar, ...
arXiv preprint arXiv:2306.03078, 2023
Cited by 161 · 2023
Branch-train-merge: Embarrassingly parallel training of expert language models
M Li, S Gururangan, T Dettmers, M Lewis, T Althoff, NA Smith, ...
arXiv preprint arXiv:2208.03306, 2022
Cited by 120 · 2022
Petals: Collaborative inference and fine-tuning of large models
A Borzunov, D Baranchuk, T Dettmers, M Ryabinin, Y Belkada, ...
ACL 2022, Demonstration, 2022
Cited by 82* · 2022
Stable and low-precision training for large-scale vision-language models
M Wortsman, T Dettmers, L Zettlemoyer, A Morcos, A Farhadi, L Schmidt
NeurIPS 2023, 2023
Cited by 29 · 2023
SWARM parallelism: Training large models can be surprisingly communication-efficient
M Ryabinin, T Dettmers, M Diskin, A Borzunov
NeurIPS 2023, 2023
Cited by 20 · 2023
Jack the Reader - A machine reading framework
D Weissenborn, P Minervini, T Dettmers, I Augenstein, J Welbl, ...
arXiv preprint arXiv:1806.08727, 2018
Cited by 12 · 2018
Training transformers together
A Borzunov, M Ryabinin, T Dettmers, Q Lhoest, L Saulnier, M Diskin, ...
NeurIPS 2021 Demonstration, 2022
Cited by 9 · 2022
High performance natural language processing
G Ilharco, C Ilharco, I Turc, T Dettmers, F Ferreira, K Lee
EMNLP 2020, Tutorial, 2020
Cited by 7 · 2020
MatFormer: Nested transformer for elastic inference
S Kudugunta, A Kusupati, T Dettmers, K Chen, I Dhillon, Y Tsvetkov, ...
arXiv preprint arXiv:2310.07707, 2023
Cited by 6 · 2023
Towards A Unified View of Sparse Feed-Forward Network in Pretraining Large Language Model
LZ Liu, T Dettmers, XV Lin, V Stoyanov, X Li
EMNLP 2023, 2023
Cited by 3 · 2023
Scaling Retrieval-Based Language Models with a Trillion-Token Datastore
R Shao, J He, A Asai, W Shi, T Dettmers, S Min, L Zettlemoyer, PW Koh
arXiv preprint arXiv:2407.12854, 2024
Cited by 2 · 2024
Articles 1–20