Code Llama: Open foundation models for code | B Roziere, J Gehring, F Gloeckle, S Sootla, I Gat, XE Tan, Y Adi, J Liu, ... | arXiv preprint arXiv:2308.12950, 2023 | 1068 | 2023 |
High fidelity neural audio compression | A Défossez, J Copet, G Synnaeve, Y Adi | arXiv preprint arXiv:2210.13438, 2022 | 476 | 2022 |
On generative spoken language modeling from raw audio | K Lakhotia, E Kharitonov, WN Hsu, Y Adi, A Polyak, B Bolte, TA Nguyen, ... | Transactions of the Association for Computational Linguistics 9, 1336-1354, 2021 | 310 | 2021 |
Speech resynthesis from discrete disentangled self-supervised representations | A Polyak, Y Adi, J Copet, E Kharitonov, K Lakhotia, WN Hsu, A Mohamed, ... | arXiv preprint arXiv:2104.00355, 2021 | 282 | 2021 |
Simple and controllable music generation | J Copet, F Kreuk, I Gat, T Remez, D Kant, G Synnaeve, Y Adi, A Défossez | Advances in Neural Information Processing Systems 36, 2024 | 281 | 2024 |
AudioGen: Textually guided audio generation | F Kreuk, G Synnaeve, A Polyak, U Singer, A Défossez, J Copet, D Parikh, ... | arXiv preprint arXiv:2209.15352, 2022 | 259 | 2022 |
The Llama 3 herd of models | A Dubey, A Jauhri, A Pandey, A Kadian, A Al-Dahle, A Letman, A Mathur, ... | arXiv preprint arXiv:2407.21783, 2024 | 220 | 2024 |
Text-free prosody-aware generative spoken language modeling | E Kharitonov, A Lee, A Polyak, Y Adi, J Copet, K Lakhotia, TA Nguyen, ... | arXiv preprint arXiv:2109.03264, 2021 | 108 | 2021 |
Generative spoken dialogue language modeling | TA Nguyen, E Kharitonov, J Copet, Y Adi, WN Hsu, A Elkahky, ... | Transactions of the Association for Computational Linguistics 11, 250-266, 2023 | 84 | 2023 |
Textless speech emotion conversion using discrete and decomposed representations | F Kreuk, A Polyak, J Copet, E Kharitonov, TA Nguyen, M Rivière, WN Hsu, ... | arXiv preprint arXiv:2111.07402, 2021 | 62 | 2021 |
Textually pretrained speech language models | M Hassid, T Remez, TA Nguyen, I Gat, A Conneau, F Kreuk, J Copet, ... | Advances in Neural Information Processing Systems 36, 2024 | 38 | 2024 |
STOP: A dataset for spoken task oriented semantic parsing | P Tomasello, A Shrivastava, D Lazar, PC Hsu, D Le, A Sagar, A Elkahky, ... | 2022 IEEE Spoken Language Technology Workshop (SLT), 991-998, 2023 | 30 | 2023 |
Expresso: A benchmark and analysis of discrete expressive speech resynthesis | TA Nguyen, WN Hsu, A d'Avirro, B Shi, I Gat, M Fazel-Zarani, T Remez, ... | arXiv preprint arXiv:2308.05725, 2023 | 26 | 2023 |
Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation | M Lavechin, M Métais, H Titeux, A Boissonnet, J Copet, M Rivière, ... | 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-7, 2023 | 19 | 2023 |
Masked audio generation using a single non-autoregressive transformer | A Ziv, I Gat, GL Lan, T Remez, F Kreuk, A Défossez, J Copet, G Synnaeve, ... | arXiv preprint arXiv:2401.04577, 2024 | 18 | 2024 |
ASR4REAL: An extended benchmark for speech models | M Rivière, J Copet, G Synnaeve | arXiv preprint arXiv:2110.08583, 2021 | 14 | 2021 |
textless-lib: A library for textless spoken language processing | E Kharitonov, J Copet, K Lakhotia, TA Nguyen, P Tomasello, A Lee, ... | arXiv preprint arXiv:2202.07359, 2022 | 12 | 2022 |
Augmentation invariant discrete representation for generative spoken language modeling | I Gat, F Kreuk, TA Nguyen, A Lee, J Copet, G Synnaeve, E Dupoux, Y Adi | arXiv preprint arXiv:2209.15483, 2022 | 9 | 2022 |
On the robustness of self-supervised representations for spoken language modeling | I Gat, F Kreuk, A Lee, J Copet, G Synnaeve, E Dupoux, Y Adi | arXiv preprint arXiv:2209.15483, 2022 | 7 | 2022 |
Generative Spoken Language Model based on continuous word-sized audio tokens | R Algayres, Y Adi, TA Nguyen, J Copet, G Synnaeve, B Sagot, E Dupoux | arXiv preprint arXiv:2310.05224, 2023 | 6 | 2023 |