Title | Authors | Venue | Cited by | Year |
Fairseq S2T: Fast speech-to-text modeling with fairseq | C Wang, Y Tang, X Ma, A Wu, S Popuri, D Okhonko, J Pino | arXiv preprint arXiv:2010.05171, 2020 | 225 | 2020 |
Direct speech-to-speech translation with discrete units | A Lee, PJ Chen, C Wang, J Gu, S Popuri, X Ma, A Polyak, Y Adi, Q He, ... | arXiv preprint arXiv:2107.05604, 2021 | 114 | 2021 |
Textless speech-to-speech translation on real data | A Lee, H Gong, PA Duquenne, H Schwenk, PJ Chen, C Wang, S Popuri, ... | arXiv preprint arXiv:2112.08352, 2021 | 97 | 2021 |
Enhanced direct speech-to-speech translation using self-supervised pre-training and data augmentation | S Popuri, PJ Chen, C Wang, J Pino, Y Adi, J Gu, WN Hsu, A Lee | arXiv preprint arXiv:2204.02967, 2022 | 40 | 2022 |
SeamlessM4T-Massively Multilingual & Multimodal Machine Translation | L Barrault, YA Chung, MC Meglioli, D Dale, N Dong, PA Duquenne, ... | arXiv preprint arXiv:2308.11596, 2023 | 36 | 2023 |
UnitY: Two-pass direct speech-to-speech translation with discrete units | H Inaguma, S Popuri, I Kulikov, PJ Chen, C Wang, YA Chung, Y Tang, ... | arXiv preprint arXiv:2212.08055, 2022 | 24 | 2022 |
Seamless: Multilingual Expressive and Streaming Speech Translation | L Barrault, YA Chung, MC Meglioli, D Dale, N Dong, M Duppenthaler, ... | arXiv preprint arXiv:2312.05187, 2023 | 13 | 2023 |
Speech-to-speech translation for a real-world unwritten language | PJ Chen, K Tran, Y Yang, J Du, J Kao, YA Chung, P Tomasello, ... | arXiv preprint arXiv:2211.06474, 2022 | 9 | 2022 |
Improving speech-to-speech translation through unlabeled text | XP Nguyen, S Popuri, C Wang, Y Tang, I Kulikov, H Gong | ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 8 | 2023 |
Multilingual speech-to-speech translation into multiple target languages | H Gong, N Dong, S Popuri, V Goswami, A Lee, J Pino | arXiv preprint arXiv:2307.08655, 2023 | 4 | 2023 |
SpiRit-LM: Interleaved Spoken and Written Language Model | TA Nguyen, B Muller, B Yu, MR Costa-Jussa, M Elbayad, S Popuri, ... | arXiv preprint arXiv:2402.05755, 2024 | 2 | 2024 |
CoLLD: Contrastive Layer-to-Layer Distillation for Compressing Multilingual Pre-Trained Speech Encoders | HJ Chang, N Dong, R Mavlyutov, S Popuri, YA Chung | ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and …, 2024 | | 2024 |
An Empirical Study of Speech Language Models for Prompt-Conditioned Speech Synthesis | Y Peng, I Kulikov, Y Yang, S Popuri, H Lu, C Wang, H Gong | arXiv preprint arXiv:2403.12402, 2024 | | 2024 |
MSLM-S2ST: A Multitask Speech Language Model for Textless Speech-to-Speech Translation with Speaker Style Preservation | Y Peng, I Kulikov, Y Yang, S Popuri, H Lu, C Wang, H Gong | arXiv preprint arXiv:2403.12408, 2024 | | 2024 |
Exploring Speech Enhancement for Low-resource Speech Synthesis | Z Ni, S Popuri, N Dong, K Saijo, X Zhang, GL Lan, Y Shi, V Chandra, ... | arXiv preprint arXiv:2309.10795, 2023 | | 2023 |