Zipformer: A faster and better encoder for automatic speech recognition. Z Yao, L Guo, X Yang, W Kang, F Kuang, Y Yang, Z Jin, L Lin, D Povey. Proc. ICLR 2024, 2023 | 15 | 2023 |
VALL-T: Decoder-Only Generative Transducer for Robust and Decoding-Controllable Text-to-Speech. C Du, Y Guo, H Wang, Y Yang, Z Niu, S Wang, H Zhang, X Chen, K Yu. arXiv preprint arXiv:2401.14321, 2024 | 3 | 2024 |
Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context. W Kang, X Yang, Z Yao, F Kuang, Y Yang, L Guo, L Lin, D Povey. Proc. ICASSP 2024, 2023 | 3 | 2023 |
Blank-regularized CTC for Frame Skipping in Neural Transducer. Y Yang, X Yang, L Guo, Z Yao, W Kang, F Kuang, L Lin, X Chen, D Povey. Proc. Interspeech 2023, 2023 | 3 | 2023 |
Towards Universal Speech Discrete Tokens: A Case Study for ASR and TTS. Y Yang, F Shen, C Du, Z Ma, K Yu, D Povey, X Chen. Proc. ICASSP 2024, 2023 | 2 | 2023 |
An Embarrassingly Simple Approach for LLM with Strong ASR Capacity. Z Ma, G Yang, Y Yang, Z Gao, J Wang, Z Du, F Yu, Q Chen, S Zheng, ... arXiv preprint arXiv:2402.08846, 2024 | 1 | 2024 |
PromptASR for contextualized ASR with controllable style. X Yang, W Kang, Z Yao, Y Yang, L Guo, F Kuang, L Lin, D Povey. Proc. ICASSP 2024, 2023 | 1 | 2023 |
The X-LANCE Technical Report for Interspeech 2024 Speech Processing Using Discrete Speech Unit Challenge. Y Guo, C Wang, Y Yang, H Wang, Z Ma, C Du, S Wang, H Li, S Fan, ... arXiv preprint arXiv:2404.06079, 2024 | | 2024 |
Incorporating Class-based Language Model for Named Entity Recognition in Factorized Neural Transducer. P Wang, Y Yang, Z Liang, T Tan, S Zhang, X Chen. arXiv preprint arXiv:2309.07648, 2023 | | 2023 |
Delay-penalized CTC implemented based on Finite State Transducer. Z Yao, W Kang, F Kuang, L Guo, X Yang, Y Yang, L Lin, D Povey. Proc. Interspeech 2023, 2023 | | 2023 |