Naoyuki Kanda

Cited by

	All	Since 2019
Citations	4490	4077
h-index	31	28
i10-index	57	49

Citations per year

Year	2006	2007	2008	2009	2010	2011	2012	2013	2014	2015	2016	2017	2018	2019	2020	2021	2022	2023	2024
Citations	15	20	15	15	14	18	29	34	44	31	42	61	64	108	199	488	875	1337	1059

Co-authors

Takuya Yoshioka	AssemblyAI	Verified email at assemblyai.com
Zhuo Chen	Bytedance (formerly Microsoft, Columbia University)	Verified email at columbia.edu
Jinyu Li	Partner Applied Science Manager, Microsoft	Verified email at microsoft.com
Xiaofei Wang	Microsoft	Verified email at jhu.edu
Zhong Meng	Google	Verified email at google.com
Yashesh Gaur	Meta AI	Verified email at cs.cmu.edu
Yusuke Fujita	LY Corp.	Verified email at linecorp.com
Xiong Xiao	Principal Applied scientist, Microsoft	Verified email at microsoft.com
Shota Horiguchi	NTT Corporation	Verified email at ntt.com
Shinji Watanabe	Carnegie Mellon University	Verified email at cmu.edu
Yu Wu (吴俣)	Microsoft Research Asia	Verified email at microsoft.com
Yao Qian	Microsoft	Verified email at microsoft.com
Yifan Gong	Principal Science Manager, Microsoft Corp.	Verified email at microsoft.com
Hiroshi G Okuno	Professor Emeritus, Kyoto University, Adjunct Researcher, Waseda University	Verified email at nue.org
Kazunori Komatani	Professor, Osaka University	Verified email at sanken.osaka-u.ac.jp
Hiroshi Tsujino	Honda R&D Co., Ltd.	Verified email at jp.honda
Kazuhiro Nakadai	Tokyo Institute of Technology	Verified email at ra.sc.e.titech.ac.jp
Christoph Boeddeker	Paderborn University	Verified email at mail.upb.de
Aswin Shanmugam Subramanian	Microsoft	Verified email at microsoft.com
Vimal Manohar	Meta Platforms Inc.	Verified email at meta.com

Naoyuki Kanda

Microsoft

Verified email at microsoft.com

Speech Recognition, Speech Synthesis, Speech and Language Processing, Machine Learning


Title	Authors	Venue	Cited by	Year
WavLM: Large-scale self-supervised pre-training for full stack speech processing	S Chen, C Wang, Z Chen, Y Wu, S Liu, Z Chen, J Li, N Kanda, T Yoshioka, ...	IEEE Journal of Selected Topics in Signal Processing 16 (6), 1505-1518, 2022	1209	2022
A review of speaker diarization: Recent advances with deep learning	TJ Park, N Kanda, D Dimitriadis, KJ Han, S Watanabe, S Narayanan	Computer Speech & Language 72, 101317, 2022	331	2022
CHiME-6 Challenge: Tackling multispeaker speech recognition for unsegmented recordings	S Watanabe, M Mandel, J Barker, E Vincent, A Arora, X Chang, ...	arXiv preprint arXiv:2004.09249, 2020	307	2020
End-to-end neural speaker diarization with self-attention	Y Fujita, N Kanda, S Horiguchi, Y Xue, K Nagamatsu, S Watanabe	2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2019	251	2019
End-to-end neural speaker diarization with permutation-free objectives	Y Fujita, N Kanda, S Horiguchi, K Nagamatsu, S Watanabe	arXiv preprint arXiv:1909.05952, 2019	244	2019
Elastic spectral distortion for low resource speech recognition with deep neural networks	N Kanda, R Takeda, Y Obuchi	Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on …, 2013	148	2013
Internal language model estimation for domain-adaptive end-to-end speech recognition	Z Meng, S Parthasarathy, E Sun, Y Gaur, N Kanda, L Lu, X Chen, R Zhao, ...	2021 IEEE Spoken Language Technology Workshop (SLT), 243-250, 2021	102	2021
Serialized output training for end-to-end overlapped speech recognition	N Kanda, Y Gaur, X Wang, Z Meng, T Yoshioka	arXiv preprint arXiv:2003.12687, 2020	101	2020
Integration of speech separation, diarization, and recognition for multi-speaker meetings: System description, comparison, and analysis	D Raj, P Denisov, Z Chen, H Erdogan, Z Huang, M He, S Watanabe, J Du, ...	2021 IEEE Spoken Language Technology Workshop (SLT), 897-904, 2021	86	2021
Joint speaker counting, speech recognition, and speaker identification for overlapped speech of any number of speakers	N Kanda, Y Gaur, X Wang, Z Meng, Z Chen, T Zhou, T Yoshioka	arXiv preprint arXiv:2006.10930, 2020	74	2020
Guided source separation meets a strong ASR backend: Hitachi/Paderborn University joint investigation for dinner party ASR	N Kanda, C Boeddeker, J Heitkaemper, Y Fujita, S Horiguchi, ...	arXiv preprint arXiv:1905.12230, 2019	72	2019
Microsoft speaker diarization system for the VoxCeleb speaker recognition challenge 2020	X Xiao, N Kanda, Z Chen, T Zhou, T Yoshioka, S Chen, Y Zhao, G Liu, ...	ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021	71	2021
A two-layer model for behavior and dialogue planning in conversational service robots	M Nakano, Y Hasegawa, K Nakadai, T Nakamura, J Takeuchi, T Torii, ...	2005 IEEE/RSJ International Conference on Intelligent Robots and Systems …, 2005	69	2005
Multi-domain spoken dialogue system with extensibility and robustness against speech recognition errors	K Komatani, N Kanda, M Nakano, K Nakadai, H Tsujino, T Ogata, ...	Proceedings of the 7th SIGdial Workshop on Discourse and Dialogue, 9-17, 2006	56	2006
Maximum a posteriori Based Decoding for CTC Acoustic Models	N Kanda, X Lu, H Kawai	Interspeech 2016, 1868-1872, 2016	55	2016
The Hitachi/JHU CHiME-5 system: Advances in speech recognition for everyday home environments using multiple microphone arrays	N Kanda, R Ikeshita, S Horiguchi, Y Fujita, K Nagamatsu, X Wang, ...	Proc. CHiME-5, 6-10, 2018	54	2018
Internal language model training for domain-adaptive end-to-end speech recognition	Z Meng, N Kanda, Y Gaur, S Parthasarathy, E Sun, L Lu, X Chen, J Li, ...	ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021	51	2021
A multi-expert model for dialogue and behavior control of conversational robots and agents	M Nakano, Y Hasegawa, K Funakoshi, J Takeuchi, T Torii, K Nakadai, ...	Knowledge-Based Systems 24 (2), 248-256, 2011	48	2011
Face-voice matching using cross-modal embeddings	S Horiguchi, N Kanda, K Nagamatsu	Proceedings of the 26th ACM international conference on Multimedia, 1011-1019, 2018	47	2018
Streaming multi-talker ASR with token-level serialized output training	N Kanda, J Wu, Y Wu, X Xiao, Z Meng, X Wang, Y Gaur, Z Chen, J Li, ...	arXiv preprint arXiv:2202.00842, 2022	44	2022
