Results
Found 381 publication records. Showing 381 according to the selection in the facets
Hits |
Authors |
Title |
Venue |
Year |
Link |
Author keywords |
105 | Jing Deng, Thomas Fang Zheng, Wenhu Wu |
UBM Based Speaker Segmentation and Clustering for 2-Speaker Detection. |
ISCSLP |
2006 |
DBLP DOI BibTeX RDF |
Multi-speaker, Speaker segmentation, Speaker clustering, Speaker Detection |
47 | Fabio Castaldo, Daniele Colibro, Emanuele Dalmasso, Pietro Laface, Claudio Vair |
Stream-based speaker segmentation using speaker factors and eigenvoices. |
ICASSP |
2008 |
DBLP DOI BibTeX RDF |
|
47 | Zhu Liu 0001, Murat Saraclar |
Speaker Segmentation and Adaptation for Speech Recognition on Multiple-Speaker Audio Conference Data. |
ICME |
2007 |
DBLP DOI BibTeX RDF |
|
46 | Jigish Trivedi, Anutosh Maitra, Suman K. Mitra |
A Hybrid Approach to Speaker Recognition in Multi-speaker Environment. |
PReMI |
2005 |
DBLP DOI BibTeX RDF |
Speech recognition, ICA, Vector Quantization, MFCC |
41 | Frantz Clermont |
A Linear-Scaling Approach to Speaker Variability in Poly-segmental Formant Ensembles. |
Speaker Classification (2) |
2007 |
DBLP DOI BibTeX RDF |
Poly-Segmental Ensembles, Linear Scaling, Speaker Variability, Formant-Frequency Patterns |
32 | Guanjun Li, Wei Xue, Wenju Liu, Jiangyan Yi, Jianhua Tao 0001 |
GCC-Speaker: Target Speaker Localization with Optimal Speaker-Dependent Weighting in Multi-Speaker Scenarios. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
28 | Ying Li, Chitra Dorai |
Analyzing discussion scene contents in instructional videos. |
ACM Multimedia |
2004 |
DBLP DOI BibTeX RDF |
discussion pattern detection, discussion scene, instructional video content analysis, e-learning, speaker clustering |
24 | Themos Stafylakis, Ladislav Mosner, Oldrich Plchot, Johan Rohdin, Anna Silnova, Lukás Burget, Jan Cernocký |
Training speaker embedding extractors using multi-speaker audio with unknown speaker boundaries. |
INTERSPEECH |
2022 |
DBLP DOI BibTeX RDF |
|
24 | Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari |
DNN-based Speaker Embedding Using Subjective Inter-speaker Similarity for Multi-speaker Modeling in Speech Synthesis. |
CoRR |
2019 |
DBLP BibTeX RDF |
|
24 | Rohan Kumar Das, Jichen Yang, Haizhou Li 0001 |
Speaker Clustering with Penalty Distance for Speaker Verification with Multi-Speaker Speech. |
APSIPA |
2019 |
DBLP DOI BibTeX RDF |
|
24 | Sunit Sivasankaran, Emmanuel Vincent 0001, Dominique Fohr |
Keyword Based Speaker Localization: Localizing a Target Speaker in a Multi-speaker Environment. |
INTERSPEECH |
2018 |
DBLP DOI BibTeX RDF |
|
24 | Angela Quinlan, Futoshi Asano |
Tracking a varying number of speakers using particle filtering. |
ICASSP |
2008 |
DBLP DOI BibTeX RDF |
|
21 | Alistair Knott, Ian Bayard, Peter Vlugter |
Multi-agent Human-Machine Dialogue: Issues in Dialogue Management and Referring Expression Semantics. |
PRICAI |
2004 |
DBLP DOI BibTeX RDF |
|
19 | Tao Liu, Zhengyang Chen, Yanmin Qian, Kai Yu 0004 |
Multi-Speaker End-to-End Multi-Modal Speaker Diarization System for the MISP 2022 Challenge. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
19 | Qicong Xie, Tao Li, Xinsheng Wang, Zhichao Wang, Lei Xie 0001, Guoqiao Yu, Guanglu Wan |
Multi-speaker Multi-style Text-to-speech Synthesis with Single-speaker Single-style Training Data Scenarios. |
ISCSLP |
2022 |
DBLP DOI BibTeX RDF |
|
19 | Chiang-Jen Peng, Yun-Ju Chan, Cheng Yu, Syu-Siang Wang, Yu Tsao 0001, Tai-Shih Chi |
Attention-based multi-task learning for speech-enhancement and speaker-identification in multi-speaker dialogue scenario. |
CoRR |
2021 |
DBLP BibTeX RDF |
|
19 | Qicong Xie, Tao Li, Xinsheng Wang, Zhichao Wang, Lei Xie 0001, Guoqiao Yu, Guanglu Wan |
Multi-speaker Multi-style Text-to-speech Synthesis With Single-speaker Single-style Training Data Scenarios. |
CoRR |
2021 |
DBLP BibTeX RDF |
|
19 | Chung-Ming Chien, Jheng-Hao Lin, Chien-yu Huang, Po-Chun Hsu, Hung-yi Lee |
Investigating on Incorporating Pretrained and Learnable Speaker Representations for Multi-Speaker Multi-Style Text-to-Speech. |
CoRR |
2021 |
DBLP BibTeX RDF |
|
19 | Chung-Ming Chien, Jheng-Hao Lin, Chien-yu Huang, Po-Chun Hsu, Hung-yi Lee |
Investigating on Incorporating Pretrained and Learnable Speaker Representations for Multi-Speaker Multi-Style Text-to-Speech. |
ICASSP |
2021 |
DBLP DOI BibTeX RDF |
|
19 | Chiang-Jen Peng, Yun-Ju Chan, Cheng Yu, Syu-Siang Wang, Yu Tsao 0001, Tai-Shih Chi |
Attention-Based Multi-Task Learning for Speech-Enhancement and Speaker-Identification in Multi-Speaker Dialogue Scenario. |
ISCAS |
2021 |
DBLP DOI BibTeX RDF |
|
19 | Zhaoyu Liu, Brian Mak |
Multi-Lingual Multi-Speaker Text-to-Speech Synthesis for Voice Cloning with Online Speaker Enrollment. |
INTERSPEECH |
2020 |
DBLP DOI BibTeX RDF |
|
19 | Hermann Hild, Alex Waibel |
Multi-speaker/speaker-independent architectures for the multi-state time delay neural network. |
ICASSP (2) |
1993 |
DBLP DOI BibTeX RDF |
|
17 | Kenichi Fujita, Atsushi Ando, Yusuke Ijima |
Speech Rhythm-Based Speaker Embeddings Extraction from Phonemes and Phoneme Duration for Multi-Speaker Speech Synthesis. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
17 | Yejin Jeon, Yunsu Kim 0001, Gary Geunbae Lee |
Enhancing Zero-Shot Multi-Speaker TTS with Negated Speaker Representations. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
17 | Kenichi Fujita, Atsushi Ando, Yusuke Ijima |
Speech Rhythm-Based Speaker Embeddings Extraction from Phonemes and Phoneme Duration for Multi-Speaker Speech Synthesis. |
IEICE Trans. Inf. Syst. |
2024 |
DBLP DOI BibTeX RDF |
|
17 | Yejin Jeon, Yunsu Kim 0001, Gary Geunbae Lee |
Enhancing Zero-Shot Multi-Speaker TTS with Negated Speaker Representations. |
AAAI |
2024 |
DBLP DOI BibTeX RDF |
|
17 | Hyungchan Yoon, Changhwan Kim, Seyun Um, Hyun-Wook Yoon, Hong-Goo Kang |
SC-CNN: Effective Speaker Conditioning Method for Zero-Shot Multi-Speaker Text-to-Speech Systems. |
IEEE Signal Process. Lett. |
2023 |
DBLP DOI BibTeX RDF |
|
17 | Gaobin Yang, Maokui He, Shutong Niu, Ruoyu Wang 0029, Yanyan Yue, Shuangqing Qian, Shilong Wu, Jun Du, Chin-Hui Lee 0001 |
Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding with Sequence-to-Sequence Architecture. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
17 | Jenthe Thienpondt, Nilesh Madhu, Kris Demuynck |
Margin-Mixup: A Method for Robust Speaker Verification in Multi-Speaker Audio. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
17 | Chae-Woon Bang, Chanjun Chun |
Effective Zero-Shot Multi-Speaker Text-to-Speech Technique Using Information Perturbation and a Speaker Encoder. |
Sensors |
2023 |
DBLP DOI BibTeX RDF |
|
17 | Mao-Kui He, Jun Du, Qing-Feng Liu, Chin-Hui Lee 0001 |
ANSD-MA-MSE: Adaptive Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding. |
IEEE ACM Trans. Audio Speech Lang. Process. |
2023 |
DBLP DOI BibTeX RDF |
|
17 | Jenthe Thienpondt, Nilesh Madhu, Kris Demuynck |
Margin-Mixup: A Method for Robust Speaker Verification In Multi-Speaker Audio. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
17 | Kurniawati Azizah, Wisnu Jatmiko |
Transfer Learning, Style Control, and Speaker Reconstruction Loss for Zero-Shot Multilingual Multi-Speaker Text-to-Speech on Low-Resource Languages. |
IEEE Access |
2022 |
DBLP DOI BibTeX RDF |
|
17 | Byoung Jin Choi, Myeonghun Jeong, Joun Yeop Lee, Nam Soo Kim |
SNAC: Speaker-Normalized Affine Coupling Layer in Flow-Based Architecture for Zero-Shot Multi-Speaker Text-to-Speech. |
IEEE Signal Process. Lett. |
2022 |
DBLP DOI BibTeX RDF |
|
17 | Kenta Udagawa, Yuki Saito, Hiroshi Saruwatari |
Human-in-the-loop Speaker Adaptation for DNN-based Multi-speaker TTS. |
CoRR |
2022 |
DBLP DOI BibTeX RDF |
|
17 | Byoung Jin Choi, Myeonghun Jeong, Minchan Kim, Sung Hwan Mun, Nam Soo Kim |
Adversarial Speaker-Consistency Learning Using Untranscribed Speech Data for Zero-Shot Multi-Speaker Text-to-Speech. |
CoRR |
2022 |
DBLP DOI BibTeX RDF |
|
17 | Ahmad Aloradi, Wolfgang Mack, Mohamed Elminshawi, Emanuël A. P. Habets |
Speaker Verification in Multi-Speaker Environments Using Temporal Feature Fusion. |
CoRR |
2022 |
DBLP DOI BibTeX RDF |
|
17 | Byoung Jin Choi, Myeonghun Jeong, Joun Yeop Lee, Nam Soo Kim |
SNAC: Speaker-normalized affine coupling layer in flow-based architecture for zero-shot multi-speaker text-to-speech. |
CoRR |
2022 |
DBLP DOI BibTeX RDF |
|
17 | Botao Zhao 0001, Xulong Zhang 0001, Jianzong Wang, Ning Cheng 0001, Jing Xiao 0006 |
nnSpeech: Speaker-Guided Conditional Variational Autoencoder for Zero-shot Multi-speaker Text-to-Speech. |
CoRR |
2022 |
DBLP BibTeX RDF |
|
17 | Kenta Udagawa, Yuki Saito, Hiroshi Saruwatari |
Human-in-the-loop Speaker Adaptation for DNN-based Multi-speaker TTS. |
INTERSPEECH |
2022 |
DBLP DOI BibTeX RDF |
|
17 | Srikanth Raj Chetupalli, Sriram Ganapathy |
Speaker conditioned acoustic modeling for multi-speaker conversational ASR. |
INTERSPEECH |
2022 |
DBLP DOI BibTeX RDF |
|
17 | Dengfeng Ke, Liangjie Huang, Wenhan Yao, Ruixin Hu, Xueyin Zu, Yanlu Xie, Jinsong Zhang 0001 |
Voicifier-LN: An Novel Approach to Elevate the Speaker Similarity for General Zero-shot Multi-Speaker TTS. |
AIPR |
2022 |
DBLP DOI BibTeX RDF |
|
17 | Botao Zhao 0001, Xulong Zhang 0001, Jianzong Wang, Ning Cheng 0001, Jing Xiao 0006 |
nnSpeech: Speaker-Guided Conditional Variational Autoencoder for Zero-Shot Multi-speaker text-to-speech. |
ICASSP |
2022 |
DBLP DOI BibTeX RDF |
|
17 | Ahmad Aloradi, Wolfgang Mack, Mohamed Elminshawi, Emanuël A. P. Habets |
Speaker Verification in Multi-Speaker Environments Using Temporal Feature Fusion. |
EUSIPCO |
2022 |
DBLP BibTeX RDF |
|
17 | Kentaro Mitsui, Tomoki Koriyama, Hiroshi Saruwatari |
Deep Gaussian process based multi-speaker speech synthesis with latent speaker representation. |
Speech Commun. |
2021 |
DBLP DOI BibTeX RDF |
|
17 | Pengcheng Guo, Xuankai Chang, Shinji Watanabe 0001, Lei Xie 0001 |
Multi-Speaker ASR Combining Non-Autoregressive Conformer CTC and Conditional Speaker Chain. |
CoRR |
2021 |
DBLP BibTeX RDF |
|
17 | Beáta Lorincz, Adriana Stan, Mircea Giurgiu |
An objective evaluation of the effects of recording conditions and speaker characteristics in multi-speaker deep neural speech synthesis. |
CoRR |
2021 |
DBLP BibTeX RDF |
|
17 | Midia Yousefi, John H. L. Hansen |
Speaker conditioning of acoustic models using affine transformation for multi-speaker speech recognition. |
CoRR |
2021 |
DBLP BibTeX RDF |
|
17 | Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari |
Perceptual-Similarity-Aware Deep Speaker Representation Learning for Multi-Speaker Generative Modeling. |
IEEE ACM Trans. Audio Speech Lang. Process. |
2021 |
DBLP DOI BibTeX RDF |
|
17 | Kenichi Fujita, Atsushi Ando, Yusuke Ijima |
Phoneme Duration Modeling Using Speech Rhythm-Based Speaker Embeddings for Multi-Speaker Speech Synthesis. |
Interspeech |
2021 |
DBLP DOI BibTeX RDF |
|
17 | Pengcheng Guo, Xuankai Chang, Shinji Watanabe 0001, Lei Xie 0001 |
Multi-Speaker ASR Combining Non-Autoregressive Conformer CTC and Conditional Speaker Chain. |
Interspeech |
2021 |
DBLP DOI BibTeX RDF |
|
17 | Midia Yousefi, John H. L. Hansen |
Speaker Conditioning of Acoustic Models Using Affine Transformation for Multi-Speaker Speech Recognition. |
ASRU |
2021 |
DBLP DOI BibTeX RDF |
|
17 | Yibin Zheng, Xinhui Li, Li Lu |
Investigation of Fast and Efficient Methods for Multi-Speaker Modeling and Speaker Adaptation. |
ICASSP |
2021 |
DBLP DOI BibTeX RDF |
|
17 | Beáta Lorincz, Adriana Stan, Mircea Giurgiu |
An objective evaluation of the effects of recording conditions and speaker characteristics in multi-speaker deep neural speech synthesis. |
KES |
2021 |
DBLP DOI BibTeX RDF |
|
17 | Yanpei Shi |
Improving the robustness of speaker recognition in noise and multi-speaker conditions using deep neural networks |
|
2021 |
RDF |
|
17 | Ivan Medennikov, Maxim Korenevsky, Tatiana Prisyach, Yuri Y. Khokhlov, Mariya Korenevskaya, Ivan Sorokin, Tatiana Timofeeva, Anton Mitrofanov, Andrei Andrusenko, Ivan Podluzhny, Aleksandr Laptev, Aleksei Romanenko |
Target-Speaker Voice Activity Detection: a Novel Approach for Multi-Speaker Diarization in a Dinner Party Scenario. |
CoRR |
2020 |
DBLP BibTeX RDF |
|
17 | Pilar Oplustil Gallegos, Jennifer Williams 0001, Joanna Rownicka, Simon King 0001 |
An Unsupervised Method to Select a Speaker Subset from Large Multi-Speaker Speech Synthesis Datasets. |
INTERSPEECH |
2020 |
DBLP DOI BibTeX RDF |
|
17 | Ivan Medennikov, Maxim Korenevsky, Tatiana Prisyach, Yuri Y. Khokhlov, Mariya Korenevskaya, Ivan Sorokin, Tatiana Timofeeva, Anton Mitrofanov, Andrei Andrusenko, Ivan Podluzhny, Aleksandr Laptev, Aleksei Romanenko |
Target-Speaker Voice Activity Detection: A Novel Approach for Multi-Speaker Diarization in a Dinner Party Scenario. |
INTERSPEECH |
2020 |
DBLP DOI BibTeX RDF |
|
17 | Erica Cooper, Cheng-I Lai, Yusuke Yasuda, Junichi Yamagishi |
Can Speaker Augmentation Improve Multi-Speaker End-to-End TTS? |
INTERSPEECH |
2020 |
DBLP DOI BibTeX RDF |
|
17 | Erica Cooper, Cheng-I Lai, Yusuke Yasuda, Fuming Fang, Xin Wang 0037, Nanxin Chen, Junichi Yamagishi |
Zero-Shot Multi-Speaker Text-To-Speech with State-Of-The-Art Neural Speaker Embeddings. |
ICASSP |
2020 |
DBLP DOI BibTeX RDF |
|
17 | Shuofeng Zhao, Pengyuan Liu |
DVDGCN: Modeling Both Context-Static and Speaker-Dynamic Graph for Emotion Recognition in Multi-speaker Conversations. |
NLPCC (1) |
2020 |
DBLP DOI BibTeX RDF |
|
17 | Mingrui Yuan, Zhiyao Duan |
Spoofing Speaker Verification Systems with Deep Multi-speaker Text-to-speech Synthesis. |
CoRR |
2019 |
DBLP BibTeX RDF |
|
17 | Hieu-Thi Luong, Xin Wang 0037, Junichi Yamagishi, Nobuyuki Nishizawa |
Training Multi-Speaker Neural Text-to-Speech Systems using Speaker-Imbalanced Speech Corpora. |
CoRR |
2019 |
DBLP BibTeX RDF |
|
17 | Pavel Denisov, Ngoc Thang Vu |
End-to-End Multi-Speaker Speech Recognition using Speaker Embeddings and Transfer Learning. |
CoRR |
2019 |
DBLP BibTeX RDF |
|
17 | Mengnan Chen, Minchuan Chen, Shuang Liang, Jun Ma 0018, Lei Chen, Shaojun Wang, Jing Xiao 0006 |
Cross-Lingual, Multi-Speaker Text-To-Speech Synthesis Using Neural Speaker Embedding. |
INTERSPEECH |
2019 |
DBLP DOI BibTeX RDF |
|
17 | Pavel Denisov, Ngoc Thang Vu |
End-to-End Multi-Speaker Speech Recognition Using Speaker Embeddings and Transfer Learning. |
INTERSPEECH |
2019 |
DBLP DOI BibTeX RDF |
|
17 | Hieu-Thi Luong, Xin Wang 0037, Junichi Yamagishi, Nobuyuki Nishizawa |
Training Multi-Speaker Neural Text-to-Speech Systems Using Speaker-Imbalanced Speech Corpora. |
INTERSPEECH |
2019 |
DBLP DOI BibTeX RDF |
|
17 | Ruibo Fu, Jianhua Tao 0001, Zhengqi Wen, Yibin Zheng |
Phoneme Dependent Speaker Embedding and Model Factorization for Multi-speaker Speech Synthesis and Adaptation. |
ICASSP |
2019 |
DBLP DOI BibTeX RDF |
|
17 | David Snyder, Daniel Garcia-Romero, Gregory Sell, Alan McCree, Daniel Povey, Sanjeev Khudanpur |
Speaker Recognition for Multi-speaker Conversations Using X-vectors. |
ICASSP |
2019 |
DBLP DOI BibTeX RDF |
|
17 | Junmo Lee, Kwang-Sub Song, Kyoung Jin Noh, Tae-Jun Park, Joon-Hyuk Chang |
DNN based multi-speaker speech synthesis with temporal auxiliary speaker ID embedding. |
ICEIC |
2019 |
DBLP DOI BibTeX RDF |
|
17 | Dong Zhang, Liangqing Wu, Changlong Sun, Shoushan Li, Qiaoming Zhu, Guodong Zhou |
Modeling both Context- and Speaker-Sensitive Dependence for Emotion Detection in Multi-speaker Conversations. |
IJCAI |
2019 |
DBLP DOI BibTeX RDF |
|
17 | Yan Deng, Lei He 0005, Frank K. Soong |
Modeling Multi-speaker Latent Space to Improve Neural TTS: Quick Enrolling New Speaker and Enhancing Premium Voice. |
CoRR |
2018 |
DBLP BibTeX RDF |
|
17 | Gregory Sell, Alan McCree |
Multi-speaker conversations, cross-talk, and diarization for speaker recognition. |
ICASSP |
2017 |
DBLP DOI BibTeX RDF |
|
17 | Yuchen Fan, Yao Qian, Frank K. Soong, Lei He 0005 |
Multi-speaker modeling and speaker adaptation for DNN-based TTS synthesis. |
ICASSP |
2015 |
DBLP DOI BibTeX RDF |
|
17 | Langzhou Chen, Norbert Braunschweiler |
Unsupervised speaker and expression factorization for multi-speaker expressive synthesis of ebooks. |
INTERSPEECH |
2013 |
DBLP DOI BibTeX RDF |
|
17 | Stéphane Rossignol, Olivier Pietquin |
Single-speaker/multi-speaker co-channel speech classification. |
INTERSPEECH |
2010 |
DBLP DOI BibTeX RDF |
|
17 | S. Masoud Mirrezaie, Seyed Mohammad Ahadi |
Speaker diarization in a multi-speaker environment using particle swarm optimization and mutual information. |
ICME |
2008 |
DBLP DOI BibTeX RDF |
|
17 | Sadao Hiroya, Takemi Mochida |
Multi-speaker articulatory trajectory formation based on speaker-independent articulatory HMMs. |
Speech Commun. |
2006 |
DBLP DOI BibTeX RDF |
|
17 | Jean-François Bonastre, Sylvain Meignier, Téva Merlin |
Speaker detection using multi-speaker audio files for both enrollment and test. |
ICASSP (2) |
2003 |
DBLP DOI BibTeX RDF |
|
17 | Alvin F. Martin, Mark A. Przybocki |
Speaker recognition in a multi-speaker environment. |
INTERSPEECH |
2001 |
DBLP DOI BibTeX RDF |
|
17 | Jack McLaughlin, Douglas A. Reynolds, Elliot Singer, Gerald C. O'Leary |
Automatic speaker clustering from multi-speaker utterances. |
ICASSP |
1999 |
DBLP DOI BibTeX RDF |
|
14 | Darío Maravall Gómez-Allende, Juan Rios, Margarita Pérez-Castellanos, A. Carpintero, J. Gómez-Calcerrada |
Comparison of Neural Networks and Conventional Techniques for Automatic Recognition of a Multilingual Speech Database. |
IWANN |
1991 |
DBLP DOI BibTeX RDF |
|
13 | Anton Ratnarajah, Shi-Xiong Zhang, Dong Yu 0001 |
M3-AUDIODEC: Multi-channel multi-speaker multi-spatial audio codec. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
13 | Sung Jun Cheon, Byoung Jin Choi, Minchan Kim, Hyeonseung Lee, Nam Soo Kim |
A Controllable Multi-Lingual Multi-Speaker Multi-Style Text-to-Speech Synthesis With Multivariate Information Minimization. |
IEEE Signal Process. Lett. |
2022 |
DBLP DOI BibTeX RDF |
|
13 | Fan Yu, Shiliang Zhang, Pengcheng Guo, Yuhao Liang, Zhihao Du, Yuxiao Lin, Lei Xie 0001 |
MFCCA: Multi-Frame Cross-Channel attention for multi-speaker ASR in Multi-party meeting scenario. |
CoRR |
2022 |
DBLP DOI BibTeX RDF |
|
13 | Fan Yu, Shiliang Zhang, Pengcheng Guo, Yuhao Liang, Zhihao Du, Yuxiao Lin, Lei Xie 0001 |
MFCCA:Multi-Frame Cross-Channel Attention for Multi-Speaker ASR in Multi-Party Meeting Scenario. |
SLT |
2022 |
DBLP DOI BibTeX RDF |
|
11 | Akshit Arora, Rohan Badlani, Sungwon Kim, Rafael Valle, Bryan Catanzaro |
Scaling NVIDIA's Multi-speaker Multi-lingual TTS Systems with Zero-Shot TTS to Indic Languages. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
11 | Myeonghun Jeong, Minchan Kim, Byoung Jin Choi, Jaesam Yoon, Won Jang, Nam Soo Kim |
Transfer Learning for Low-Resource, Multi-Lingual, and Zero-Shot Multi-Speaker Text-to-Speech. |
IEEE ACM Trans. Audio Speech Lang. Process. |
2024 |
DBLP DOI BibTeX RDF |
|
11 | Mingyang Zhang 0003, Xuehao Zhou, Zhizheng Wu 0001, Haizhou Li 0001 |
Towards Zero-Shot Multi-Speaker Multi-Accent Text-to-Speech Synthesis. |
IEEE Signal Process. Lett. |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Chenpeng Du, Yiwei Guo, Feiyu Shen, Kai Yu 0004 |
Multi-Speaker Multi-Lingual VQTTS System for LIMMITS 2023 Challenge. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Heyang Xue, Shuai Guo, Pengcheng Zhu 0004, Mengxiao Bi |
Multi-GradSpeech: Towards Diffusion-based Multi-Speaker Text-to-speech Using Consistent Diffusion Models. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Chenpeng Du, Yiwei Guo, Feiyu Shen, Kai Yu 0004 |
Multi-Speaker Multi-Lingual VQTTS System for LIMMITS 2023 Challenge. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Abhayjeet Singh, Amala Nagireddi, Deekshitha G, Jesuraja Bandekar, Roopa R., Sandhya Badiger, Sathvik Udupa, Prasanta Kumar Ghosh, Hema A. Murthy, Heiga Zen, Pranaw Kumar, Kamal Kant, Amol Bole, Bira Chandra Singh, Keiichi Tokuda, Mark Hasegawa-Johnson, Philipp Olbrich |
Lightweight, Multi-Speaker, Multi-Lingual Indic Text-to-Speech. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Giridhar Pamisetty, Sahukari Chaitanya Varun, K. Sri Rama Murty |
Lightweight Prosody-TTS for Multi-Lingual Multi-Speaker Scenario. |
ICASSP |
2023 |
DBLP DOI BibTeX RDF |
|
11 | Yusuke Nakai, Yuki Saito, Kenta Udagawa, Hiroshi Saruwatari |
Multi-Task Adversarial Training Algorithm for Multi-Speaker Neural Text-to-Speech. |
CoRR |
2022 |
DBLP DOI BibTeX RDF |
|
11 | Wei Song, Yanghao Yue, Ya-Jie Zhang, Zhengchen Zhang, Youzheng Wu, Xiaodong He 0001 |
Multi-Speaker Multi-Style Speech Synthesis with Timbre and Style Disentanglement. |
CoRR |
2022 |
DBLP DOI BibTeX RDF |
|
11 | Yiwen Shao, Shi-Xiong Zhang, Dong Yu 0001 |
Multi-Channel Multi-Speaker ASR Using 3D Spatial Feature. |
ICASSP |
2022 |
DBLP DOI BibTeX RDF |
|
11 | Tobias Cord-Landwehr, Thilo von Neumann, Christoph Böddeker, Reinhold Haeb-Umbach |
MMS-MSG: A Multi-Purpose Multi-Speaker Mixture Signal Generator. |
IWAENC |
2022 |
DBLP DOI BibTeX RDF |
|
11 | Yiwen Shao, Shi-Xiong Zhang, Dong Yu 0001 |
Multi-Channel Multi-Speaker ASR Using 3D Spatial Feature. |
CoRR |
2021 |
DBLP BibTeX RDF |
|
11 | Dipjyoti Paul, Sankar Mukherjee, Yannis Pantazis, Yannis Stylianou |
A Universal Multi-Speaker Multi-Style Text-to-Speech via Disentangled Representation Learning Based on Rényi Divergence Minimization. |
Interspeech |
2021 |
DBLP DOI BibTeX RDF |
|
Displaying results #1 - #100 of 381 (100 per page).