Please use this identifier to cite or link to this item: http://hdl.handle.net/11455/6408
Title: 正交式高斯混合模型之語者驗證系統
An Improved Speaker Verification System Using Orthogonal GMM
Author: 謝忠穎
Hsieh, Chung-Ying
Keywords: text-independent speaker verification; Gaussian mixture model (GMM); vector quantization (VQ); Mel-frequency cepstral coefficients (MFCC); linear prediction cepstral coefficients (LPCC)
Publisher: Department of Electrical Engineering
Abstract:
Speaker verification is a widely used technique in security and crime monitoring. This thesis proposes three methods to improve a conventional text-independent and text-semidependent speaker verification system. First, a combined MFCC-LPCC feature set replaces the conventional MFCC features in order to better represent speaker characteristics. Second, the vector quantization (VQ) based LBG algorithm replaces the K-means algorithm, with the aim of shortening feature clustering and model training without degrading verification performance. Finally, an orthogonal Gaussian mixture model (GMM) is used to better approximate the distribution of the feature vectors in each cluster. Experimental results show that the proposed methods improve performance on both text-independent and text-semidependent speaker verification systems while somewhat reducing the computation required.
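As a rough illustration of the second method, the following is a minimal sketch of LBG codebook design by binary splitting, written in plain Python. It is not taken from the thesis: the function name `lbg`, the additive split offset `eps`, the stopping tolerance `tol`, and the toy two-dimensional data are all assumptions made for this example, and the input vectors merely stand in for per-frame feature vectors such as the combined MFCC-LPCC features.

```python
def _mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def _sqdist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _nearest(v, codebook):
    """Index of the codeword closest to v."""
    return min(range(len(codebook)), key=lambda k: _sqdist(v, codebook[k]))


def lbg(vectors, size, eps=0.01, tol=1e-6, max_iter=100):
    """Grow a codebook by repeated binary splitting.

    `size` is assumed to be a power of two, because every split
    doubles the number of codewords.
    """
    codebook = [_mean(vectors)]  # start from the global centroid
    while len(codebook) < size:
        # Split each codeword into a pair offset by +eps and -eps.
        codebook = [[c + s * eps for c in cw]
                    for cw in codebook for s in (1, -1)]
        prev = float("inf")
        for _ in range(max_iter):
            # Assign every vector to its nearest codeword.
            cells = [[] for _ in codebook]
            for v in vectors:
                cells[_nearest(v, codebook)].append(v)
            # Move each codeword to its cell centroid; keep it if the cell is empty.
            codebook = [_mean(cell) if cell else cw
                        for cell, cw in zip(cells, codebook)]
            # Stop refining once the average distortion no longer improves.
            dist = sum(_sqdist(v, codebook[_nearest(v, codebook)])
                       for v in vectors) / len(vectors)
            if prev - dist <= tol * max(dist, 1e-12):
                break
            prev = dist
    return codebook


# Toy data: two well-separated groups of 2-D points.
data = [[0, 0], [0, 1], [1, 0], [1, 1],
        [10, 10], [10, 11], [11, 10], [11, 11]]
codebook = sorted(lbg(data, 2))
```

On this toy data the two codewords settle on the centroids of the two groups. In a GMM-based system, a codebook of this kind is commonly used to initialize the mixture means before training; whether the thesis uses it in exactly this way is not stated in the abstract.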
URI: http://hdl.handle.net/11455/6408
Other Identifiers: U0005-1508200621115700
Appears in Collections: Department of Electrical Engineering

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.