Please use this identifier to cite or link to this item: http://hdl.handle.net/11455/8325
Title: 在特定文字語者驗證系統之雜訊消除
Noise Reduction for Text Dependent Speaker Verification System
Author: 李安基
Li, An-Chi
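Keywords: Cepstral Mean Subtraction (CMS); channel noise reduction (通道雜訊消除); Cepstral Weighting (CW); cepstral weighting (倒頻譜加權)
Publisher: Department of Electrical Engineering (電機工程學系所)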
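Abstract:
Speaker verification systems have gradually matured and their range of application keeps growing, while improving the recognition rate remains a key priority for speech recognition. In this thesis, we raise the recognition rate by reducing the noise in the speech signal. First, preprocessing removes unneeded information from the speech samples. We then combine the linear prediction parameters and their first-order cepstrum (LPCC) with the Mel-frequency cepstral coefficients (MFCC) to extract the speech features we need.
Next, we propose two methods for reducing speech noise: channel noise reduction (Cepstral Mean Subtraction, CMS) and cepstral weighting (Cepstral Weighting, CW). Both procedures are added to the LPCC and MFCC processing to remove noise from the extracted speech features. Finally, we compare system performance with processed and unprocessed features after applying a hidden Markov model.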

Speaker verification systems have matured, and their range of application continues to expand. Raising the recognition rate is a key goal of speech recognition. In this thesis, we use noise reduction methods to reduce the noise in speech and thereby raise the recognition rate.
Two major methods were used in this thesis to reduce noise for text-dependent speaker verification. A speaker verification experiment was conducted. The speech signals were taken from the MMLab database, NCHU. 100 speakers (50 male, 50 female) were used in the test. The tests show that the cepstral mean subtraction (CMS) noise reduction method can effectively increase the speaker verification rate. Adding the cepstral weighting (CW) noise reduction method can improve the verification performance.
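As a rough illustration of the two noise reduction steps named in the abstract, the sketch below applies CMS and then CW to a matrix of cepstral features (one row per frame). It is not the thesis's implementation. CMS relies on the fact that a stationary channel appears as a constant additive offset in the cepstral domain, so subtracting the per-coefficient mean over time removes it. The raised-sine weighting window, the default `lifter=22`, the function names, and the random toy data are all assumptions made for the example; the record does not say which weighting function the thesis uses.

```python
import numpy as np


def cepstral_mean_subtraction(features):
    """Subtract the per-coefficient mean over time from a (frames, coeffs) array."""
    features = np.asarray(features, dtype=float)
    return features - features.mean(axis=0, keepdims=True)


def cepstral_weighting(features, lifter=22):
    """Scale each cepstral coefficient by a raised-sine window (assumed form)."""
    features = np.asarray(features, dtype=float)
    n = np.arange(1, features.shape[1] + 1)  # cepstral index 1..Q
    weights = 1.0 + (lifter / 2.0) * np.sin(np.pi * n / lifter)
    return features * weights


# Toy data (hypothetical): 200 frames of 12 cepstral coefficients,
# shifted by a constant per-coefficient offset that stands in for a channel.
rng = np.random.default_rng(0)
clean = rng.normal(size=(200, 12))
channel_offset = rng.normal(size=12)
observed = clean + channel_offset

# CMS removes the constant offset; CW then re-weights the coefficients.
normalized = cepstral_weighting(cepstral_mean_subtraction(observed))
```

In this toy setup, `cepstral_mean_subtraction(observed)` and `cepstral_mean_subtraction(clean)` agree up to floating-point error, which is the sense in which CMS cancels a fixed channel effect.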
URI: http://hdl.handle.net/11455/8325
Other identifiers: U0005-2307200812463000
Appears in Collections: Department of Electrical Engineering (電機工程學系所)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.