Please use this identifier to cite or link to this item: http://hdl.handle.net/11455/7709
Title: 雙特徵值應用於特定文字之語者驗證系統
Double Feature Extraction for Text Dependent Speaker Verification System
Author: 張勝鈞
Chang, Sheng-Jyun
Keywords: Linear Predictive Coding (LPC); Mel-Frequency Cepstral Coefficients (MFCC); Hidden Markov Model (HMM)
Publisher: Department of Electrical Engineering (電機工程學系所)
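Abstract:
In recent years, as speech recognition has matured and its range of applications has expanded, research on speaker verification has received increasing attention. This thesis applies several different speech features to a text-dependent speaker verification system and compares their performance. First, we use Linear Predictive Coding (LPC) parameters and their first-order delta-cepstral coefficients for speaker verification. Next, we use Mel-Frequency Cepstral Coefficients (MFCC), computed through twenty triangular filters, to approximate the features of the whole utterance. Finally, we merge the two feature sets, LPCC and MFCC, into a new feature and compare the performance of the three features in a verification system based on the Hidden Markov Model (HMM).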

In recent years, the scope of speaker verification techniques and their applications has widened, and the study of speaker verification has grown in importance. In this thesis, we develop a combined feature set and use it in place of the conventional LPC-only or MFCC-only features. Linear Predictive Coding (LPC) parameters and their delta-cepstral coefficients have shown good results in speaker verification systems. Mel-Frequency Cepstral Coefficients (MFCC), which use twenty triangular filters to approximate the features of the whole utterance, have also been used in speaker verification for many years. The experimental results show that the new combined LPCC-MFCC feature performs better than the conventional single features in a text-dependent speaker verification system.
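The feature pipeline summarized in the abstract can be illustrated with a minimal sketch. This record does not say how the thesis frames the signal or how it merges the two feature sets, so everything specific below is an assumption made for illustration: the function names, the LPC order of 12, the 12 cepstral coefficients per feature, the Hamming window, the synthetic test frame, and the plain concatenation of LPCC and MFCC vectors. Only the twenty triangular mel filters come from the abstract. The HMM verification stage is not shown.

```python
import cmath
import math
import random

def hamming(frame):
    n = len(frame)
    return [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1))) for i, x in enumerate(frame)]

def lpc(frame, order):
    """Predictor coefficients a[1..order] (autocorrelation method, Levinson-Durbin)."""
    n = len(frame)
    r = [sum(frame[i] * frame[i + lag] for i in range(n - lag)) for lag in range(order + 1)]
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or perfectly predictable frame: stop early
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:]

def lpc_to_cepstrum(a, n_ceps):
    """LPC cepstral coefficients (LPCC) c[1..n_ceps] via the standard recursion."""
    p = len(a)
    c = [0.0] * (n_ceps + 1)
    for m in range(1, n_ceps + 1):
        acc = a[m - 1] if m <= p else 0.0
        for k in range(max(1, m - p), m):
            acc += (k / m) * c[k] * a[m - k - 1]
        c[m] = acc
    return c[1:]

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(frame, sample_rate, n_filters=20, n_ceps=12):
    """MFCCs of one (already windowed) frame using n_filters triangular mel filters."""
    n = len(frame)
    # Naive DFT power spectrum; adequate for a short illustrative frame.
    power = [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))) ** 2
             for k in range(n // 2 + 1)]
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    log_e = []
    for m in range(1, n_filters + 1):
        lo, ctr, hi = edges[m - 1], edges[m], edges[m + 1]
        energy = 0.0
        for k, pw in enumerate(power):
            f = k * sample_rate / n
            w = min((f - lo) / (ctr - lo), (hi - f) / (hi - ctr))  # triangular weight
            if w > 0.0:
                energy += w * pw
        log_e.append(math.log(energy + 1e-10))  # small floor avoids log(0)
    # DCT-II of the log filterbank energies, dropping the 0th coefficient.
    return [sum(le * math.cos(math.pi * q * (m + 0.5) / n_filters) for m, le in enumerate(log_e))
            for q in range(1, n_ceps + 1)]

def delta(frames):
    """First-order (delta) coefficients across frames, with edge frames replicated."""
    last = len(frames) - 1
    return [[(b - a) / 2.0 for a, b in zip(frames[max(t - 1, 0)], frames[min(t + 1, last)])]
            for t in range(len(frames))]

def combined_feature(frame, sample_rate, lpc_order=12, n_ceps=12):
    """Assumed combination: concatenate the LPCC and MFCC vectors of one frame."""
    w = hamming(frame)
    return lpc_to_cepstrum(lpc(w, lpc_order), n_ceps) + mfcc(w, sample_rate, 20, n_ceps)

# Hypothetical test frame: a 440 Hz tone plus a little noise, 256 samples at 8 kHz.
rng = random.Random(0)
frame = [math.sin(2 * math.pi * 440 * i / 8000) + 0.1 * rng.uniform(-1, 1) for i in range(256)]
feature = combined_feature(frame, 8000)
print(len(feature))  # 24 values: 12 LPCC followed by 12 MFCC
```

In a full system, a vector like `feature` would be computed for every frame of the spoken phrase, optionally extended with `delta` coefficients, and the resulting sequence passed to the HMM for scoring.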
URI: http://hdl.handle.net/11455/7709
Other Identifiers: U0005-2507200715561600
Appears in Collections: Department of Electrical Engineering (電機工程學系所)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.