Please use this identifier to cite or link to this item: http://hdl.handle.net/11455/4859
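Title: The Use of Mixture Endpoint Detection Technique for Text-Dependent Speaker Verification System (混合式端點偵測應用於特定文字語者之驗證系統)
Author: Yan, Kuan-Hao (管浩延)
Keywords: mixture detection; entropy (熵); zero-crossing rate (過零率); endpoint detection (切音)
Publisher: Graduate Institute of Communication Engineering (通訊工程研究所)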
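Abstract:
In recent years, as speech recognition has matured and its range of applications has widened, improving the recognition rate has remained a central concern. In this thesis, we use endpoint detection to locate the true speech and thereby improve recognition performance. First, preprocessing removes superfluous information from the speech samples. We then combine linear prediction cepstral coefficients (LPCC) with Mel-frequency cepstral coefficients (MFCC) to extract the speech features we need.
Next, we propose a mixture endpoint detection method that combines the previously used entropy measure with the zero-crossing rate to find the truly valid speech segments under noisy conditions. Within these valid speech segments, we additionally use the zero-crossing rate to find the breathy portions of the speech. This mixture endpoint detection method effectively improves the performance of a text-dependent speaker verification system.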

Speaker verification is used in biometric authentication, and the recognition rate remains a key issue in the development of speech recognition. In this thesis, instead of a traditional endpoint detection method, we develop a new mixture endpoint detection method to increase the recognition rate of speaker verification. We combine entropy and the zero-crossing rate to locate the real speech segments in noisy data, and then use the zero-crossing rate to detect breathy ("air") sounds within the speech. After these steps, the SNR reflects the actual speech level, so the decision threshold can be set accordingly. The mixture endpoint detection technique improves the performance of the text-dependent speaker verification system.
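The combination of entropy and zero-crossing rate described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the frame size, the use of the first few frames as a noise-only reference, and both thresholds (`entropy_margin`, `zcr_threshold`) are assumptions chosen for illustration. The sketch treats a frame as speech when its normalized spectral entropy falls clearly below the noise floor, and marks a speech frame as a breathy candidate when its zero-crossing rate is high.

```python
import numpy as np

def frame_signal(x, frame_len=256, hop=128):
    """Split a 1-D signal into overlapping frames (one frame per row)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs whose signs differ, per frame."""
    signs = np.signbit(frames)
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

def spectral_entropy(frames):
    """Shannon entropy of each frame's power spectrum, normalized to 0..1."""
    windowed = frames * np.hanning(frames.shape[1])
    power = np.abs(np.fft.rfft(windowed, axis=1)) ** 2
    prob = power / (power.sum(axis=1, keepdims=True) + 1e-12)
    ent = -np.sum(prob * np.log(prob + 1e-12), axis=1)
    return ent / np.log(power.shape[1])

def detect_endpoints(x, frame_len=256, hop=128, noise_frames=10,
                     entropy_margin=0.15, zcr_threshold=0.3):
    """Return per-frame boolean masks (is_speech, is_breathy).

    Assumption: the first `noise_frames` frames contain only background
    noise, so their mean entropy serves as the noise floor. Both thresholds
    are hypothetical values, not taken from the thesis.
    """
    frames = frame_signal(np.asarray(x, dtype=float), frame_len, hop)
    ent = spectral_entropy(frames)
    zcr = zero_crossing_rate(frames)
    noise_floor = ent[:noise_frames].mean()
    # Noise has a flat spectrum (high entropy); structured speech is lower.
    is_speech = ent < noise_floor - entropy_margin
    # Inside speech, a high zero-crossing rate suggests breathy/unvoiced sound.
    is_breathy = is_speech & (zcr > zcr_threshold)
    return is_speech, is_breathy
```

Calling `detect_endpoints(x)` on a mono signal returns two masks with one entry per frame. Consecutive `True` runs in `is_speech` give candidate segment boundaries, and a real system would typically smooth these masks and tune the thresholds on held-out recordings.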
URI: http://hdl.handle.net/11455/4859
Other Identifiers: U0005-0308200916322400
Appears in Collections: Graduate Institute of Communication Engineering (通訊工程研究所)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.