Voice Feature Classification for Voice Verification
Linear predictive coding (LPC)
Mel-Frequency Cepstral Coefficients (MFCC)
Fuzzy C-means clustering
Fuzzy subtractive clustering
Cepstral mean and variance normalization (CMVN)
In this thesis, we combine linear predictive coding (LPC) coefficients and their first-order delta-cepstral coefficients with Mel-frequency cepstral coefficients (MFCC) to form a new feature set. We then use K-means clustering, fuzzy C-means clustering, and fuzzy subtractive clustering to find the best mean and variance of the features, and apply cepstral mean and variance normalization (CMVN) to resist noise.
Speech identification systems are being used in a growing range of applications, so their accuracy is receiving correspondingly more attention. This thesis presents several classification methods for speech features to improve system accuracy and compares their performance. We combine linear predictive coding (LPC) coefficients and their first-order delta-cepstral coefficients (delta-LPC) with Mel-frequency cepstral coefficients (MFCCs) to form a new feature set. We then use K-means clustering, fuzzy C-means clustering, and fuzzy subtractive clustering to find the individual mean and variance for every frame, and apply cepstral mean and variance normalization (CMVN) to reduce the influence of noise.
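The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function names (`combine_features`, `cmvn`, `fuzzy_c_means`), the toy numbers, the fuzzifier `m = 2`, and the choice to take normalization statistics over all frames of one utterance are assumptions made for illustration. The thesis appears to derive the mean and variance from the clustering step, which this sketch does not reproduce.

```python
import math
import random


def combine_features(lpc_part, delta_part, mfcc_part):
    """Concatenate per-frame LPC-based, delta, and MFCC vectors into one vector."""
    return [list(a) + list(b) + list(c)
            for a, b, c in zip(lpc_part, delta_part, mfcc_part)]


def cmvn(frames, eps=1e-10):
    """Scale every feature dimension to zero mean and unit variance across frames."""
    n, dims = len(frames), len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    stds = [math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n)
            for d in range(dims)]
    return [[(f[d] - means[d]) / (stds[d] + eps) for d in range(dims)]
            for f in frames]


def fuzzy_c_means(points, c, m=2.0, iters=100, seed=0):
    """Standard fuzzy C-means; returns (centers, memberships)."""
    rng = random.Random(seed)
    n, dims = len(points), len(points[0])
    # Random initial memberships, each row normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-6 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append([sum(w[i] * points[i][d] for i in range(n)) / total
                            for d in range(dims)])
        # Membership update from relative distances to every center.
        for i in range(n):
            dist = [max(math.dist(points[i], centers[j]), 1e-12)
                    for j in range(c)]
            for j in range(c):
                u[i][j] = 1.0 / sum((dist[j] / dist[k]) ** (2.0 / (m - 1.0))
                                    for k in range(c))
    return centers, u


# Made-up per-frame values for illustration only (not data from the thesis).
lpc_part = [[0.1, 0.2], [0.2, 0.1], [0.9, 1.1], [1.0, 0.8]]
delta_part = [[0.0], [0.1], [0.5], [0.4]]
mfcc_part = [[1.0, 2.0], [1.2, 2.1], [3.0, 4.2], [3.1, 3.9]]

features = combine_features(lpc_part, delta_part, mfcc_part)
normalized = cmvn(features)
centers, memberships = fuzzy_c_means(normalized, c=2)
```

Each row of `memberships` sums to 1 and gives the degree to which a frame belongs to each cluster. That soft assignment is what distinguishes fuzzy C-means from the hard assignment used by K-means.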
|Appears in Collections:||Graduate Institute of Communication Engineering|