Please use this identifier to cite or link to this item:
Title: 利用K-最近鄰居法辨識中文母音及翹舌音的探討
Using the Method of K-Nearest Neighbor to Recognize Vowel of Isolated Mandarin Word and Investigation of Retroflex
Author: 吳敏男 (Wu, Min-Nan)
Keywords: Vowel recognition; Vowel recognition rate; Mel-frequency cepstrum coefficient; K-nearest neighbor
Publisher: Institute of Statistics (統計學研究所)
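References:
[1] 王小川 (2004), "語音訊號處理" [Speech Signal Processing]. Taipei: 全華.
[2] 王國榮 (2000), "Visual Basic 6.0 實戰講座" [Visual Basic 6.0 Practical Course]. Taipei: 旗標.
[3] 吳明哲, 黃世陽 (1998), "Visual Basic 6.0 中文版學習範本" [Visual Basic 6.0 Chinese Edition Learning Guide]. Taipei: 松崗.
[4] 鍾靖爵, 李宗寶 (2011), "利用共同向量法以及最佳梅爾頻率倒頻譜之特徵辨識特定語者之中文單音" [Speaker-dependent recognition of Mandarin monosyllables using the common vector method and optimal Mel-frequency cepstrum features]. Master's thesis, Institute of Applied Mathematics, National Chung Hsing University, Taichung.
[5] 羅璟義, 李宗寶 (2009), "利用權重式共同向量法於中字彙之特定語者中文單音辨識" [Speaker-dependent Mandarin monosyllable recognition for a medium vocabulary using the weighted common vector method]. Master's thesis, Institute of Applied Mathematics, National Chung Hsing University, Taichung.
[6] 籃元隆, 李宗寶 (2009), "利用權重式多重KNN法於中字彙之特定語者中文單音辨識" [Speaker-dependent Mandarin monosyllable recognition for a medium vocabulary using the weighted multiple KNN method]. Master's thesis, Institute of Applied Mathematics, National Chung Hsing University, Taichung.
[7] Cover, T. M. and Hart, P. E. (1967), "Nearest Neighbor Pattern Classification", IEEE Trans. on Information Theory, vol. IT-13, no. 1, pp. 21-27.
[8] Gulmezoglu, M. B., Dzhafarov, V. and Barkana, A. (1999), "A novel approach to isolated word recognition", IEEE Trans. on Speech and Audio Processing, vol. 7, no. 6, pp. 620-627.
[9] Keskin, M., Gulmezoglu, M. B., Parlaktuna, O. and Barkana, A. (1996), "Isolated word recognition by extracting personal differences", in Proc. 6th Int. Conf. Signal Processing Applications and Technology, Boston, MA, pp. 1989-1992.
[10] Pao, Tsang-Long, Liao, Wen-Yuan and Chen, Yu-Te (2007), "Audio-Visual Speech Recognition with Weighted KNN-based Classification in Mandarin Database", Third International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIHMSP 2007), vol. 1, pp. 39-42.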

This study addresses speaker-dependent recognition of 1391 isolated Mandarin words. The work is divided into two parts: the first is the recognition of retroflex consonants, and the second is the recognition of the vowels of the 1391 isolated Mandarin words. I extract features using the Mel-frequency cepstrum coefficient (MFCC) and the uniform cepstrum, respectively, and then perform recognition with the K-nearest neighbor (KNN) method. Five experimental factors are considered: "the swing of the frame", "the number of frames", "the length of the frame", "the dimension of the frame", and "the use of the delta Mel-frequency cepstrum coefficient".
The experiments use databases from 12 groups, including my own. Across the different parameter combinations, the best recognition rate is 98% for retroflex consonants and 90.8% for the vowels of the 1391 isolated Mandarin words.
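The classification step named in the abstract, K-nearest neighbor, can be illustrated with a minimal sketch. This is not the thesis implementation: the function names `euclidean` and `knn_classify`, the Euclidean distance, the simple majority vote, and the toy 2-D vectors standing in for cepstral features are all assumptions made for illustration only.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(train, query, k=3):
    """Label `query` by majority vote among its k nearest training vectors.

    `train` is a list of (feature_vector, label) pairs (hypothetical layout).
    """
    # Sort training samples by distance to the query and keep the k closest.
    neighbors = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    # Count the labels of those neighbors and return the most frequent one.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Toy 2-D vectors standing in for real cepstral features (assumed data).
train = [
    ([1.0, 1.1], "a"),
    ([0.9, 1.0], "a"),
    ([1.2, 0.8], "a"),
    ([3.0, 3.2], "i"),
    ([3.1, 2.9], "i"),
    ([2.8, 3.0], "i"),
]

predicted = knn_classify(train, [1.1, 1.0], k=3)
print(predicted)
```

In a real system each vector would be built from the per-frame cepstral coefficients of an utterance, so factors such as the number of frames and the feature dimension listed in the abstract would determine the vector length.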
Other identifiers: U0005-0207201214214300
Appears in Collections: Institute of Statistics (統計學研究所)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.