Please use this identifier to cite or link to this item: http://hdl.handle.net/11455/96611
Title: 利用KNN局部平均法於不分聲調之中文母音辨識
Using the method of local mean to recognize Mandarin vowel without tones
Author: 張育昊
Yu-Hao Chang
Keywords: Improved local-mean method; K-nearest neighbor method; K-means method; Mel-frequency cepstral coefficients
Abstract:
This thesis investigates vowel recognition, ignoring tone, for 1,391 Mandarin monosyllables recorded by different speakers. The recognition process has three steps. The first is preprocessing: the recorded speech is digitally sampled and normalized, followed by endpoint detection, framing, pre-emphasis, and windowing. The second is feature extraction: the preprocessed signal is passed through a discrete Fourier transform, triangular band-pass filters, frequency-range selection, log-energy computation, and a discrete cosine transform to obtain Mel-frequency cepstral coefficients (MFCCs). The third is recognition: the MFCCs are classified with the K-nearest neighbor (KNN) method, K-means clustering, and the improved local-mean method, each in turn. The improved local-mean method achieves the highest recognition rate, 92.74%, and K-means clustering the lowest, 89.13%.
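
The local-mean idea named in the abstract can be sketched as follows. This is a minimal illustration of the basic local-mean nearest-neighbor rule, not the improved variant evaluated in the thesis, whose details the abstract does not give. The function names, the value of k, and the toy 2-D vectors standing in for MFCC features are all assumptions made for the example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def local_mean_classify(query, train_vectors, train_labels, k=3):
    """Basic local-mean rule (assumed form, not the thesis's improved one):
    for each class, average the k training vectors nearest to the query,
    then return the class whose local mean is closest to the query."""
    # Group training vectors by class label.
    by_class = {}
    for vec, label in zip(train_vectors, train_labels):
        by_class.setdefault(label, []).append(vec)

    best_label, best_dist = None, math.inf
    for label, vecs in by_class.items():
        # k nearest neighbors of the query within this class only.
        nearest = sorted(vecs, key=lambda v: euclidean(query, v))[:k]
        # Component-wise mean of those neighbors.
        local_mean = [sum(col) / len(nearest) for col in zip(*nearest)]
        dist = euclidean(query, local_mean)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Toy 2-D stand-ins for MFCC vectors (made-up values, not thesis data).
train_vectors = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                 [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]]
train_labels = ["a", "a", "a", "i", "i", "i"]

predicted = local_mean_classify([0.5, 0.5], train_vectors, train_labels, k=2)
```

Compared with plain KNN voting, comparing the query against a per-class average of its nearest neighbors tends to dampen the influence of a single outlying training sample, which is one plausible reason a local-mean variant could outperform KNN and K-means on this task.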
URI: http://hdl.handle.net/11455/96611
Rights: Authorized for browsing and printing of the electronic full text; publicly available from 2017-07-19.
Appears in Collections: Institute of Statistics (統計學研究所)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.