Title: Using the Method of Weighted K-NN to Recognize Isolated Word for Speaker-Dependent System (利用權重式第K位最鄰近方法於中字彙之特定語者中文單音辨識)
Author: Li, Hui-Chun (李蕙珺)
Keywords: Weighted K-th Nearest Neighbor (權重式第k位最接近的鄰居方法)
Publisher: Department of Applied Mathematics (應用數學系所)
This thesis studies speaker-dependent recognition of 337 isolated Mandarin words, of which 200 are selected for the recognition experiments. The recognition method is the weighted k-th nearest neighbor method (WK-NN). The experiment begins by recording a speech database: each of the 337 isolated Mandarin words is recorded ten times, three of the ten recordings are chosen at random as test utterances, and the remaining seven serve as training utterances.
Once the database is recorded, the speech is pre-processed. Pre-processing consists of normalization, endpoint detection (voiced/voiceless), consonant/vowel frame blocking, pre-emphasis, and windowing. Feature parameters are then extracted with linear prediction coding (LPC) and cepstrum coding. Finally, to keep the recognition system stable and fast, each utterance is linearly compressed or expanded so that its number of frames is fixed.
The resulting speaker-dependent recognition system achieves a recognition rate of 80.83% on the 200 isolated Mandarin words. The thesis closes with suggestions for improving the recognition rate in future work.
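
The last two stages described in the abstract, fixing the frame count and classifying with a weighted nearest-neighbor rule, can be illustrated with a small sketch. This is not the thesis's implementation: the abstract does not state the weighting scheme or the distance measure, so the inverse-distance vote, the Euclidean distance, the linear interpolation along the time axis, the function names, the labels "ba" and "ma", and the two-dimensional toy vectors standing in for cepstral coefficients are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def resample_frames(frames, target_len):
    """Linearly compress or expand a list of feature vectors to target_len frames.

    Assumption: 'linear compression and expansion' is modeled here as
    linear interpolation between neighboring frames on the time axis.
    """
    n = len(frames)
    if n == 0:
        raise ValueError("need at least one frame")
    if n == 1 or target_len == 1:
        return [list(frames[0]) for _ in range(target_len)]
    out = []
    for i in range(target_len):
        # Map output index i onto the input time axis [0, n - 1].
        pos = i * (n - 1) / (target_len - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append([(1.0 - frac) * a + frac * b
                    for a, b in zip(frames[lo], frames[hi])])
    return out


def distance(x, y):
    """Euclidean distance between two utterances with the same frame count."""
    return math.sqrt(sum((a - b) ** 2
                         for fx, fy in zip(x, y)
                         for a, b in zip(fx, fy)))


def weighted_knn(query, training, k=3, eps=1e-9):
    """Classify query from (label, frames) pairs with inverse-distance votes.

    Assumption: each of the k nearest training utterances votes with
    weight 1 / (distance + eps); the thesis may weight differently.
    """
    nearest = sorted((distance(query, frames), label)
                     for label, frames in training)[:k]
    votes = defaultdict(float)
    for d, label in nearest:
        votes[label] += 1.0 / (d + eps)
    return max(votes, key=votes.get)


# Toy data: utterances of different lengths, two hypothetical labels.
FRAMES = 4
train_raw = [
    ("ba", [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]),
    ("ba", [[0.1, 0.0], [0.9, 1.1], [2.1, 1.9], [2.0, 2.0], [2.0, 2.1]]),
    ("ma", [[5.0, 5.0], [4.0, 4.0], [3.0, 3.0]]),
    ("ma", [[5.1, 4.9], [4.0, 4.2], [3.1, 3.0], [2.9, 3.0]]),
]
training = [(label, resample_frames(f, FRAMES)) for label, f in train_raw]
query = resample_frames(
    [[0.0, 0.1], [1.1, 1.0], [1.9, 2.0], [2.0, 2.0]], FRAMES)
predicted = weighted_knn(query, training, k=3)
print(predicted)
```

Fixing the frame count first is what allows a plain frame-by-frame distance: every utterance becomes a feature array of the same shape, so no time alignment is needed at classification time.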
Other identifiers: U0005-0107200903083400
Appears in Collections: Department of Applied Mathematics (應用數學系所)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.