Please use this identifier to cite or link to this item: http://hdl.handle.net/11455/7709
DC FieldValueLanguage
dc.contributor楊晴雯zh_TW
dc.contributor溫志煜zh_TW
dc.contributor.advisor歐陽彥杰zh_TW
dc.contributor.author張勝鈞zh_TW
dc.contributor.authorChang, Sheng-Jyunen_US
dc.contributor.other中興大學zh_TW
dc.date2008zh_TW
dc.date.accessioned2014-06-06T06:40:24Z-
dc.date.available2014-06-06T06:40:24Z-
dc.identifierU0005-2507200715561600zh_TW
dc.identifier.urihttp://hdl.handle.net/11455/7709-
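dc.description.abstractIn recent years, as speech recognition has matured and its range of applications has widened, research on speaker verification has drawn growing attention. This thesis applies several different speech features to a text-dependent speaker verification system and compares their performance. First, we use Linear Predictive Coding (LPC) coefficients and their first-order (delta) cepstral coefficients for speaker verification. Next, we use Mel-Frequency Cepstral Coefficients (MFCC), computed with twenty triangular filters, to approximate the features of the whole utterance. Finally, we merge the two feature sets, LPCC and MFCC, into a new feature and compare the performance of the three features in a verification system based on the Hidden Markov Model (HMM).zh_TW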
dc.description.abstractIn recent years, speaker verification techniques have found a widening range of applications, and research on speaker verification has grown in importance. In this thesis, we develop a combined feature set and use it in place of conventional LPC-only or MFCC-only features. Linear Predictive Coding (LPC) coefficients and their delta-cepstral coefficients have shown good results in speaker verification. Mel-Frequency Cepstral Coefficients (MFCC), which use twenty triangular filters to approximate the features of the whole utterance, have also been used in speaker verification for many years. The experimental results show that the new combined LPCC-MFCC feature achieves better performance in a text-dependent speaker verification system.en_US
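dc.description.tableofcontentsTable of Contents
Chapter 1 Introduction
  1.1 Research Motivation
  1.2 Overview of Speaker Recognition
  1.3 Chapter Outline
Chapter 2 Theoretical Foundations of Speech Recognition
  2.1 What Is Speech Recognition
  2.2 Speaker Verification Models
Chapter 3 Speech Recognition Algorithms
  3.1 Speech Preprocessing
    3.1.1 Speech Signal Sampling
    3.1.2 DC-offset Removal
    3.1.3 Band Pass Filter
    3.1.4 Frame Blocking
    3.1.5 Endpoint Detection
    3.1.6 Volume Normalization
    3.1.7 Pre-emphasis
    3.1.8 Windowing
  3.2 Parameter Extraction
    3.2.1 Linear Predictive Coding
    3.2.2 Cepstrum Coefficients
    3.2.3 Delta-Cepstrum Coefficients
    3.2.4 Cepstral Mean Subtraction (channel effect removal)
    3.2.5 Mel-Frequency Cepstral Coefficients (MFCC)
      3.2.5.1 Mel-Frequency Triangular Filters
    3.2.6 Combining the LPCC and MFCC Features
  3.3 Hidden Markov Models
    3.3.1 The Forward Procedure
    3.3.2 The Backward Procedure
    3.3.3 The Viterbi Algorithm
  3.4 Building the Speaker Model
  3.5 Test Procedure
Chapter 4 Experimental Results and Analysis
  4.1 Effect of the Number of States on the System
    4.1.1 Simulation Results with LPCC
    4.1.2 Simulation Results with MFCC
    4.1.3 Simulation Results with LPCC+MFCC
  4.2 Simulation Results for Different Features
Chapter 5 Conclusions and Future Work
  5.1 Conclusions
  5.2 Future Work
References
List of Figures
  Figure 1-1 Basic architecture of speaker recognition
  Figure 2-1 Markov model of three coins
  Figure 2-2 Gaussian mixture model
  Figure 2-3 Dynamic time warping
  Figure 2-4 VQ codewords partitioning a two-dimensional space
  Figure 3-1 Original signal and the signal after band-pass filtering
  Figure 3-2 Frame blocking
  Figure 3-3 Zero-crossing rate
  Figure 3-4 Valid speech segment
  Figure 3-5 Applying the pre-emphasis filter
  Figure 3-6 Hamming window
  Figure 3-7 LPCC computation flowchart
  Figure 3-8 Illustration of cepstrum coefficients
  Figure 3-9 MFCC computation flowchart
  Figure 3-10 Mel-frequency triangular filters
  Figure 3-11 Mel spectrum conversion in MFCC
  Figure 3-12 Hidden Markov model
  Figure 3-13 Forward procedure
  Figure 3-14 Backward procedure
  Figure 3-15 Flowchart for building the speaker model
  Figure 3-16 Initial allocation of frames to states
  Figure 3-17 State transition constraints
  Figure 3-18 Viterbi search for the best path
  Figure 3-19 Uniform division of frames
  Figure 3-20 First reallocation of frames
  Figure 3-21 Second reallocation of frames
  Figure 3-22 Frame allocation at maximum likelihood
List of Tables
  Table 1-1 Performance comparison of biometric features
  Table 3-1 Illustration of channel-effect removal with CMS
  Table 3-2 Hidden Markov model parameters
  Table 3-3 Distribution probabilities of frames and states
  Table 3-4 Cumulative distribution probabilities of frames and states
  Table 4-1 Speaker database details
  Table 4-2 Conditional probabilities for speaker verification
  Table 4-3 Parameters of the speech signal processing algorithms
  Table 4-4 Comparison across numbers of states with LPCC (male)
  Table 4-5 Comparison across numbers of states with LPCC (female)
  Table 4-6 Accuracy and error rates with LPCC features
  Table 4-7 Comparison across numbers of states with MFCC (male)
  Table 4-8 Comparison across numbers of states with MFCC (female)
  Table 4-9 Accuracy and error rates with MFCC features
  Table 4-10 Comparison across numbers of states with LPCC+MFCC (male)
  Table 4-11 Comparison across numbers of states with LPCC+MFCC (female)
  Table 4-12 Accuracy and error rates with LPCC+MFCC features
  Table 4-13 Simulation results for different features (male)
  Table 4-14 Simulation results for different features (female)
  Table 4-15 Simulation results for different featureszh_TW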
dc.language.isoen_USzh_TW
dc.publisher電機工程學系所zh_TW
dc.relation.urihttp://www.airitilibrary.com/Publication/alDetailedMesh1?DocID=U0005-2507200715561600en_US
dc.subjectLinear predictive coding(LPC)en_US
dc.subject線性預測參數zh_TW
dc.subjectMel-Frequency Cepstral Coefficients (MFCC)en_US
dc.subjectHidden Markov Model(HMM)en_US
dc.subject梅爾倒頻譜參數zh_TW
dc.subject隱藏式馬可夫模型zh_TW
dc.title雙特徵值應用於特定文字之語者驗證系統zh_TW
dc.titleDouble Feature Extraction for Text Dependent Speaker Verification Systemen_US
dc.typeThesis and Dissertationzh_TW
item.openairecristypehttp://purl.org/coar/resource_type/c_18cf-
item.openairetypeThesis and Dissertation-
item.cerifentitytypePublications-
item.fulltextno fulltext-
item.languageiso639-1en_US-
item.grantfulltextnone-
Appears in Collections:電機工程學系所
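Note: the abstract above mentions delta-cepstral coefficients and a combined LPCC-MFCC feature, but this record does not say how the thesis computes or merges them. The following is only a minimal sketch under stated assumptions: the delta uses the common regression formula over ±2 frames, and the merge is plain per-frame concatenation. The function names `delta` and `combine_features` and all dimensions are hypothetical, not taken from the thesis.

```python
import numpy as np


def delta(features, width=2):
    """First-order (delta) coefficients for a (n_frames, n_coeffs) array.

    Assumption: standard regression formula over +/- `width` frames,
    with the first and last frames repeated at the edges.
    """
    features = np.asarray(features, dtype=float)
    n_frames = features.shape[0]
    # Repeat edge frames so every frame has `width` neighbours on each side.
    padded = np.pad(features, ((width, width), (0, 0)), mode="edge")
    denom = 2.0 * sum(k * k for k in range(1, width + 1))
    out = np.zeros_like(features)
    for k in range(1, width + 1):
        ahead = padded[width + k : width + k + n_frames]    # frame t + k
        behind = padded[width - k : width - k + n_frames]   # frame t - k
        out += k * (ahead - behind)
    return out / denom


def combine_features(lpcc, mfcc):
    """Concatenate per-frame LPCC and MFCC vectors (assumed merge rule)."""
    lpcc = np.asarray(lpcc, dtype=float)
    mfcc = np.asarray(mfcc, dtype=float)
    if lpcc.shape[0] != mfcc.shape[0]:
        raise ValueError("LPCC and MFCC must cover the same number of frames")
    return np.hstack([lpcc, mfcc])
```

For example, with hypothetical arrays of 12 LPCC and 12 MFCC coefficients per frame, `combine_features(lpcc, mfcc)` would yield a 24-dimensional vector per frame, which could then serve as the observation sequence for an HMM.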
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.