Please use this identifier to cite or link to this item: http://hdl.handle.net/11455/19906
Title: 利用規則轉換學習方法解決台語一詞多音之歧義
A Transformation Based Learning Approach to the Polysemy Problems in a Chinese to Taiwanese TTS System
Author: 楊文德
Yang, Wen-De
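Keywords: Transformation-based learning
Polysemy (one word with multiple pronunciations)
Taiwanese text-to-speech (TTS) system
Supervised learning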
Publisher: Department of Computer Science and Engineering
Abstract: This thesis proposes a transformation-based learning (TBL) approach to the polysemy problem, in which one word has several possible pronunciations, in a Chinese-to-Taiwanese text-to-speech (TTS) system. TBL is a typical supervised learning method that is widely used for part-of-speech (POS) tagging, a basic task in natural language processing. Polysemy has long been a pressing problem in Chinese-to-Taiwanese TTS systems, and the literature indicates that it is far more frequent and more complex in Taiwanese than the comparable polyphone problem in Mandarin. Through repeated learning iterations, TBL derives the transformation rules that best fit the training corpus, and these rules are used to resolve the pronunciation of polysemous Taiwanese words.
The experiments target six common words: “你” (you), “我” (I), “他” (he/she), “上” (up), “下” (down), and “不” (no). The results show that the proposed approach achieves high prediction accuracy for all six words. For “他”, “上”, and “不”, the accuracy is higher than that of the methods reported in the literature: 94.19%, 90.51%, and 76.84%, respectively.
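The iterative rule learning described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the rule template (one neighboring word on either side), the placeholder labels `R1` and `R2`, the tokens `a`, `b`, `c`, `x`, `y`, and all function names are assumptions made for illustration, and the toy data is invented rather than taken from the thesis corpus.

```python
from collections import Counter

# A rule is (position, neighbor_word, from_label, to_label): "if the word at
# `position` is `neighbor_word` and the current label is `from_label`,
# change the label to `to_label`". This template is an assumption.

def rule_applies(rule, context, label):
    position, neighbor_word, from_label, _ = rule
    return label == from_label and context[position] == neighbor_word

def net_gain(rule, contexts, current, gold):
    """Errors fixed minus errors introduced if the rule were applied now."""
    to_label = rule[3]
    gain = 0
    for context, label, truth in zip(contexts, current, gold):
        if rule_applies(rule, context, label):
            if truth == to_label:
                gain += 1   # a wrong label becomes correct
            elif truth == label:
                gain -= 1   # a correct label becomes wrong
    return gain

def learn_rules(contexts, gold, min_gain=1):
    # Initial state: label every occurrence with the most frequent reading.
    baseline = Counter(gold).most_common(1)[0][0]
    current = [baseline] * len(gold)
    rules = []
    while True:
        # Propose candidate rules only from the remaining errors.
        candidates = sorted({
            (position, context[position], label, truth)
            for context, label, truth in zip(contexts, current, gold)
            if label != truth
            for position in ("prev", "next")
        })
        if not candidates:
            break
        best = max(candidates,
                   key=lambda r: net_gain(r, contexts, current, gold))
        if net_gain(best, contexts, current, gold) < min_gain:
            break
        rules.append(best)
        current = [best[3] if rule_applies(best, c, l) else l
                   for c, l in zip(contexts, current)]
    return baseline, rules

def predict(baseline, rules, context):
    # Apply the learned rules in the order they were learned.
    label = baseline
    for rule in rules:
        if rule_applies(rule, context, label):
            label = rule[3]
    return label

# Invented toy data: each context holds the words before and after the target.
contexts = [
    {"prev": "a", "next": "x"},
    {"prev": "a", "next": "y"},
    {"prev": "b", "next": "x"},
    {"prev": "b", "next": "y"},
    {"prev": "c", "next": "x"},
]
gold = ["R1", "R1", "R2", "R2", "R1"]

baseline, rules = learn_rules(contexts, gold)
print(baseline, rules)
print(predict(baseline, rules, {"prev": "b", "next": "z"}))
```

In this sketch each iteration keeps the single rule with the largest net gain on the training data, and learning stops once no candidate improves the labeling by at least `min_gain`. A real system would presumably use richer rule templates, such as part-of-speech tags or wider context windows.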
URI: http://hdl.handle.net/11455/19906
Appears in Collections: Department of Computer Science and Engineering

The full text is available from Airiti Library (華藝線上圖書館).



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.