Title: A Chinese Textual Entailment System Using Ensemble Learning Algorithms (一個使用整體學習演算法的中文文字蘊涵系統)
Author: Chung-Lin Tsao
Keywords: Textual Entailment; Support Vector Machine; Decision Tree; Logistic Regression
Abstract: Textual entailment concerns the logical relationship between two sentences in natural language processing. The relationship falls into four types: forward, bidirectional, contradiction, and independence. Recognizing Textual Entailment (RTE) was proposed in 2004 as a generic task that captures the major semantic inference needs of many natural language processing applications. Mathematical approaches to establishing textual entailment can exploit the directional nature of the relation by comparing directional similarity measures between the texts involved. In this thesis, we propose a system based on ensemble learning that combines three machine learning methods (support vector machines, decision trees, and logistic regression) with manually selected rules of thumb. The four methods vote to produce the prediction, and each method's weight is adjusted according to whether its vote was correct. In the experiments, we use the training data provided by NTCIR-9 RITE. With six selected features (noun count difference, verb count difference, negation word count difference, word edit distance, synonyms, and word order exchange), the system reaches an accuracy of 75.77% on multi-class textual entailment recognition for Traditional Chinese.
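The weighted voting described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state how the weights are adjusted, so the multiplicative penalty below (a wrong voter's weight is multiplied by `penalty`) is an assumption in the style of the weighted majority algorithm, not the thesis's actual rule. The voter names (`svm`, `tree`, `logreg`, `rules`), the function names, the tie-breaking order, and the example votes are all hypothetical.

```python
from collections import defaultdict

# The four entailment relations named in the abstract.
LABELS = ("forward", "bidirectional", "contradiction", "independence")


def weighted_vote(votes, weights):
    """Return the label with the largest total weight among the votes.

    votes:   dict mapping voter name -> predicted label
    weights: dict mapping voter name -> current weight
    """
    totals = defaultdict(float)
    for voter, label in votes.items():
        totals[label] += weights[voter]
    # Ties are broken by the fixed order in LABELS (an arbitrary choice).
    return max(LABELS, key=lambda label: totals[label])


def update_weights(votes, weights, truth, penalty=0.5):
    """Return new weights; voters that were wrong are scaled by `penalty`.

    Assumed update rule: the source only says that weights are adjusted
    according to whether each vote was correct.
    """
    return {
        voter: weight * (1.0 if votes[voter] == truth else penalty)
        for voter, weight in weights.items()
    }


# Hypothetical example: three classifiers plus the hand-written rules.
weights = {"svm": 1.0, "tree": 1.0, "logreg": 1.0, "rules": 1.0}
votes = {
    "svm": "forward",
    "tree": "contradiction",
    "logreg": "forward",
    "rules": "independence",
}

prediction = weighted_vote(votes, weights)
new_weights = update_weights(votes, weights, truth="forward")
```

In this example `svm` and `logreg` together outweigh the other two voters, so the prediction is `forward`. Once the true label is known, the two voters that disagreed with it carry half as much weight in the next vote.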
Other identifier: U0005-1007201516395500
Date made public: 2018-07-15
Appears in Collections: 資訊科學與工程學系所 (Department of Computer Science and Engineering)


