Title: 使用兩階段機器學習的自然語言查詢系統
A Natural Language Query System base on two steps' Machine Learning
Author: 歐庭銨
Ting-An Ou
Keywords: 自然語言查詢系統; 實體比對; 查詢語法產生; 機器學習; Natural Language Query; Entity Mapping; SPARQL Generation; Machine Learning

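Among the steps of a natural language query system, the traditional approach to entity mapping is either to match different entities according to part of speech, or to convert the question into a graph and match entities according to lexical rules. The traditional approach to query generation is to generate the query from graph structures such as dependency trees and syntactic parse trees. We use MEMM in a two-stage training process, attempting to solve the problems of the entity mapping step and the query generation step in the respective stages.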

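This study trains on the multilingual training set of QALD-7, from the Question Answering over Linked Data (QALD) challenge, and evaluates on its test set. Compared with other studies that also use the QALD-7 multilingual test set, our system performs well: Precision, Recall, and F1-measure all reach 68%, the best result among all English query systems evaluated on the QALD-7 multilingual test set.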

With the rapid development of artificial intelligence, a growing number of researchers are looking for ways to let machines understand natural language so that these technologies can be used effectively, and natural language query systems have become an active research field. A natural language query system encounters different problems at each step of a query: for example, the Entity Mapping step must find the entity corresponding to each word, and the SPARQL Generation step must find the relationship between each pair of entities.

For these two steps, the traditional approach to Entity Mapping is either to map words to different entities according to their part of speech, or to transform the question into a DAG and map entities by lexical rules. The traditional approach to SPARQL Generation is to generate the SPARQL query from a parse tree or dependency tree. We use MEMM in a two-stage training process, attempting to solve the problems of the Entity Mapping step and the SPARQL Generation step in the respective stages.
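
The two steps named above can be illustrated with a minimal sketch. It is not the method of this thesis: a plain dictionary lookup stands in for the trained Entity Mapping model, and a single fixed triple pattern stands in for the trained SPARQL Generation model. The lexicon, the `example.org` URIs, and the function names `map_entities` and `generate_sparql` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical lexicon: surface word -> URI. The example.org URIs are
# placeholders, not identifiers from any real knowledge base.
LEXICON: Dict[str, str] = {
    "berlin": "http://example.org/resource/Berlin",
    "mayor": "http://example.org/ontology/mayor",
}


def map_entities(question: str) -> List[Tuple[str, str]]:
    """Step 1 (Entity Mapping): return (word, URI) pairs found in the question.

    A dictionary lookup is used here as a stand-in for a learned model.
    """
    tokens = question.lower().rstrip("?").split()
    return [(tok, LEXICON[tok]) for tok in tokens if tok in LEXICON]


def generate_sparql(resource_uri: str, property_uri: str) -> str:
    """Step 2 (SPARQL Generation): relate two mapped entities in one triple pattern."""
    return (
        "SELECT DISTINCT ?answer WHERE { "
        f"<{resource_uri}> <{property_uri}> ?answer . }}"
    )


if __name__ == "__main__":
    mapped = dict(map_entities("Who is the mayor of Berlin?"))
    print(generate_sparql(mapped["berlin"], mapped["mayor"]))
```

The sketch only shows how the output of the first step feeds the second; deciding which entity acts as subject and which as property is exactly the kind of relationship the SPARQL Generation step has to determine.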

In our research, we use the multilingual dataset of QALD-7, from the Question Answering over Linked Data (QALD) challenge, to train our model and test our system. The results show that our system achieves the best Precision, Recall, and F1-measure in English on the QALD-7 multilingual dataset.
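
For readers unfamiliar with the three reported measures, the following is a minimal sketch of one common way to compute them for question answering: per-question precision, recall, and F1 over answer sets, then averaged across questions. The exact conventions of the QALD-7 evaluation (for example, how empty answer sets are scored) are not stated in this record and may differ; the function names and the zero-score convention for empty sets are assumptions.

```python
from typing import List, Set, Tuple


def precision_recall_f1(gold: Set[str], predicted: Set[str]) -> Tuple[float, float, float]:
    """Score one question by comparing predicted answers with gold answers."""
    correct = len(gold & predicted)
    # Assumption: an empty prediction or gold set scores 0.0 rather than being skipped.
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def macro_average(pairs: List[Tuple[Set[str], Set[str]]]) -> Tuple[float, float, float]:
    """Average the per-question scores over all (gold, predicted) pairs."""
    if not pairs:
        return 0.0, 0.0, 0.0
    scores = [precision_recall_f1(gold, predicted) for gold, predicted in pairs]
    n = len(scores)
    return (
        sum(s[0] for s in scores) / n,
        sum(s[1] for s in scores) / n,
        sum(s[2] for s in scores) / n,
    )


if __name__ == "__main__":
    questions = [({"a"}, {"a"}), ({"a", "b"}, {"a", "c"})]
    print(macro_average(questions))
```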
Rights: Authorized for browsing and printing through the electronic full-text service; publicly available from 2021-08-29.
Appears in Collections: Department of Information Management (資訊管理學系)

Files in This Item:
File: nchu-107-7105029025-1.pdf; Size: 1.11 MB; Format: Adobe PDF; This file is only available in the university internal network