Please use this identifier to cite or link to this item: http://hdl.handle.net/11455/19935
Title: 一個有效率的資源描述框架文件索引方法
An Efficient Indexing Method for Resource Description Framework Document
Author: 林振生
Lin, Cheng-Sheng
Keywords: Resource Description Framework
RDF Indexing
Query Optimization
Publisher: Department of Computer Science and Engineering (資訊科學與工程學系所)
References:
[1] O. Corby, R. Dieng-Kuntz, C. Faron-Zucker, and F. Gandon, “Ontology-based Approximate Query Processing for Searching the Semantic Web with Corese,” INRIA, July 2005.
[2] K. Wilkinson, “Jena property table implementation,” in SSWS, 2006.
[3] O. Lassila and R. Swick (eds.), “Resource Description Framework (RDF) Model and Syntax Specification,” Available: http://www.w3.org/TR/REC-rdf-syntax/
[4] Eric Miller, “An Introduction to the Resource Description Framework,” Bulletin of the American Society for Information Science and Technology, vol. 25, pp. 15-19, 1998.
[5] Daniel J. Abadi, Adam Marcus, Samuel R. Madden, and Kate Hollenbach, “Scalable semantic web data management using vertical partitioning,” in Proceedings of the 33rd International Conference on Very Large Data Bases (VLDB ’07), pp. 411-422, 2007.
[6] Cathrin Weiss, Panagiotis Karras, and Abraham Bernstein, “Hexastore: sextuple indexing for semantic web data management,” Proc. VLDB Endow., vol. 1, pp. 1008-1019, 2008.
[7] George H.L. Fletcher and Peter W. Beck, “Scalable indexing of RDF graphs for efficient join processing,” in Proceedings of the 18th ACM Conference on Information and Knowledge Management, Hong Kong, China, 2009.
[8] Octavian Udrea, Andrea Pugliese, and V.S. Subrahmanian, “GRIN: A graph based RDF index,” in Proceedings of the National Conference on Artificial Intelligence, p. 1465, 2007.
[9] K. Wilkinson, C. Sayers, H. A. Kuno, and D. Reynolds, “Efficient RDF Storage and Retrieval in Jena2,” in Semantic Web and Databases Workshop, pp. 131-150, 2003.
[10] W3C, “SGML,” Available: http://www.w3.org/MarkUp/SGML/
[11] W3C, “XML,” Available: http://www.w3.org/XML/
[12] Marja-Riitta Koivunen and Eric Miller, “W3C Semantic Web Activity,” in the Semantic Web Kick-off Seminar in Finland, Nov. 2, 2001.
[13] Mills Davis, “Semantic Social Computing,” Available: http://semanticommunity.info/@api/deki/files/7436/=MDavis09202007.pdf
[14] David A. Huffman, “A Method for the Construction of Minimum-Redundancy Codes,” in Proceedings of the I.R.E., Sep. 1952.
[15] Wikipedia, “Huffman Coding,” Available: http://en.wikipedia.org/wiki/Huffman_coding
[16] Wikipedia, “B+ tree,” Available: http://en.wikipedia.org/wiki/B%2B_tree
[17] W3C, “RDF,” Available: http://www.w3.org/RDF/
[18] W3C, “SPARQL Query Language for RDF,” Available: http://www.w3.org/TR/rdf-sparql-query/
[19] S. A. McIlraith, T. C. Son, and Zeng Honglei, “Semantic Web services,” Intelligent Systems, IEEE, vol. 16, pp. 46-53, 2001.
[20] Wikipedia, “File:Semantic-web-stack.png,” Available: http://en.wikipedia.org/wiki/File:Semantic-web-stack.png, 2008.
[21] Peter Mika, “Distributed indexing for semantic search,” in Proceedings of the 3rd International Semantic Search Workshop, Raleigh, North Carolina, 2010.
[22] Chang Liu, Haofen Wang, Yong Yu, and Linhao Xu, “Towards Efficient SPARQL Query Processing on RDF Data,” Tsinghua Science & Technology, vol. 15, pp. 613-622, 2010.
[23] Bastian Quilitz and Ulf Leser, “Querying distributed RDF data sources with SPARQL,” The Semantic Web: Research and Applications, pp. 15, 2008.
[24] Sherif Sakr and Ghazi Al-Naymat, “Relational processing of RDF queries: a survey,” SIGMOD Rec., vol. 38, pp. 23-28, 2010.
[25] M. F. Husain, L. Khan, M. Kantarcioglu, and B. Thuraisingham, “Data Intensive Query Processing for Large RDF Graphs Using Cloud Computing Tools,” in 2010 IEEE 3rd International Conference on Cloud Computing (CLOUD), pp. 1-10, 2010.
[26] Lehigh University, “LUBM Query,” Available: http://swat.cse.lehigh.edu/projects/lubm/query.htm
[27] “UniProt Dataset,” Available: http://www.uniprot.org/
[28] “ChefMoz Dataset,” Available: http://www.dmoz.org/search/?q=restaurants
[29] Lehigh University, “LUBM UBA Generator,” Available: http://swat.cse.lehigh.edu/projects/lubm/
[30] Ferhat Yildiz, “HuffmanTree Source Codes,” Available: http://snipd.net/huffman-coding-in-c
[31] Lehigh University, “Ontology Sample,” Available: http://www.lehigh.edu/~zhp2/2004/0401/univ-bench.owl
[32] LUBM, “LUBM Test Queries,” Available: http://swat.cse.lehigh.edu/projects/lubm/queries-sparql.txt
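Abstract: As the volume and complexity of Web resources grow daily, it becomes increasingly difficult for computers to analyze the data. A suitable tool is therefore needed to describe the attributes of these data and the relationships among them, and the Resource Description Framework (RDF), designed specifically for describing Web resources, has come into wide use. To speed up queries over RDF documents, a common approach is to add an indexing mechanism. Among existing studies on SPARQL query processing and RDF indexing, however, there is still no integrated solution that both saves index space and answers queries quickly. This thesis proposes an indexing and query method for RDF documents. It uses Huffman coding, a lossless compression technique, to encode the elements of an RDF document as binary codes according to their frequency of occurrence, which further reduces the index size. To improve query performance, the length of an element's Huffman code is used to determine its frequency of occurrence and to filter the data, which improves the performance of the Sort Merge Join operation. Experimental results show that, compared with the TripleT method, the proposed method saves about 76% of the index space and answers a number of queries from about 30 times to tens of thousands of times faster. The proposed method thus effectively reduces the index size and substantially reduces query time.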
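To make the encoding step concrete, the following is a minimal sketch of frequency-based Huffman coding applied to RDF terms. It is the textbook algorithm, not the thesis's implementation, which this record does not describe; the function name `huffman_codes`, the sample triples, and the `ex:` prefixes are all assumptions made for illustration.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_codes(freqs):
    """Return {symbol: bitstring} for a {symbol: frequency} mapping."""
    if not freqs:
        return {}
    if len(freqs) == 1:
        # A single symbol still needs one bit.
        return {next(iter(freqs)): "0"}
    tiebreak = count()  # unique counter so the heap never compares dicts
    heap = [(f, next(tiebreak), {sym: ""}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees, prefixing their codes.
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]


# Hypothetical sample data: (subject, predicate, object) triples.
triples = [
    ("ex:alice", "rdf:type", "ex:Student"),
    ("ex:bob", "rdf:type", "ex:Student"),
    ("ex:carol", "rdf:type", "ex:Professor"),
    ("ex:alice", "ex:advisor", "ex:carol"),
    ("ex:bob", "ex:advisor", "ex:carol"),
]

# Count how often each term occurs, then assign codes by frequency.
term_freqs = Counter(term for triple in triples for term in triple)
codes = huffman_codes(term_freqs)

# Each triple becomes a tuple of bit strings that an index could store.
encoded = [tuple(codes[t] for t in triple) for triple in triples]
```

Because a Huffman code never gives a more frequent term a longer code than a less frequent one, `len(codes[term])` can serve as a rough indicator of how common a term is. How the thesis uses code length to filter data before joining is not detailed in this record.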
With the rapid growth of resources on the Internet, software is needed that can automatically store and exchange machine-readable information distributed across the Web. The Resource Description Framework (RDF), published by the W3C in 1999 as a recommendation for the Semantic Web, is a mechanism for describing resources on the Web. This thesis proposes an indexing method that supports efficient query processing over RDF documents. The method reduces the index size by using Huffman coding, which encodes RDF elements in binary form according to their frequencies. A new algorithm is also proposed to speed up the sort-merge join operations performed during query processing. Experimental results show that, compared with TripleT, the proposed method reduces the index size of several RDF documents by about 76% on average and speeds up queries by a factor ranging from 30 to tens of thousands.
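For readers unfamiliar with the join operation mentioned in the abstract, here is a minimal sketch of a standard sort-merge join over two triple-pattern results that share a variable. It shows only the baseline operation, not the new algorithm proposed in the thesis; the function name `sort_merge_join` and the sample bindings are assumptions made for illustration.

```python
def sort_merge_join(left, right):
    """Join two lists of (key, value) pairs on key.

    Returns a list of (key, left_value, right_value) tuples.
    """
    left = sorted(left, key=lambda kv: kv[0])
    right = sorted(right, key=lambda kv: kv[0])
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of equal keys on each side.
            i_end = i
            while i_end < len(left) and left[i_end][0] == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            # Emit every pairing within the two runs.
            for _, lv in left[i:i_end]:
                for _, rv in right[j:j_end]:
                    out.append((lk, lv, rv))
            i, j = i_end, j_end
    return out


# Hypothetical bindings for two patterns joined on ?x:
#   ?x rdf:type ex:Student      and      ?x ex:advisor ?y
students = [("ex:bob", "ex:Student"), ("ex:alice", "ex:Student")]
advisors = [("ex:alice", "ex:carol"), ("ex:bob", "ex:carol"),
            ("ex:dave", "ex:erin")]
result = sort_merge_join(students, advisors)
```

Both inputs are scanned once after sorting, so the cost is dominated by the sort unless the index already stores the bindings in key order.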
URI: http://hdl.handle.net/11455/19935
Other identifiers: U0005-2307201200344900
Article link: http://www.airitilibrary.com/Publication/alDetailedMesh1?DocID=U0005-2307201200344900
Appears in Collections: Department of Computer Science and Engineering (資訊科學與工程學系所)

Files in this item:

For the full text, please visit the Airiti Library (華藝線上圖書館).



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.