Please use this identifier to cite or link to this item:
標題: 一個可有效支援XML圖形結構連接的彈性編碼機制
A Flexible Labeling Scheme for Efficient Structural Join Over XML Graph
作者: 周健平 
Chou, Chien-Ping 
關鍵字: XML;XML 文件;Structural join;Labeling scheme;結構連接;編碼系統
出版社: 資訊科學研究所
查詢 XML 文件最重要的一個操作,是找出標籤元素之間所存在的結構關係。執行這樣的結構連接(structural join)需要先將 XML 文件當中的節點作編碼。一旦每個節點都按照以區間為基底(interval-based)的系統來完成編碼,則節點之間的結構關係即可以在常數時間內很容易地決定出來。然而本研究指出傳統的方法皆僅能用於樹狀結構的資料,並不適用於任意圖形結構的XML 文件。
本論文的主要貢獻在於提出一種新的區間基底編碼系統,易於圖形結構的 XML 資料中配對出結構關係。根據 e-node 的概念,每一個 XML 節點都按照演算法 G_encoding 被適當地編碼(選擇性地帶著一個E-List),目的是在解決多重繼承(multiple-inheritance)以及迴圈路徑(cyclic-paths)的問題。我們對編碼系統的特性作做了正式的分析,並開發出一個有效率的結構連接演算法 SJG。新的編碼系統亦可以容易地適用於現有的索引技術來加速演算法 SJG 當中的結構關係比對。針對XML文件增加節點而不需要對原有的節點做重新編碼的問題上,我們進一步提出了一個新穎的編碼系統與演算法。

The most important operation of querying XML documents is finding occurrences of structural relationships between tagged elements. Performing such structural joins needs to encode nodes in XML documents. Once each node is labeled by the interval-based labeling scheme, structural relationships between nodes can be determined easily in constant time. However this study shows that traditional approaches work only for tree-structured data and cannot be applied to XML documents being modeled as arbitrary graphs.
The main contribution of this thesis is the solution of matching structural relationships in graph-structured XML data, under a new interval-based numbering scheme. Based on the idea of e-nodes, each XML node can be labeled (with an optional E-List) appropriately by the G_encoding algorithm for solving the multiple-inheritance and cyclic-path problems. The properties of our labeling scheme are formally analyzed. We also develop an efficient structural join algorithm SJG using the labeling scheme. The new labeling scheme can also be easily adapted to existing index techniques to speed up matching structural relationships in Algorithm SJG. For the problem of inserting nodes at arbitrary positions in XML data without renumbering the existing nodes, we additionally propose a novel labeling scheme and algorithm.
Appears in Collections:資訊科學與工程學系所

Show full item record

Google ScholarTM


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.