Title: An Efficient Indexing and Compressing Scheme for XML Query and Update Processing
Author: 廖宜恩
Keywords: Information science -- software
XML indexing
Structural summary index
XML Query
XML update operations
Abstract: With the widespread deployment of business-to-business (B2B) e-commerce, XML has become the standard format for data exchange on the Internet. Processing queries efficiently over a rapidly growing number of XML documents is therefore an important research issue. Without efficient indices, XML query languages such as XQuery and XPath, which use label paths to traverse semi-structured data, can be quite time-consuming because they traverse the XML data exhaustively. Various indexing techniques have been proposed in the literature. However, they suffer, to varying degrees, from some of the following problems. First, some indexing methods require very large index structures, which can be bigger than the original XML document. Second, some require a long time to build the index in order to minimize the size of the index structures. Third, some cannot support complex queries efficiently. Finally, most published indexing methods do not support efficient, dynamic updates such as insertion, deletion, and modification. To overcome these problems, we propose a novel indexing method that uses hash-based tables and arrays to build indices. The proposed method not only compresses XML documents at a high compression rate but also supports various queries efficiently without accessing the original XML documents. Another important feature of this method is its low cost of update processing. Without rebuilding the index, it can support the update operations defined in the W3C XQuery Update Facility 1.0 by making small changes to the index structures.
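XML has become the standard for data communication and data exchange on the Internet. As more and more documents are written in XML, supporting fast user queries has become an important research topic. Many XML indexing methods and techniques have been proposed in the literature, but they all suffer, to some degree, from the following shortcomings. Many indexing methods require a large amount of index space; for certain data sets, the index structure is even larger than the original XML document. Some methods take a long time to build the index, which is impractical if every query must wait for index construction. Some methods suit only certain types of query structure and cannot efficiently handle partial-matching queries, branching queries, queries that specify content values, or more complex combinations of these. More importantly, most indexing techniques do not support document updates. This research project proposes a new indexing method that attempts to solve these problems. We use a structural summary index to compress the XML document, and then use a simple data structure, the hash table, to cluster nodes that share the same tag name, which saves space and gives fast access to the data. This index structure can replace the original XML document, so queries of every type can be answered without referring to the original document. In addition, processing the update operations defined by the W3C requires changing only a small part of the index structure or its content, so updates are achieved at very low cost.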
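As a rough illustration of the general idea described in the abstract (a structural summary of label paths combined with a hash table that clusters nodes by tag name), the following minimal Python sketch may help. It is not the project's actual index structure: the function names `build_index` and `query`, the tuple layout, and the sample document are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_index(xml_text):
    """Hypothetical helper: return (paths, by_tag) for an XML string.

    paths  : label path -> path id (a simple structural summary)
    by_tag : tag name -> list of (path id, text value) entries
    """
    root = ET.fromstring(xml_text)
    paths = {}
    by_tag = defaultdict(list)

    def visit(elem, prefix):
        label_path = prefix + "/" + elem.tag
        # Each distinct label path is stored once, however often it occurs.
        path_id = paths.setdefault(label_path, len(paths))
        # Nodes with the same tag name are clustered in one hash bucket.
        by_tag[elem.tag].append((path_id, (elem.text or "").strip()))
        for child in elem:
            visit(child, label_path)

    visit(root, "")
    return paths, by_tag


def query(paths, by_tag, path_suffix):
    """Return text values of nodes whose label path ends with path_suffix."""
    tag = path_suffix.rsplit("/", 1)[-1]
    wanted = {pid for path, pid in paths.items()
              if path.endswith("/" + path_suffix)}
    # Only the bucket for the last tag is scanned; the document is not re-read.
    return [text for pid, text in by_tag.get(tag, []) if pid in wanted]


doc = ("<lib><book><title>A</title></book>"
       "<book><title>B</title></book>"
       "<cd><title>C</title></cd></lib>")
paths, by_tag = build_index(doc)
print(query(paths, by_tag, "book/title"))  # ['A', 'B']
```

In this sketch a query touches only the summary and one hash bucket, and inserting a new node would append a single entry to one bucket, which hints at why such a layout can keep updates cheap. Ordering, deletion, and compression of values are left out.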
Other identifiers: NSC99-2221-E005-080
Appears in Collections: Department of Computer Science and Engineering (資訊科學與工程學系所)


