Please use this identifier to cite or link to this item: http://hdl.handle.net/11455/19895
Title: A Mechanism for XML Multi-query Processing in Cloud Computing (一個適用於雲端運算的XML文件多查詢處理機制)
Author: Yoong, Jun-Yao (翁俊堯)
Keywords: Cloud computing; MapReduce; XML multi-query; signature technique
Publisher: Department of Computer Science and Engineering
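Abstract:
The era of cloud computing has arrived. Cloud systems offer massive computing power and storage; service providers build large data centers and host a wide range of software applications, so users can draw on large-scale computing power and up-to-date software at any time. As a result, software services, data analysis, and similar workloads will increasingly run in the cloud, and a client only needs a network connection to the cloud system to use its service resources. XML is a standard format for data exchange and data storage, and application services in the cloud must serve the requests of many users simultaneously while processing the large numbers of XML documents stored there. Processing large XML document collections together with multiple queries has therefore become an important research topic. Most traditional XML multi-query methods are designed to process a single XML document, which makes them unsuitable for cloud environments.
To address this setting, we propose SigMQ (Signature-based Multi-Query evaluation mechanism), an XML multi-query processing mechanism that applies signature techniques on top of the MapReduce programming model and can handle queries from many users over large numbers of XML documents in the cloud. SigMQ uses MapReduce to process large numbers of XML documents in parallel and uses signature techniques to speed up MapReduce's handling of client queries. These design choices give SigMQ better performance when processing large numbers of XML documents and queries in a cloud environment. Finally, we conduct a series of experiments to evaluate SigMQ's multi-query processing performance; the results show that SigMQ achieves better query performance than a multi-query processing mechanism without signature techniques when handling large numbers of XML documents and queries.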
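The abstract does not describe how SigMQ builds or compares its signatures, so the following is only a minimal sketch of the general idea of signature pre-filtering inside a map step, not the thesis's actual algorithm. It assumes a superimposed-coding signature over element names, simple child/descendant path queries without wildcards or predicates, and an in-memory stand-in for the MapReduce runtime. All names (`SIG_BITS`, `name_bit`, `map_document`, `reduce_matches`) are hypothetical.

```python
import hashlib
import re
import xml.etree.ElementTree as ET

SIG_BITS = 64  # hypothetical signature width, chosen only for illustration


def name_bit(name):
    """Hash an element name to a single bit position in the signature."""
    digest = hashlib.md5(name.encode("utf-8")).digest()
    return 1 << (int.from_bytes(digest[:4], "big") % SIG_BITS)


def doc_signature(root):
    """OR together the bits of every element name that occurs in the document."""
    sig = 0
    for elem in root.iter():
        sig |= name_bit(elem.tag)
    return sig


def query_signature(path):
    """OR together the bits of every element name mentioned in a simple path."""
    sig = 0
    for name in re.findall(r"[A-Za-z_][\w.-]*", path):
        sig |= name_bit(name)
    return sig


def map_document(doc_id, xml_text, queries):
    """Map step: yield (query_id, doc_id) for each query the document matches."""
    root = ET.fromstring(xml_text)
    dsig = doc_signature(root)
    for qid, (path, qsig) in queries.items():
        if (dsig & qsig) != qsig:
            continue  # an element name required by the query is absent: skip evaluation
        if root.findall(path):  # full evaluation only for documents that pass the filter
            yield qid, doc_id


def reduce_matches(pairs):
    """Reduce step: group matching document ids by query id."""
    grouped = {}
    for qid, doc_id in pairs:
        grouped.setdefault(qid, []).append(doc_id)
    return grouped


# Usage: three queries evaluated over two small documents.
docs = {
    "d1": "<lib><book><title>A</title></book></lib>",
    "d2": "<lib><paper><author>B</author></paper></lib>",
}
paths = {"q1": ".//book/title", "q2": ".//paper/author", "q3": ".//book/author"}
queries = {qid: (p, query_signature(p)) for qid, p in paths.items()}

pairs = [kv for doc_id, text in docs.items() for kv in map_document(doc_id, text, queries)]
result = reduce_matches(pairs)
```

In this sketch the signature test can only produce false positives (through hash collisions), never false negatives, because a document that matches a simple path must contain every element name in that path. The full path evaluation after the filter keeps the final result exact.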
URI: http://hdl.handle.net/11455/19895
Appears in Collections: Department of Computer Science and Engineering

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.