Please use this identifier to cite or link to this item:
標題: 具有彈性時間限制因素的頻繁子圖探勘
Frequent Subgraph Mining with Flexible Timing Constraints
作者: 張琇婷
Chang, Hsiu-Ting
關鍵字: graph mining
frequent subgraph
出版社: 資訊科學與工程學系所
引用: 中文文獻 [1]. 徐俊傑、余翠芬,漸進式頻繁子圖型樣探勘(IMFG: Incremental Mining of Frequent Subgraph Patterns),國立台灣科技大學資訊管理研究所碩士論文,2005。 [2]. 廖宜恩、蘇雅雯,具有時間限制因素的頻繁子圖探勘(FSMTC: Frequent Subgraph Mining with Timing Constraints),國立中興大學資訊科學研究所碩士論文,2007。 英文文獻 [3]. R.Agrawal and R.Srikant, “Mining Sequential patterns,” In proc. Of the 7th International Conference on Data Engineering ,pp3-14 1995 [4]. H. Y. Ando, L. Dehaspe, W.Luyten,E. V. Craenenbroeck, H. Vandecasteele, L. V. Meervelt, “Discovering h-bonding rules in crystals with inductive logic programming,” Molecular Pharmaceutics,3(6),pp.665-674,December 2006. [5]. C. Borgelt and M. R. Berhold, “Mining molecular fragments: finding relevant substructures of molecules,” In Proc. Of 2002 IEEE International Conference on Data Mining(ICDM),pp.51-58,2002 [6]. M. Cohen, and E. Gudes, “Diagonally subgraphs pattern mining,” In Proc. Of the 9th ACM SIGMOD Workshop on Research issues in data mining and knowledge discovery, pp51-58,2004 [7]. D. Cook, and L. Holder, “Substructure discovery using minimum description length and background knowledge,” Journal of Artificial Intelligence Research, vol.1, pp.231-255, 1994. [8]. L. Dehaspe, H. Toivonen, and R. D. King, “Finding frequent substructures in chemical compounds,” In proc. Of the 4th International Conference on Knowledge Discovery and Data Mining(KDD1998),pp.30-26,199 [9]. L.Dehaspe, and H. Toivonen, “Discovery of frequent datalog patterns,” Data Mining and Knowledge Discovery,Vol.3 no.1,pp7-36,1999 [10]. J. Huan, W. Wang, and J. Prins, “Efficient mining of frequent subgraph in the presence of isomorphism,” In Proc. of the 3rd IEEE International Conference on Data Mining (ICDM), pp.549-552, 2003. [11]. J. Huan, W. Wang, and J. Prins, “Efficient mining of frequent subgraph in the presence of isomorphism,” Technical Reports produced by the Department of Computer Science at the University of North Carolina, 2003. [12]. J. Huan, W. Wang, and J. Prins, “SPIN: Mining maximal frequent subgraphs from graph databases,” In Proc. of the 2004 Conference on Knowledge Discovery and Data Mining (SIGKDD2004), pp.581-586, 2004. [13]. Ricki G. Ingalls, Douglas J. Morrice, “PERT Scheduling With Resources using Qualitative Simulation Graphs,” Simulation Conference Proceedings, 2000.,pp:362 - 370 vol.1 2000 [14]. A. Inokuchi, T. Washio, and H. Motoda, “An apriori based algorithm for mining frequent substructures from graph data,” In Proc. of the 4th European Conference on Principles of Data Mining and Knowledge Discovery, pp.13-23, 2000. [15]. A. Inokuchi,T. Washio, T.Okada, and H.Motoda, “Applying the apriori-based graph mining method to mutagenesis data analysis”, Journal of Computer Aided Chemistry,pp.87-92,2001 [16]. A. Inokuchi, T. Washio, K. Nishimura, and H. Motoda, “A fast algorithm for mining frequent connected subgraphs,” Research Report RT0448, IBM Research,Tokyo Research Laboratory, 2002. [17]. S. Kramer, L. D. Raedt, and C. Helma, “Molecular feature mining in HIV data,” In proc. Of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining(KDD2001),pp.136-148,200 [18]. M. Kuramochi and G. Karypis, “Frequent subgraph discovery,” In Proc. of the 2001 International Conference on Data Mining (ICDM2001), pp.313-320, 2001. [19]. M. Kuramochi and G. Karypis, “Discovering frequent geometric subgraphs,” In Proc. of the 2002 International Conference on Data Mining (ICDM2002), pp.258-265, 2002. [20]. M. Kuramochi, and G. Karypis, “Finding frequent patterns in a large sparse graph,” Data Mining and Knowledge Discovery, Volume 11, Number 3, pp.243-271(29), November 2005 [21]. S. Nijssen, and J. N. Kok, “Frequent graph mining and its application to molecular databases,” In Proc. Of the 2004 IEEE conference on Systems, Man and Cybernetics, SMC2004,pp.4571-4577,October 2004. [22]. Pang-Ning Tan, Michael Steinbach, and Vipin Kumar, ”Introduction to Data Mining,” Addison-Wesley, 2006. [23]. X. Yan and J. Han, “gSpan: graph-based substructure pattern mining,” In Proc. of the 2002 International Conference on Data Mining (ICDM), pp.721-724, 2002. [24]. X. Yan and J. Han, “CloseGraph: mining closed frequent graph patterns,” In Proc. of the 9th ACM SIGKDD International Conference on Knowledge discovery and data mining, pp.286-295, 2003. [25]. K. Yoshida, H. Motoda, and N. Indurkhya, “Graphbased induction as a unified learning framework,” Journal of Applied Intelligence, vol.4, no??, pp.297-328, 1994.
摘要: 在本篇論文中我們提出探勘具時間限制的頻繁子圖問題。一般常見的圖形探勘演算法是利用圖形去摸擬真實資料庫中的物件,它們能夠從龐大的圖形資料中找出頻繁子圖並利用此資訊來解決相關重要的問題,例如生物資訊、化學分析、社會網絡流程分析。然而,這些演算法卻無法解決有關時間需求的相關應用,其原因在於這些演算法的假設全建立在無時間限制的資料上。事實上,在一個資料中兩個物件常常需要時間的限制來確立其是否合乎要求,例如建立專案時所用的PERT排程。而為了能夠解決此類問題,我們在圖形上利用了不同的時間定義來摸擬此類情況。 核心製程指的是零組件在生產時之某段製造流程,我們以圖形的概念來解釋核心製程指的便是頻繁子圖。本篇論文在頻繁子圖探勘時加上了時間限制的因素,來加強在具有時戳性質的圖形資料集裡作圖形探勘,以打破目前傳統頻繁子圖演算法中未考量時間限制因素的子圖探勘。 在本論文中提出了區間間隙的概念,主要即是由固定的Min_Gap、Max_gap將之修改調整為彈性的Min_Gap與Max_gap,此區間間隙彈性的地方即在於圖形資料集裡的任兩個節點所串連的邊皆會有不同的區間間隙,也就是一個具有n個邊的圖形裡,將會有n個不同的區間間隙值,也就是Min_Gap與Max_gap,這樣的作法對於具有時戳性質的圖形探勘將更有效益
In this paper we introduce a problem of discovering frequent subgraph with flexible time constraints. Current graph mining algorithms use graphs to model objects in a real data set. They can discover frequent subgraphs from a large graph data and be used for solving various important problems, such as bioinformatics, cheminformatics, social network flow analysis. Nevertheless, these approaches may not satisfy the time requirement of problems because they assume that the mining is performed in graph data without time constraints. In reality, a data set often needs to specify time gaps between two adjacent objects to qualify itself for the term of validity. For example, the PERT scheduling with resources problem. Thus, we use various definitions about time constraints in the graph data for simulating the time restriction of objects in the real data and present FSMTC algorithm to mine frequent subgraphs from the graph data. The core process can be modeled by a frequent sub-graph of the graph concept, where the core process denotes some procedures of the assembly production. In the paper, we add the factor of timing constraints into the problem of frequent sub-graph mining to enhance the traditional graph mining research. We proposed the interval-gap concept which can flexibly modify time constraints (Min_Gap/Max_gap) in a graph data set. It allows each edge of a graph data set to be with different interval-gap value. For example, given a graph with n edges, the interval-gap concept provides n interval-gap values, Min_Gap and Max_gap, for these edges correspondingly. By incorporating this concept into the graph mining, discovering frequent sub-graph with time constraints can be more effective.
其他識別: U0005-0108200809395400
Appears in Collections:資訊科學與工程學系所



Show full item record
TAIR Related Article

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.