Please use this identifier to cite or link to this item: http://hdl.handle.net/11455/19264
Title: A Top-Down Mining Algorithm for Discovering Hybrid Sequential Patterns
Author: 林克仲
Lin, Ke Chung
Keywords: hybrid sequential patterns; data mining; database
Publisher: 資訊科學研究所 (Institute of Computer Science)
Abstract:
In this thesis we propose a new top-down algorithm, TDM, to discover sequential patterns from large databases. The sequential patterns being mined are called hybrid sequential patterns, which display not only the path but also the relationships among itemsets. Most existing mining algorithms adopt the bottom-up paradigm, which unfortunately requires scanning the database multiple times in order to verify candidate patterns. To reduce the cost of these repeated database scans, we propose a new mining algorithm that finds all frequent hybrid patterns by examining and decomposing the patterns in transactions in a top-down fashion. Moreover, it processes only the frequent items in transactions and thus reduces the complexity of transaction decomposition. The distinctive features of the new algorithm are that it needs to scan the database only twice and that it uses a different counting method. Similar to a hierarchical architecture, it counts a newly defined pattern instead of counting a group of patterns, which speeds up the mining process. Experimental results show that our algorithm outperforms GFP2 when mining databases with a finite number of items. Moreover, TDM also outperforms GFP2 as the size of the database increases. Given this efficiency, the presented algorithm can be used in different applications that need high performance.
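
The abstract does not give the details of the algorithm, so the following is only a minimal Python sketch of the two database scans it mentions, not the thesis's actual method. It assumes that a transaction is a list of itemsets and that support is counted once per transaction; the names `frequent_items`, `prune_sequences`, and `min_support` are hypothetical. The first scan finds the frequent items; the second scan keeps only those items in each transaction, which is the input that a top-down decomposition step would then work on.

```python
from collections import Counter


def frequent_items(sequences, min_support):
    """Scan 1 (assumed): count how many sequences contain each item."""
    counts = Counter()
    for seq in sequences:
        seen = set()
        for itemset in seq:
            seen.update(itemset)
        # Each item contributes at most once per sequence.
        counts.update(seen)
    return {item for item, c in counts.items() if c >= min_support}


def prune_sequences(sequences, frequent):
    """Scan 2 (assumed): drop infrequent items and itemsets left empty."""
    pruned = []
    for seq in sequences:
        kept = [tuple(i for i in itemset if i in frequent) for itemset in seq]
        kept = [s for s in kept if s]
        if kept:
            pruned.append(kept)
    return pruned


# Toy database: three sequences, each a list of itemsets.
db = [
    [("a", "b"), ("c",)],
    [("a",), ("c", "d")],
    [("b",), ("e",)],
]
freq = frequent_items(db, min_support=2)
pruned = prune_sequences(db, freq)
```

Pruning before decomposition keeps the number of sub-patterns derived from each transaction small, which is consistent with the abstract's remark that processing only frequent items reduces the complexity of transaction decomposition.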

URI: http://hdl.handle.net/11455/19264
Appears in Collections: 資訊科學與工程學系所 (Department of Computer Science and Engineering)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.