Please use this identifier to cite or link to this item: http://hdl.handle.net/11455/19781
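Title: Mining Changes in Sequential Patterns of Data Streams (資料串流上序列樣式變化探勘之研究)
Author: Li, I-Hui (李怡慧)
Keywords: Data streams
Sequential pattern mining
Change patterns
Prediction
Change handling
Publisher: Department of Computer Science and Engineering (資訊科學與工程學系所)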
Abstract: Many techniques have been proposed for mining sequential patterns in data streams. However, most of them either ignore how these patterns change over time or rely on a simple static decay function to give more weight to recent data. Change is nonetheless a core issue in data streams: decision-makers need to identify and predict changes in sequential patterns so that they can respond to emerging trends in a timely and appropriate manner. Accordingly, this study proposes a new adaptive model for mining changes in the sequential patterns of data streams. In the proposed approach, the current mining results for the sequential patterns are merged with the previous mining results, and the significant change patterns and their corresponding degree of change are identified. The degree of change between the current sequential patterns and those of the next mining round is then predicted, and the decay rate is modified accordingly. In addition, the degree of support change of a sequential pattern is defined, and a predictor is proposed that predicts changes of pattern type from this degree of support change. To the best of the authors' knowledge, the proposed model is the first reported attempt to predict changes in the sequential patterns of real-world data streams and to adapt the sensitivity of the mining model accordingly. The experimental results confirm that the model detects the significant change patterns within data streams and automatically tunes the decay rate according to the degree of change predicted for the following mining round and the present state of the data streams. Its pattern type prediction performance is also better than that of two linear regression-based models. As a result, the proposed model gives decision-makers an effective means of detecting emerging trends in real-world applications such as wireless sensor networks (WSNs) and web logs, so that appropriate actions can be formulated in response.
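The loop described in the abstract (merge the current mining results with the previous ones, measure the degree of change, then adjust the decay rate) can be illustrated with a minimal Python sketch. The abstract does not give the thesis's actual formulas, so everything below is an assumption made for illustration: the function names, the mean absolute support difference used as the degree of change, the linear mapping from predicted change to decay factor, and the bounds `min_decay` and `max_decay` are all hypothetical.

```python
def merge_supports(previous, current, decay):
    """Blend previous-round supports, scaled by `decay`, with current-round supports.

    Hypothetical merge rule: a smaller `decay` forgets old results faster.
    """
    patterns = set(previous) | set(current)
    return {p: decay * previous.get(p, 0.0) + current.get(p, 0.0) for p in patterns}


def change_degree(previous, current):
    """Assumed change measure: mean absolute support difference over all
    patterns that appear in either mining round."""
    patterns = set(previous) | set(current)
    if not patterns:
        return 0.0
    total = sum(abs(current.get(p, 0.0) - previous.get(p, 0.0)) for p in patterns)
    return total / len(patterns)


def adapt_decay(predicted_change, min_decay=0.1, max_decay=0.9):
    """Assumed linear rule: the larger the predicted change, the smaller the
    decay factor, so the model becomes more sensitive to recent data."""
    clipped = min(max(predicted_change, 0.0), 1.0)
    return max_decay - clipped * (max_decay - min_decay)


# Sequential patterns are represented as tuples of items, mapped to supports.
previous_round = {("a", "b"): 0.6, ("b", "c"): 0.4}
current_round = {("a", "b"): 0.2, ("c", "d"): 0.5}

observed = change_degree(previous_round, current_round)
# Naive stand-in for a predictor: assume the next round changes as much as this one.
decay = adapt_decay(observed)
merged = merge_supports(previous_round, current_round, decay)
```

Here the last observed change stands in for the predicted change of the next round; the thesis proposes its own predictor, which this sketch does not attempt to reproduce.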
URI: http://hdl.handle.net/11455/19781
Appears in Collections: Department of Computer Science and Engineering (資訊科學與工程學系所)

Files in This Item:

For the full text, please visit Airiti Library (華藝線上圖書館).



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.