A Sequence Data Classification Based on Sequential Pattern Length
Classification assigns data to predefined categories. As information technology has developed, demand for sequence data classification has grown, and many applications involve making predictions from sequence data. A sequence is an ordered list of elements, and traditional classification methods are not well suited to such data. This thesis therefore proposes a sequence data classifier model based on the length of sequential patterns. In addition to integrating sequential pattern mining with classification techniques, the study proposes a classification rule selection mechanism that predicts the class label of a sequence from pattern scores. Experimental results show that the proposed classifier performs well on both synthetic and real sequence data.
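The idea of predicting a class label from pattern scores can be illustrated with a minimal sketch. The abstract does not define the scoring function, so the version below makes an assumption: each class-labelled sequential pattern that occurs in the input (in order, gaps allowed) contributes its length to that class's score, and the highest-scoring class wins. The function names, the rule format, and the example rules are hypothetical and are not taken from the thesis.

```python
from collections import defaultdict


def is_subsequence(pattern, sequence):
    """Return True if the items of pattern occur in sequence in the same order (gaps allowed)."""
    it = iter(sequence)
    # `item in it` advances the iterator, so later items must appear after earlier ones.
    return all(item in it for item in pattern)


def classify(sequence, rules, default=None):
    """Predict a class label for sequence.

    rules is a list of (pattern, class_label) pairs.
    Assumed scoring (not from the thesis): every matching pattern adds
    its length to the score of its class, so longer patterns weigh more.
    """
    scores = defaultdict(int)
    for pattern, label in rules:
        if is_subsequence(pattern, sequence):
            scores[label] += len(pattern)
    if not scores:
        return default  # no rule matched
    return max(scores, key=scores.get)


# Hypothetical rules, as might be produced by a sequential pattern miner.
rules = [
    (("a", "b"), "X"),
    (("a", "b", "c"), "X"),
    (("b", "d"), "Y"),
    (("d",), "Y"),
]

print(classify(list("abcd"), rules))  # X scores 2 + 3, Y scores 2 + 1
print(classify(list("bd"), rules))    # only the Y patterns match
```

In this sketch a long matching pattern outweighs several short ones, which is one plausible reading of a classifier "based on sequential pattern length"; the thesis itself may weight or select rules differently.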
|Appears in Collections:||Department of Computer Science and Engineering (資訊科學與工程學系所)|