Please use this identifier to cite or link to this item: http://hdl.handle.net/11455/7783
Title: 以最鄰近相似度為基礎的穩健分群方法
A Nearest Neighbor Similarity based Method for Robust Clustering
Author: 卓俊佑
Cho, Chun-You
Keywords: cluster analysis
nearest neighbor similarities
clustering
Publisher: Department of Electrical Engineering
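Abstract: This thesis first identifies the problems that cluster analysis commonly faces; the ability to solve these problems also serves as a performance index for evaluating clustering methods. The thesis begins by reviewing the theoretical foundations of various conventional clustering methods and examining their implications, strengths, and weaknesses, which forms the background knowledge for developing a clustering method. It then introduces a similarity-based robust clustering method and analyzes its advantages and drawbacks, which provides the basic framework for this thesis. Using that background knowledge, the factors that determine the success or failure of this method are identified and then improved or removed, yielding a robust clustering method called the nearest neighbor similarity clustering method. Finally, tests on different types of data, with a comparison of the two methods' results, verify that the clustering method is substantially improved and overcomes the problems that conventional clustering methods commonly face.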
This research first discusses several problems that often arise in conventional cluster analysis. The ability to solve these problems can serve as a performance index for clustering methods. The basic theories of several conventional clustering methods are then reviewed, together with a survey of their advantages and disadvantages. Based on these discussions, we introduce a nearest neighbor similarity-based robust clustering method (NN-SCM) that works by discovering the fundamental structure of the data set. Finally, several different data sets are used to compare the performance of the proposed approach with that of the traditional similarity-based clustering method. Simulation results indicate that the proposed approach achieves better clustering performance with less computation time while avoiding most of the conventional clustering problems.
URI: http://hdl.handle.net/11455/7783
Other Identifiers: U0005-2808200713134300
Article Link: http://www.airitilibrary.com/Publication/alDetailedMesh1?DocID=U0005-2808200713134300
Appears in Collections: Department of Electrical Engineering

Files in This Item:

For the full text, please visit Airiti Library (華藝線上圖書館).



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.