Title: 基於對稱距離的分群演算法之研究
A Study of Clustering Algorithm Based on Central Symmetry
Author: 黃朱玄
Huang, Chu-Hsuan
Keywords: K-means; clustering; symmetry distance; Euclidean distance
Publisher: Department of Computer Science
References:
[1] F. Attneave, “Symmetry, Information, and Memory for Patterns”, The American Journal of Psychology, Vol. 68, No. 2, pp. 209-222, 1955.
[2] J. Yilin, H. Peng, J.M. Xie and Q.L. Zheng, “Novel Clustering Algorithm Based on Central Symmetry”, Proceedings of the Third International Conference on Machine Learning and Cybernetics, Shanghai, Vol. 3, pp. 1329-1334, 26-29 August 2004.
[3] M.C. Su and C.H. Chou, “A Modified Version of the K-means Algorithm with a Distance Based on Cluster Symmetry”, IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 23, No. 6, pp. 674-680, June 2001.
[4] R.O. Duda, P.E. Hart and D.G. Stork, “Pattern Classification (2nd ed.)”, Wiley Interscience, 2001.
[5] D. Reisfeld, H. Wolfson and Y. Yeshurun, “Context-Free Attentional Operators: The Generalized Symmetry Transform”, International Journal of Computer Vision, Vol. 14, No. 2, pp. 119-130, 1995.
[6] J. MacQueen, “Some Methods for Classification and Analysis of Multivariate Observations”, Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, California, pp. 281-297, 1967.
[7] K. Kanatani, “Comments on ‘Symmetry as a Continuous Feature’”, IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 19, No. 3, pp. 246-247, Mar. 1997.
K-means is a very popular clustering algorithm that is widely used in engineering and scientific disciplines such as image segmentation, pattern classification, and data mining. The conventional K-means algorithm measures the distance between each data point and a cluster center with the Euclidean distance. The symmetry-distance-based K-means (SBKM) algorithm was proposed to cluster data sets whose distribution has a geometrically symmetric structure; it replaces the Euclidean distance with the point symmetry distance as the similarity measure. However, SBKM misclassifies data under some distributions: experiments show that errors occur when there is a high degree of symmetry between clusters. In this thesis, we propose a modified version of the SBKM algorithm that addresses this problem by using a hybrid measure, combining the symmetry distance and the Euclidean distance, as the clustering criterion. Several examples demonstrate the robustness and effectiveness of the proposed algorithm across a variety of data distributions.
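
To make the two distance measures in the abstract concrete, the following is a minimal sketch in plain Python. The point symmetry distance follows the definition commonly attributed to Su and Chou [3]: for a point x and a center c, it is the smallest value of ||(x - c) + (p - c)|| / (||x - c|| + ||p - c||) over all other data points p. The abstract does not state how the thesis combines the two distances, so `hybrid_distance` below is an assumption made purely for illustration (a simple product); the function names are hypothetical as well.

```python
import math


def norm(v):
    """Euclidean length of a vector given as a sequence of floats."""
    return math.sqrt(sum(a * a for a in v))


def point_symmetry_distance(i, center, points):
    """Point symmetry distance of points[i] with respect to center.

    For every other point p, compare p with the mirror image of
    points[i] through the center and keep the smallest normalised
    mismatch. A value of 0 means an exact mirror partner exists.
    """
    dx = [a - c for a, c in zip(points[i], center)]
    best = math.inf
    for j, p in enumerate(points):
        if j == i:
            continue  # a point may not act as its own mirror partner
        dp = [a - c for a, c in zip(p, center)]
        denom = norm(dx) + norm(dp)
        if denom == 0.0:
            continue  # both points sit on the center; ratio undefined
        best = min(best, norm([a + b for a, b in zip(dx, dp)]) / denom)
    return best


def hybrid_distance(i, center, points):
    """ASSUMPTION: symmetry distance times Euclidean distance.

    This combination is illustrative only; it is not taken from the thesis.
    """
    d_e = norm([a - c for a, c in zip(points[i], center)])
    return point_symmetry_distance(i, center, points) * d_e


if __name__ == "__main__":
    # Four points placed symmetrically around the origin.
    pts = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
    print(point_symmetry_distance(0, (0.0, 0.0), pts))  # 0.0
    print(point_symmetry_distance(0, (0.5, 0.0), pts))  # 0.5
    print(hybrid_distance(0, (0.5, 0.0), pts))          # 0.25
```

In a K-means style loop, each point would be assigned to the center with the smallest such distance. Under the product assumption used here, a center that is far from a point is penalised even when the point happens to have a mirror partner around it, which is the kind of situation the abstract describes as causing SBKM to misclassify.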
Other Identifiers: U0005-2405200603295300
Appears in Collections: Department of Computer Science and Engineering


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.