Title: A Model of Anomaly Detection Based on Multiple Level Hybrid Classifier (多階層混合式分類器之異常偵測模型)
Author: Chiu, Keng-Yi (邱耿義)
Keywords: anomaly detection
data mining
Random Forest
Radial Basis Function Network
Publisher: Department of Computer Science and Engineering (資訊科學與工程學系所)
References:
[1] 梁定澎, Decision Support Systems and Business Intelligence (決策支援系統與企業智慧), 智勝文化, 2006.
[2] 尹相志, SQL Server 2005 Data Mining and the Office 2007 Data Mining Add-ins (SQL Server 2005 Data Mining資料採礦與Office 2007資料採礦增益集), 悦知文化, 2007.
[3] Pang-Ning Tan, Michael Steinbach and Vipin Kumar, Introduction to Data Mining, Pearson Education Publishing, 2006.
[4] L. Breiman, “Random Forests,” Machine Learning, 45(1), 2001, pp. 5-32.
[5] Yoav Freund and Robert E. Schapire, “Experiments with a New Boosting Algorithm,” Proceedings of the Thirteenth International Conference on Machine Learning, 1996.
[6] Thomas G. Dietterich, “Ensemble methods in machine learning,” Lecture Notes in Computer Science 1857, 2000, pp. 1-15.
[7] J. Zhang and M. Zulkernine, “Network Intrusion Detection Using Random Forest,” Proceedings of the Third Annual Conference on Privacy, Security and Trust (PST), 2005, pp. 53-61.
[8] L. Breiman, “Bagging Predictors,” Machine Learning, 24(2), 1996, pp. 123-140.
[9] Yoav Freund and Robert E. Schapire, “A decision-theoretic generalization of on-line learning and an application to boosting,” Journal of Computer and System Sciences, 55(1), 1997, pp. 119-139.
[10] L. Prema Rajeswari and A. Kannan, “An Intrusion Detection System Based on Multiple Level Hybrid Classifier using Enhanced C4.5,” Proceedings of IEEE International Conference on Signal, Communications and Networking, Madras Institute of Technology, 2008, pp. 75-79.
[11] Jungsuk Song, Hiroki Takakura and Yasuo Okabe, “A proposal of new benchmark data to evaluate mining algorithms for intrusion detection,” Proceedings of 23rd Asia Pacific Advanced Networking Meeting, 2007.
[12] M. Sabhnani and G. Serpen, “Why Machine Learning Algorithms Fail in Misuse Detection on KDD Intrusion Detection Data Set,” Journal of Intelligent Data Analysis, 2004.
[13] Chi-hoon Lee, Sung-woo Shin and Jin-wook Chung, “Network Intrusion Detection Through Genetic Feature Selection,” ISNPD2006 (IEEE), 2006, pp. 109-114.
[14] Amine Bsila, Sylvain Gombault and Abdelfateh Belghith, “Improving traffic transformation function to detect novel attacks,” Proceedings of 4th International Conference on Sciences of Electronics, Technologies of Information and Telecommunications, 2006.
[15] H. Gunes Kayacik, A. Nur Zincir-Heywood and Malcolm I. Heywood, “Selecting Features for Intrusion Detection: A Feature Relevance Analysis on KDD’99 Intrusion Detection Datasets,” Proceedings of the Third Annual Conference on Privacy, Security and Trust, 2005.
[16] V. Paxson, “Bro: A System for Detecting Network Intruders in Real-Time,” Proceedings of 7th conference on USENIX Security Symposium, 1998.
[17] Sophos Security Threat Report 2009.
[18] KDD-CUP-99 Task Description.
[19] Neural network, Wikipedia.
[20] Random Forest classification description.
Abstract: With the arrival of the broadband era and the rapid growth of network applications, hacker attacks occur one after another. Intruders keep refining their techniques and use a variety of automated attack tools to launch fast, diverse attacks in order to break in, obtain privileges, or deny a server's ability to provide service, and this has become a major threat to network security. An intrusion detection system is a line of defense against this threat: it helps administrators discover anomalous behavior and respond quickly so that damage is minimized.
This thesis uses data mining techniques to propose an anomaly detection model based on a multiple level hybrid classifier. Every classification algorithm has its own strengths and weaknesses, so no single algorithm gives the best prediction results on every type of data. In addition, when a classification model is built, the choice of training samples and connection features directly affects prediction accuracy. Our model uses two classification techniques, Random Forest and Radial Basis Function Network, and takes the KDD'99 dataset as the source of training and test data. This dataset comes from connection data in a network environment simulated by MIT Lincoln Laboratory; after transformation of the raw data, connection records with 41 features each are extracted.
The model has a three-level architecture. The classifier at each level is trained on different training samples and features and is responsible for detecting different types of connections. Based on the detection performance of each level's classifier, two anomaly detection models with different combinations are produced. Experimental results show overall detection rates of 93.02% and 93.11% for the two models, higher than the 92.71% of the KDD'99 Cup winner, and average costs per test example of 0.2285 and 0.2254, lower than the winner's 0.2331. In addition, the Snmpguess attack in the test dataset is misclassified as a normal connection because the feature transformation hides its characteristics. We analyzed the connection behavior of this attack and re-extracted the connection features; after this correction, the detection rate for this attack type rises from 0.08% to 100%.
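The two figures reported in the abstract, overall detection rate and average cost per test example, can both be derived from a confusion matrix. The sketch below is only an illustration and is not taken from the thesis. It assumes that "overall detection rate" means the fraction of test records assigned to their correct class, that the five classes are ordered normal, probe, DoS, U2R, R2L, and that the cost matrix matches the one in the KDD-CUP-99 task description [18] (this should be checked against the official text). The function names and the confusion-matrix counts are made up for the example.

```python
# Class order assumed for both matrices (rows = actual, columns = predicted).
CLASSES = ["normal", "probe", "dos", "u2r", "r2l"]

# Assumed cost matrix: believed to follow the KDD-CUP-99 task description,
# but verify it against the official document before relying on it.
COST = [
    [0, 1, 2, 2, 2],  # actual normal
    [1, 0, 2, 2, 2],  # actual probe
    [2, 1, 0, 2, 2],  # actual dos
    [3, 2, 2, 0, 2],  # actual u2r
    [4, 2, 2, 2, 0],  # actual r2l
]

# Hypothetical confusion matrix; the counts are invented for illustration
# and are NOT results from the thesis.
EXAMPLE = [
    [90, 5, 5, 0, 0],
    [10, 80, 10, 0, 0],
    [0, 0, 100, 0, 0],
    [5, 0, 0, 5, 0],
    [30, 0, 0, 0, 10],
]


def overall_detection_rate(confusion):
    """Fraction of records on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def average_cost(confusion, cost):
    """Sum of count * cost over every cell, divided by the record count."""
    total = sum(sum(row) for row in confusion)
    weighted = sum(
        confusion[i][j] * cost[i][j]
        for i in range(len(confusion))
        for j in range(len(confusion[i]))
    )
    return weighted / total


if __name__ == "__main__":
    print("overall detection rate:", overall_detection_rate(EXAMPLE))
    print("average cost per example:", average_cost(EXAMPLE, COST))
```

Under this cost matrix, labeling an R2L or U2R attack as normal is the most expensive kind of error, which is why a model can have a lower average cost than another even when their overall detection rates are close.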
Other identifiers: U0005-1608200923033700
Appears in Collections: Department of Computer Science and Engineering (資訊科學與工程學系所)



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.