Please use this identifier to cite or link to this item: http://hdl.handle.net/11455/96884
Title: A Study of Detecting Bot Users in Social Networks
Author: Chian-He Du (杜謙和)
Keywords: Deep Learning
Restricted Boltzmann Machine
Machine Learning
Social Network
Abstract: In recent years, the rapid growth and widespread use of social networks have made our lives increasingly inseparable from them, and issues surrounding bot accounts in social media have drawn growing attention from the research community. Most existing methods extract a large number of account features from raw data. However, identifying bot users from account features usually requires a very large amount of labeled data before a basis for classification can be found, which makes data collection very time-consuming. The aim of this study is to propose a more effective classification method for detecting bot users in social networks. Its contribution lies in presenting a variety of behavioral features based on timestamp data and in showing, through statistical methods and experimental comparison, that these behavioral features are effective for classification. This study also proposes a deep learning method, the Hierarchical-Restricted Boltzmann Machine (H-RBM), to learn the features of bot accounts more effectively. In experiments on larger datasets, the F-Measure of the proposed approach is about 2 to 3 percentage points higher than that of traditional classifiers. We believe the proposed approach can help social media applications distinguish genuine accounts from bot accounts more accurately.
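The abstract does not list the concrete behavioral features or the evaluation code, so the following is only a minimal sketch of the two ideas it names: deriving features from timestamp data and scoring a classifier with the F-Measure. The function names, the choice of inter-post gap statistics, and the input format (Unix timestamps in seconds) are assumptions made for illustration, not the thesis's actual feature set.

```python
from statistics import mean, pstdev


def interval_features(timestamps):
    """Summarise the gaps (in seconds) between consecutive posts of one account.

    Hypothetical features: the thesis may use different ones.
    """
    ts = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ts, ts[1:])]
    if not gaps:
        # Fewer than two posts: no gap statistics can be computed.
        return {"mean_gap": 0.0, "std_gap": 0.0, "cv_gap": 0.0}
    m = mean(gaps)
    s = pstdev(gaps)
    # Coefficient of variation: very regular posting gives a value near 0.
    return {"mean_gap": m, "std_gap": s, "cv_gap": s / m if m > 0 else 0.0}


def f_measure(tp, fp, fn):
    """Balanced F-Measure (F1): harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

In this sketch, an account that posts at perfectly regular intervals yields a gap standard deviation of zero, which is the kind of timing regularity a classifier could pick up on. An improvement of "2 to 3 percentage points" means, for example, an F-Measure moving from 0.80 to roughly 0.82 or 0.83.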
URI: http://hdl.handle.net/11455/96884
Publicly available on: 2020-08-14
Appears in Collections: Department of Computer Science and Engineering

Full text: available through the Airiti Library (華藝線上圖書館).


