Title: The Study on Recognizing Learning Emotion Based on Convolutional Neural Networks and Transfer Learning
Author: Nian-Xiang Lai (賴念翔)
Keywords: Learning Emotion; Facial Expression Recognition; Convolutional Neural Network; Transfer Learning; Model Generalization



In classroom teaching, teachers who want to gauge how well learners are learning typically collect and analyze data from quizzes or questionnaires, which cannot provide real-time feedback. Learners' facial emotions are highly correlated with learning motivation and effectiveness. A system that recognizes learners' facial emotions can therefore help learners understand their own learning state and help teachers provide timely assistance and improve their classroom teaching.

Research indicates that convolutional neural networks (CNNs) perform well at recognizing basic emotions from faces. Unlike traditional machine learning, CNNs do not require hand-designed features; they automatically learn the necessary features from the entire image. This thesis improves FaceLiveNet, a CNN architecture that achieves high accuracy in basic emotion recognition with few parameters, and proposes the Dense_FaceLiveNet architecture. Dense_FaceLiveNet is then used for two phases of transfer learning. First, a basic emotion recognition model built on the relatively simple JAFFE and KDEF datasets is transferred to the FER2013 basic emotion dataset, where it obtains an accuracy of 70.02%. Second, the FER2013 basic emotion recognition model is transferred to build a learning emotion recognition model, whose test accuracy reaches 91.93%, reported as 12.03% higher than the 79.03% accuracy obtained without transfer learning. This shows that transfer learning can effectively improve the recognition accuracy of the learning emotion recognition model.
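The two-phase transfer described above can be sketched in a framework-agnostic way. The snippet below is only a toy illustration, not the Dense_FaceLiveNet implementation: the model is a plain dictionary of weight matrices, the helper names (`init_layer`, `build_model`, `transfer`) and layer sizes are hypothetical, the seven-class label set for the basic emotion datasets is an assumption, and the number of learning emotion classes (4) is a placeholder because the abstract does not state it. Training steps are left as comments.

```python
import copy
import random


def init_layer(n_in, n_out, rng):
    """Return an n_in x n_out weight matrix with small random values."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_out)] for _ in range(n_in)]


def build_model(n_features, n_hidden, n_classes, seed=0):
    """Toy model: a reusable feature extractor plus a task-specific head."""
    rng = random.Random(seed)
    return {
        "features": init_layer(n_features, n_hidden, rng),  # reused across tasks
        "head": init_layer(n_hidden, n_classes, rng),       # replaced per task
    }


def transfer(source_model, n_classes, seed=1):
    """Copy the source feature extractor and attach a freshly initialized head."""
    rng = random.Random(seed)
    n_hidden = len(source_model["head"])  # head has one row per hidden unit
    return {
        "features": copy.deepcopy(source_model["features"]),
        "head": init_layer(n_hidden, n_classes, rng),
    }


# Starting point: model for the small posed datasets (JAFFE and KDEF).
small_model = build_model(n_features=16, n_hidden=8, n_classes=7)
# ... train small_model on JAFFE + KDEF here ...

# Phase 1: transfer to FER2013 (assumed here to use seven basic emotion classes).
fer_model = transfer(small_model, n_classes=7)
# ... fine-tune fer_model on FER2013 here ...

# Phase 2: transfer to the learning emotion task (4 classes is a placeholder).
learning_model = transfer(fer_model, n_classes=4)
# ... fine-tune learning_model on the learning emotion database here ...
```

In a real deep learning framework the same idea usually means loading the pretrained weights, replacing the final classification layer, and fine-tuning on the new dataset.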

In addition, to test the generalization ability of the learning emotion recognition model, videos of students at a national university in Taiwan, recorded during class, were used as test data. The original learning emotion database did not account for exceptional cases such as covered eyebrows, closed eyes, and a hand holding the chin. To address this, images of these cases were added to the learning emotion database and the model was rebuilt, giving a recognition accuracy of 91.42%. A comparison of the output maps shows that the model did learn features such as the eyebrows, chin, and closed eyes in these images. Furthermore, after all the students' image data were combined with the original learning emotion database, the rebuilt model reached an accuracy of 84.59%. The results show that, through transfer learning, the learning emotion recognition model can achieve high recognition accuracy on previously unseen images.
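One way to surface the kind of exceptional cases mentioned above is to break test accuracy down by image condition. The sketch below assumes each test prediction is tagged with a condition such as `eyes_closed` or `hand_on_chin`; the function name `accuracy_by_group`, the condition tags, the emotion labels, and the records themselves are made-up placeholders for illustration, not data or results from the thesis.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return {group: accuracy} for (group, true_label, predicted_label) records."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, predicted in records:
        total[group] += 1
        if truth == predicted:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Made-up placeholder predictions, NOT results from the thesis.
records = [
    ("frontal", "confused", "confused"),
    ("frontal", "engaged", "engaged"),
    ("eyes_closed", "bored", "engaged"),
    ("eyes_closed", "bored", "bored"),
    ("hand_on_chin", "confused", "engaged"),
]

per_group = accuracy_by_group(records)
for group, acc in sorted(per_group.items()):
    print(f"{group}: {acc:.2f}")
```

Conditions with low accuracy are candidates for adding more training images before the model is rebuilt.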
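Rights: Authorization to browse or print the electronic full text has not been granted
Appears in Collections: Department of Information Management (資訊管理學系)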
