Emotion Recognition using AutoEncoders and Convolutional Neural Networks
Abstract
Emotions reflect people's reactions to stimuli, and facial expression analysis is a common way to identify the emotion being expressed. Machine learning and artificial intelligence techniques have been developed to detect expressions in multimedia content such as images and videos; among the most advanced of these are Deep Learning algorithms. The aim of this paper is to analyze the performance of a Convolutional Neural Network that uses AutoEncoder units for emotion recognition in human faces. Combining the two Deep Learning techniques boosts the performance of the classification system. 8,000 facial expressions from the Radboud Faces Database were used for both training and testing. The results showed that five of the eight analyzed emotions achieved accuracy rates above 90%.
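The idea of pairing an autoencoder with a classifier, as the abstract describes, can be illustrated with a minimal sketch: an autoencoder is first trained unsupervised to reconstruct its input, and its learned encoder then serves as a feature extractor feeding a classifier over the eight emotion classes. The sketch below uses a tiny dense (not convolutional) autoencoder in pure NumPy; all layer sizes and the toy data are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy "face" vectors: 64-dimensional patches stand in for full images.
X = rng.random((32, 64))

# Autoencoder weights: 64 -> 16 (encoder), 16 -> 64 (decoder).
W_enc = rng.normal(0.0, 0.1, (64, 16))
W_dec = rng.normal(0.0, 0.1, (16, 64))

def reconstruction_mse():
    return float(np.mean((sigmoid(sigmoid(X @ W_enc) @ W_dec) - X) ** 2))

mse_before = reconstruction_mse()

# Unsupervised pretraining: full-batch gradient descent on the
# squared reconstruction error.
lr = 0.5
for _ in range(200):
    H = sigmoid(X @ W_enc)        # encode
    X_hat = sigmoid(H @ W_dec)    # decode (reconstruct the input)
    err = X_hat - X
    # Backpropagate the error through decoder and encoder.
    d_dec = err * X_hat * (1.0 - X_hat)
    d_enc = (d_dec @ W_dec.T) * H * (1.0 - H)
    W_dec -= lr * (H.T @ d_dec) / len(X)
    W_enc -= lr * (X.T @ d_enc) / len(X)

mse_after = reconstruction_mse()
print(f"reconstruction MSE: {mse_before:.4f} -> {mse_after:.4f}")

# The 16-dimensional codes would then feed a supervised softmax
# layer over the 8 emotion classes; that stage is omitted here.
codes = sigmoid(X @ W_enc)
print("code shape:", codes.shape)
```

In the paper's setting the dense encoder would be replaced by convolutional autoencoder units, whose weight sharing suits image inputs, but the two-stage recipe (unsupervised reconstruction, then supervised classification on the codes) is the same.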