Efficient GRU-based Facial Expression Recognition with Adaptive Loss Selection

  • Sri Winarno Universitas Dian Nuswantoro
  • Farrikh Alzami Universitas Dian Nuswantoro
  • Dewi Agustini Santoso Universitas Dian Nuswantoro
  • Muhammad Naufal Universitas Dian Nuswantoro
  • Harun Al Azies Universitas Dian Nuswantoro
  • Rivaldo Mersis Brilianto School of Mechanical Engineering, Pusan National University, Busan, Republic of Korea
  • Kalaiarasi A/P Sonai Muthu Faculty of Information Science and Technology, MNA-R1003, Multimedia University, Jalan Ayer Keroh Lama, 75450, Bukit Beruang, Melaka, Malaysia
Keywords: facial expression recognition, computational efficiency, recurrent neural networks, GRU, LSTM, adaptive loss selection, one-vs-all classification, MediaPipe, statistical equivalence testing

Abstract

As real-world deployment of facial expression recognition systems becomes increasingly prevalent, computational efficiency emerges as a critical consideration alongside recognition accuracy. Current research places pronounced emphasis on accuracy maximization through sophisticated convolutional architectures, yet systematic evaluation of efficiency-performance trade-offs remains insufficient for practical deployment scenarios. This paper addresses this gap through a comprehensive analysis of recurrent neural network architectures for facial expression recognition, specifically comparing Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM) implementations within a novel one-vs-all classification framework incorporating adaptive loss function selection. A rigorous 2 × 2 × 2 factorial experimental design systematically evaluates architecture (GRU vs LSTM), optimization strategy (Bayesian vs predefined), and loss function complexity (standard vs advanced with auto-selection) across six basic emotions using the CK+ dataset with MediaPipe-based facial landmark features. The investigation reveals that GRU architectures achieve statistical performance equivalence with LSTM while demonstrating a 25% computational efficiency advantage (relative complexity 0.75 vs 1.0). The proposed adaptive loss selection mechanism automatically selects focal loss for severe class imbalance (ratio > 11.5), weighted binary cross-entropy for moderate imbalance (ratio 3.5-11.5), and standard binary cross-entropy otherwise. System performance achieves 92.7% ± 5.0% overall accuracy, with per-emotion F1-scores exhibiting substantial variability from 0.215 (fear) to 0.967 (surprise). Comprehensive statistical analysis incorporating power analysis and practical equivalence testing demonstrates optimization strategy equivalence across 25% of evaluated metrics, while architectural comparisons reveal non-equivalence despite similar performance levels.
The study acknowledges significant limitations including critically small sample size (n=6 per condition), single dataset validation, and theoretical rather than empirical efficiency validation. These findings provide evidence-based guidelines for architecture selection in resource-constrained facial expression recognition applications, with the adaptive loss selection framework representing a significant methodological contribution for addressing class imbalance challenges in emotion recognition systems.
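The adaptive loss selection rule summarized in the abstract can be sketched as a simple threshold mapping. The cutoff values (3.5 and 11.5) are those stated above; the function names, the majority-to-minority ratio computation, and the boundary handling at exactly 3.5 and 11.5 are illustrative assumptions, not the paper's exact implementation.

```python
# Sketch of the adaptive loss selection rule from the abstract.
# Thresholds (3.5, 11.5) come from the paper; helper names and the
# inclusive/exclusive treatment of the boundaries are assumptions.

def imbalance_ratio(n_negative: int, n_positive: int) -> float:
    """Majority-to-minority class ratio for a one-vs-all binary split."""
    return max(n_negative, n_positive) / min(n_negative, n_positive)

def select_loss(ratio: float) -> str:
    """Map a class-imbalance ratio to a loss function per the stated rule."""
    if ratio > 11.5:
        return "focal"          # severe imbalance
    elif ratio >= 3.5:
        return "weighted_bce"   # moderate imbalance
    else:
        return "bce"            # near-balanced

# Example: a one-vs-all split with 900 negatives and 60 positives
print(select_loss(imbalance_ratio(900, 60)))  # ratio 15.0 -> "focal"
```

In a one-vs-all setup over six emotions, each binary classifier sees a different imbalance ratio, so each can end up with a different loss under this rule.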
Published
2025-11-18
How to Cite
Winarno, S., Alzami, F., Santoso, D. A., Naufal, M., Azies, H. A., Brilianto, R. M., & Muthu, K. A. S. (2025). Efficient GRU-based Facial Expression Recognition with Adaptive Loss Selection. Statistics, Optimization & Information Computing, 14(6), 3468-3499. https://doi.org/10.19139/soic-2310-5070-3043
Section
Research Articles