Local and Global Fusion Attention Network for Facial Emotions Recognition

Tags
Deep Learning
Fusion Method
Published
May 2, 2023
Author
Minh-Hai Tran, Tram-Tran Nguyen-Quynh, Nhu-Tai Do, Soo-Hyung Kim
[Paper] [Slide]
The ideas in this paper were shared with another person and have since been published at the 2023 ICIP conference and the ASK conference, but I am not listed as an author, even though Proposed Method 1 and Proposed Method 2 were my ideas. So, if you read a paper or a thesis with similar ideas, please note that I am not the one who copied them :)

Abstract:

Facial emotion recognition has recently attracted much attention, and deep learning methods and attention mechanisms have been incorporated to improve it. Fusion approaches have improved accuracy by combining various types of information. This research proposes a fusion network with self-attention and local attention mechanisms, built on a multi-layer perceptron network. The network extracts discriminative features from facial images using models pre-trained on the RAF-DB dataset. Our method outperforms the other fusion methods on RAF-DB with impressive results.

Proposed Method

Summary: We employ a fusion method to combine the final features of two emotion recognition models that have already been trained. The goal of this combination is to minimize the weaknesses of each model while maximizing their strengths. To generate a new feature of the same size as the inputs, we first concatenate the two features and pass them through a multi-layer perceptron (MLP). We average the features before passing them through the self-attention block and the local attention block, and finally through a fully connected network to classify the emotions. A hedged PyTorch sketch of this pipeline appears after the figure below.
Local and Global Attention Network
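Below is a minimal PyTorch sketch of the pipeline described above. It is an illustration under stated assumptions, not the authors' exact implementation: the feature size, the token/window split, the reading of the "averaging" step, and all module names (`FusionAttentionNet`, `global_attn`, `local_attn`) are choices made here for concreteness.

```python
import torch
import torch.nn as nn

class FusionAttentionNet(nn.Module):
    """Sketch: fuse two backbone features, then apply global and local attention."""

    def __init__(self, feat_dim=512, num_tokens=8, window=2, num_heads=4, num_classes=7):
        super().__init__()
        assert feat_dim % num_tokens == 0 and num_tokens % window == 0
        self.num_tokens, self.window = num_tokens, window
        d = feat_dim // num_tokens  # per-token dimension
        # MLP that maps the concatenated features back to the input feature size.
        self.mlp = nn.Sequential(
            nn.Linear(2 * feat_dim, feat_dim),
            nn.ReLU(),
            nn.Linear(feat_dim, feat_dim),
        )
        # Global (self-)attention over all tokens of the fused feature.
        self.global_attn = nn.MultiheadAttention(d, num_heads, batch_first=True)
        # Local attention: the same operation restricted to small token windows.
        self.local_attn = nn.MultiheadAttention(d, num_heads, batch_first=True)
        self.classifier = nn.Linear(feat_dim, num_classes)  # fully connected head

    def forward(self, f1, f2):
        # f1, f2: (B, feat_dim) final features from the two pre-trained backbones.
        fused = self.mlp(torch.cat([f1, f2], dim=-1))
        # "Average the features" -- one possible reading: blend the fused
        # feature with the mean of the two backbone features.
        x = (fused + (f1 + f2) / 2) / 2
        B, T, w = x.size(0), self.num_tokens, self.window
        tokens = x.view(B, T, -1)                        # (B, T, d)
        g, _ = self.global_attn(tokens, tokens, tokens)  # global branch
        win = tokens.reshape(B * T // w, w, -1)          # non-overlapping windows
        l, _ = self.local_attn(win, win, win)            # local branch
        l = l.reshape(B, T, -1)
        return self.classifier((g + l).flatten(1))       # fuse branches, classify

# Toy usage: random tensors stand in for Resnet18/Resnet34 penultimate features.
net = FusionAttentionNet()
f1, f2 = torch.randn(4, 512), torch.randn(4, 512)
print(net(f1, f2).shape)  # torch.Size([4, 7]) -- RAF-DB has 7 emotion classes
```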
Experiments
| Fusion Method | Model 1 | Model 2 | RAF-DB (%) |
| --- | --- | --- | --- |
| Late fusion | Resnet18 | Resnet34 | 86.35 |
| Late fusion | VGG11 | VGG13 | 86.08 |
| Late fusion | VGG11 | Resnet34 | 86.08 |
| Early fusion | Resnet18 | Resnet34 | 86.66 |
| Early fusion | VGG13 | Resnet34 | 85.49 |
| Early fusion | VGG11 | Resnet34 | 86.08 |
| Joint fusion | Resnet18 | Resnet34 | 86.05 |
| Joint fusion | VGG13 | Resnet34 | 86.63 |
| Joint fusion | VGG11 | Resnet34 | 86.40 |
| Fusion attention (ours) | Resnet18 | Resnet34 | 90.95 |
| Fusion attention (ours) | VGG13 | Resnet34 | 90.92 |
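For context, here is a minimal sketch of how the late and early fusion baselines in the table are commonly implemented (decision-level vs. feature-level fusion). The function names and exact formulations are assumptions, not the precise setup used in these experiments; joint fusion, as usually defined, additionally trains both backbones and the shared head end to end.

```python
import torch
import torch.nn as nn

def late_fusion(logits1, logits2):
    # Late (decision-level) fusion: average the class probabilities
    # predicted independently by the two models.
    return (torch.softmax(logits1, -1) + torch.softmax(logits2, -1)) / 2

def early_fusion(feat1, feat2, head):
    # Early (feature-level) fusion: concatenate backbone features and
    # classify them with a single shared head.
    return head(torch.cat([feat1, feat2], dim=-1))

# Toy check with random tensors standing in for model outputs.
l1, l2 = torch.randn(4, 7), torch.randn(4, 7)        # per-model logits
f1, f2 = torch.randn(4, 512), torch.randn(4, 512)    # backbone features
head = nn.Linear(2 * 512, 7)
print(late_fusion(l1, l2).shape, early_fusion(f1, f2, head).shape)
```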

Confusion Matrix

Our method using Resnet18 and Resnet34

Reference

Late fusion method using Resnet18 and Resnet34
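A small sketch of how such a normalized confusion matrix can be computed and plotted with scikit-learn. The random labels below merely stand in for real RAF-DB test predictions; the class list follows RAF-DB's seven basic emotions.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

CLASSES = ["Surprise", "Fear", "Disgust", "Happiness", "Sadness", "Anger", "Neutral"]

rng = np.random.default_rng(0)
y_true = rng.integers(0, 7, size=200)                    # stand-in test labels
noise = rng.integers(0, 7, size=200)
y_pred = np.where(rng.random(200) < 0.9, y_true, noise)  # ~90% "accuracy"

cm = confusion_matrix(y_true, y_pred, normalize="true")  # row-normalized counts
ConfusionMatrixDisplay(cm, display_labels=CLASSES).plot(xticks_rotation=45)
plt.tight_layout()
plt.show()
```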