Multilingual Pre-trained Model Fusion for Text-based Emotion Recognition
Md. Iqramul Hoque, Mahfuz Ahmed Anik, Abdur Rahman, Azmine Toushik Wasi
This study addresses multilingual emotion detection challenges in SemEval-2025 Task 11, focusing on three tracks:
- Track A: Multi-label emotion classification.
- Track B: Emotion intensity prediction.
- Track C: Cross-lingual generalization.
The authors leverage language-specific transformer models to tackle overlapping emotions, intensity quantification, and low-resource language adaptation.
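To make the track formulations concrete, below is a minimal sketch of what the prediction targets could look like, assuming the task's common emotion inventory (joy, sadness, anger, fear, surprise, disgust); the exact label sets vary by language and are not taken from the paper.

```python
# Illustrative prediction targets for the three tracks (hypothetical data;
# the assumed emotion inventory may differ per language in the shared task).
EMOTIONS = ["joy", "sadness", "anger", "fear", "surprise", "disgust"]

# Track A (multi-label classification): one binary flag per emotion.
track_a_label = [1, 0, 1, 0, 0, 0]   # text expresses joy and anger

# Track B (emotion intensity): an ordinal intensity score (e.g., 0-3) per emotion.
track_b_label = [2, 0, 3, 0, 0, 0]

# Track C (cross-lingual): same output format as Track A, but models are
# evaluated on languages without labeled training data.
```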
- Models Used:
  - Track A: DistilRoBERTa (English), ruBERT (Russian), DehateBERT (Portuguese).
  - Track B: ruBERT (Russian), EmoBERT-CN (Chinese), XLM-Twitter-EmoEs (Spanish).
  - Track C: Multilingual EmotionBERT for cross-lingual transfer.
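A minimal sketch of how per-language backbone selection could be wired up with Hugging Face `transformers`; the hub identifiers and language codes below are illustrative assumptions and may not match the authors' exact checkpoints.

```python
# Sketch of per-language backbone selection for Track A.
from transformers import AutoModel, AutoTokenizer

TRACK_A_BACKBONES = {
    "eng": "distilroberta-base",            # DistilRoBERTa (English)
    "rus": "DeepPavlov/rubert-base-cased",  # ruBERT (Russian)
    "ptbr": "<dehatebert-checkpoint>",      # DehateBERT (Portuguese); placeholder, substitute the actual hub ID
}

def load_backbone(lang: str):
    """Load the tokenizer and encoder assigned to a given language."""
    name = TRACK_A_BACKBONES[lang]
    tokenizer = AutoTokenizer.from_pretrained(name)
    encoder = AutoModel.from_pretrained(name)
    return tokenizer, encoder
```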
- Architecture:
  - Track A: Transformer backbone + MLP classifier with language-specific tuning.
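A sketch of the described backbone-plus-MLP design in PyTorch; the hidden dimension, dropout rate, and [CLS]-token pooling are assumptions rather than the paper's reported configuration.

```python
import torch
import torch.nn as nn
from transformers import AutoModel

class EmotionClassifier(nn.Module):
    """Transformer encoder with an MLP head producing one logit per emotion."""

    def __init__(self, backbone_name: str, num_emotions: int,
                 hidden_dim: int = 256, dropout: float = 0.1):
        super().__init__()
        self.backbone = AutoModel.from_pretrained(backbone_name)
        enc_dim = self.backbone.config.hidden_size
        self.head = nn.Sequential(
            nn.Linear(enc_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_dim, num_emotions),
        )

    def forward(self, input_ids: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
        outputs = self.backbone(input_ids=input_ids, attention_mask=attention_mask)
        pooled = outputs.last_hidden_state[:, 0]   # first-token ([CLS]) pooling, an assumption
        return self.head(pooled)                   # raw logits; apply sigmoid for multi-label output
```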
- Results:
  - Track A:
    - Russian achieved the highest F1 (0.848), benefiting from emotion-rich pretraining.
    - Portuguese struggled (F1=0.2773) due to data scarcity.
  - Track B:
    - Russian excelled (F1=0.8594), while Chinese (F1=0.483) faced intensity estimation challenges.
  - Track C:
    - Cross-lingual adaptation underperformed (F1=0.26–0.31), highlighting syntactic divergence in low-resource languages.
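For reference, a hedged sketch of how multi-label F1 scores like those above can be computed with scikit-learn; the official task scorer may differ in details such as the averaging choice.

```python
# Macro-averaged F1 over binary emotion labels (illustrative data).
import numpy as np
from sklearn.metrics import f1_score

y_true = np.array([[1, 0, 1, 0, 0, 0],
                   [0, 1, 0, 0, 0, 0]])
y_pred = np.array([[1, 0, 0, 0, 0, 0],
                   [0, 1, 0, 0, 0, 0]])

macro_f1 = f1_score(y_true, y_pred, average="macro", zero_division=0)
print(f"macro F1 = {macro_f1:.4f}")
```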
- Key Issues:
  - Overlapping emotions (e.g., anger vs. sadness).
  - Cultural/sarcastic nuances in emotion expression.
  - High computational demands and annotation inconsistencies.
- Error Analysis:
  - Misclassifications in ambiguous pairs (e.g., Portuguese "anger" confused with "joy").
  - Intensity prediction errors in morphologically complex languages (e.g., Chinese).
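A small sketch of one way to surface ambiguous-pair errors such as the anger/joy confusion above, by counting, per example, gold emotions that were missed alongside emotions that were spuriously predicted; the label names and data format are illustrative assumptions, not the authors' analysis code.

```python
from collections import Counter
import numpy as np

EMOTIONS = ["joy", "sadness", "anger", "fear", "surprise", "disgust"]

def confusion_pairs(y_true: np.ndarray, y_pred: np.ndarray) -> Counter:
    """Count (missed gold emotion, spurious predicted emotion) pairs per example."""
    pairs = Counter()
    for gold, pred in zip(y_true, y_pred):
        missed = [e for e, g, p in zip(EMOTIONS, gold, pred) if g and not p]
        spurious = [e for e, g, p in zip(EMOTIONS, gold, pred) if p and not g]
        for m in missed:
            for s in spurious:
                pairs[(m, s)] += 1   # e.g., ("anger", "joy"): gold anger, model predicted joy instead
    return pairs

# Example: one instance where gold anger is replaced by predicted joy.
print(confusion_pairs(np.array([[0, 0, 1, 0, 0, 0]]),
                      np.array([[1, 0, 0, 0, 0, 0]])))
```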
- Future Directions:
  - Improve cross-lingual representations.
  - Address data scarcity via synthetic data or multimodal integration.
  - Enhance cultural and contextual calibration.
- Ethical Considerations:
  - Risks include bias propagation, privacy concerns, and misuse in surveillance.
  - Emphasizes transparency and culturally aware AI governance.
This work provides a blueprint for multilingual emotion recognition systems, emphasizing transformer-based architectures, language-specific tuning, and ethical AI practices. Code and model configurations (e.g., hidden dimensions, dropout rates) can guide implementations in similar NLP tasks.
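As a starting point for such implementations, a hypothetical configuration sketch with the kinds of hyperparameters mentioned (hidden dimension, dropout rate); the specific values are assumptions and should be replaced with the settings released alongside the paper's code.

```python
# Illustrative training configuration for one language/track pair (values assumed).
CONFIG = {
    "track": "A",
    "language": "rus",
    "backbone": "DeepPavlov/rubert-base-cased",  # placeholder hub ID, not confirmed by the paper
    "hidden_dim": 256,      # MLP head width
    "dropout": 0.1,
    "learning_rate": 2e-5,
    "batch_size": 16,
    "epochs": 5,
    "threshold": 0.5,       # sigmoid cutoff for multi-label prediction
}
```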