Skip to content

ciol-researchlab/Multilingual-Pre-trained-Model-Fusion-for-Text-based-Emotion-Recognition

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

14 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Paper Summary: CIOL at SemEval-2025 Task 11

Multilingual Pre-trained Model Fusion for Text-based Emotion Recognition
Md. Iqramul Hoque, Mahfuz Ahmed Anik, Abdur Rahman, Azmine Toushik Wasi

Overview

This study addresses multilingual emotion detection challenges in SemEval-2025 Task 11, focusing on three tracks:

  • Track A: Multi-label emotion classification.
  • Track B: Emotion intensity prediction.
  • Track C: Cross-lingual generalization.

The authors leverage language-specific transformer models to tackle overlapping emotions, intensity quantification, and low-resource language adaptation.

Key Contributions

  • Models Used:
    • Track A: DistilRoBERTa (English), ruBERT (Russian), DehateBERT (Portuguese).
    • Track B: ruBERT (Russian), EmoBERT-CN (Chinese), XLM-Twitter-EmoEs (Spanish).
    • Track C: Multilingual EmotionBERT for cross-lingual transfer.
  • Architecture: Transformer backbone + MLP classifier with language-specific tuning.

Results

  • Track A:
    • Russian achieved the highest F1 (0.848), benefiting from emotion-rich pretraining.
    • Portuguese struggled (F1=0.2773) due to data scarcity.
  • Track B:
    • Russian excelled (F1=0.8594), while Chinese (F1=0.483) faced intensity estimation challenges.
  • Track C:
    • Cross-lingual adaptation underperformed (F1=0.26–0.31), highlighting syntactic divergence in low-resource languages.

Challenges & Insights

  • Key Issues:
    • Overlapping emotions (e.g., anger vs. sadness).
    • Cultural/sarcastic nuances in emotion expression.
    • High computational demands and annotation inconsistencies.
  • Error Analysis:
    • Misclassifications in ambiguous pairs (e.g., Portuguese "anger" confused with "joy").
    • Intensity prediction errors in morphologically complex languages (e.g., Chinese).

Future Directions

  • Improve cross-lingual representations.
  • Address data scarcity via synthetic data or multimodal integration.
  • Enhance cultural and contextual calibration.

Ethical Considerations

  • Risks include bias propagation, privacy concerns, and misuse in surveillance.
  • Emphasizes transparency and culturally aware AI governance.

GitHub Relevance

This work provides a blueprint for multilingual emotion recognition systems, emphasizing transformer-based architectures, language-specific tuning, and ethical AI practices. Code and model configurations (e.g., hidden dimensions, dropout rates) can guide implementations in similar NLP tasks.

View Full Paper | SemEval-2025 Task 11

About

Code for Multilingual-Pre-trained-Model-Fusion-for-Text-based-Emotion-Recognition, SemEval 2025 Workshop.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •