NLPCC'2024 Tutorial Slides
Speakers: Shujian Huang, Wenhao Zhu
Affiliation: School of Computer Science, Nanjing University
- Üstün et al., Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model, arXiv'2024
- Alves et al., Tower: An Open Multilingual Large Language Model for Translation-Related Tasks, arXiv'2024
- Mistral Team, Mistral Large, 2024
- Gemini Team, Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context, arXiv'2024
- OpenAI, Hello GPT-4o, 2024
- Lu et al., LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages, Findings of EMNLP'2024
- Nguyen et al., SeaLLMs - Large Language Models for Southeast Asia, ACL'2024
- Aryabumi et al., Aya 23: Open Weight Releases to Further Multilingual Progress, arXiv'2024
- LLaMA Team, The Llama 3 Herd of Models, arXiv'2024
- Sun et al., FuxiTranyu: A Multilingual Large Language Model Trained with Balanced Data, arXiv'2024
- Qwen Team, Qwen2.5: A Party of Foundation Models, 2024
- Ji et al., EMMA-500: Enhancing Massively Multilingual Adaptation of Large Language Models, arXiv'2024
- Xu et al., X-ALMA: Plug & Play Modules and Adaptive Rejection for Quality Translation at Scale, arXiv'2024
- Yue et al., Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages, arXiv'2024
- Bañón et al., ParaCrawl: Web-Scale Acquisition of Parallel Corpora, ACL'2020
- Conneau et al., Unsupervised Cross-lingual Representation Learning at Scale, ACL'2020
- Schwenk et al., WikiMatrix: Mining 135M Parallel Sentences in 1620 Language Pairs from Wikipedia, EACL'2021
- Burchell et al., An Open Dataset and Model for Language Identification, ACL'2023
- Yuan et al., Lego-MT: Learning Detachable Models for Massively Multilingual Machine Translation, Findings of ACL'2023
- Kudugunta et al., MADLAD-400: A Multilingual And Document-Level Large Audited Dataset, NeurIPS'2023
- Arefyev et al., HPLT’s First Release of Data and Models, EAMT'2024
- Li et al., PreAlign: Boosting Cross-Lingual Transfer by Early Establishment of Multilingual Alignment, EMNLP'2024
- He et al., Scaling Laws for Multilingual Language Models, arXiv'2024
- Muennighoff et al., Crosslingual Generalization through Multitask Finetuning, ACL'2023
- Li et al., Bactrian-X: A Multilingual Replicable Instruction-Following Model with Low-Rank Adaptation, arXiv'2023
- Lai et al., Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback, EMNLP'2023
- Chen et al., Breaking Language Barriers in Multilingual Mathematical Reasoning: Insights and Observations, arXiv'2023
- Lai et al., LLMs Beyond English: Scaling the Multilingual Capability of LLMs with Cross-Lingual Feedback, Findings of ACL'2024
- Singh et al., Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning, arXiv'2024
- Shi et al., Language Models Are Multilingual Chain-Of-Thought Reasoners, ICLR'2023
- Huang et al., Not All Languages Are Created Equal in LLMs: Improving Multilingual Capability by Cross-Lingual-Thought Prompting, Findings of EMNLP'2023
- Qin et al., Cross-lingual Prompting: Improving Zero-shot Chain-of-Thought Reasoning across Languages, EMNLP'2023
- Zhu et al., Question Translation Training for Better Multilingual Reasoning, Findings of ACL'2024
- She et al., MAPO: Advancing Multilingual Reasoning through Multilingual Alignment-as-Preference Optimization, ACL'2024
- Zhang et al., PLUG: Leveraging Pivot Language in Cross-Lingual Instruction Tuning, ACL'2024
- Yoon et al., LangBridge: Multilingual Reasoning Without Multilingual Supervision, ACL'2024
- Zhao et al., LLaMA Beyond English: An Empirical Study on Language Capability Transfer, arXiv'2024
- Geng et al., Why Not Transform Chat Large Language Models to Non-English?, arXiv'2024
- Zhu et al., The Power of Question Translation Training in Multilingual Reasoning: Broadened Scope and Deepened Insights, arXiv'2024
- Huang et al., MindMerger: Efficient Boosting LLM Reasoning in non-English Languages, NeurIPS'2024
- Zhou et al., MoE-LPR: Multilingual Extension of Large Language Models through Mixture-of-Experts with Language Priors Routing, arXiv'2024
- Qi et al., Cross-Lingual Consistency of Factual Knowledge in Multilingual Language Models, EMNLP'2023
- Zhu et al., Multilingual Machine Translation with Large Language Models: Empirical Results and Analysis, Findings of NAACL'2024
- Bhattacharya & Bojar, Unveiling Multilinguality in Transformer Models: Exploring Language Specificity in Feed-Forward Networks, arXiv'2023
- Kew et al., Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed? arXiv'2024
- Chen et al., Monolingual or Multilingual Instruction Tuning: Which Makes a Better Alpaca, Findings of EACL'2024
- Kojima et al., On the Multilingual Ability of Decoder-based Pre-trained Language Models: Finding and Controlling Language-Specific Neurons, NAACL'2024
- Gao et al., Multilingual Pretraining and Instruction Tuning Improve Cross-Lingual Knowledge Alignment, But Only Shallowly, NAACL'2024
- Wendler et al., Do Llamas Work in English? On the Latent Language of Multilingual Transformers, ACL'2024
- Tang et al., Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models, ACL'2024
- Shaham et al., Multilingual Instruction Tuning With Just a Pinch of Multilinguality, ACL'2024
- Zhao et al., How do Large Language Models Handle Multilingualism? arXiv'2024
- Wu et al., Reuse Your Rewards: Reward Model Transfer for Zero-Shot Cross-Lingual Alignment, arXiv'2024
- Wang et al., Sharing Matters: Analysing Neurons Across Languages and Tasks in LLMs, arXiv'2024
- Chen et al., Is It Good Data for Multilingual Instruction Tuning or Just Bad Multilingual Evaluation for Large Language Models? arXiv'2024
- Marchisio et al., Understanding and Mitigating Language Confusion in LLMs, arXiv'2024
- Zhu et al., Multilingual Contrastive Decoding via Language-agnostic Layers Skipping, Findings of EMNLP'2024
- Gureja et al., M-RewardBench: Evaluating Reward Models in Multilingual Settings, arXiv'2024